Skip to content
Home ยป String Matching

String Matching


๐Ÿ“˜ String Matching (Pattern Matching)

1๏ธโƒฃ Introduction

String Matching is the problem of finding the occurrence(s) of a pattern string within a text string.

๐Ÿ”น Definition

Given:

  • A text string T of length n
  • A pattern string P of length m

Objective:

Find all positions where P occurs in T


2๏ธโƒฃ Applications of String Matching

  • Text editors (find/replace)
  • Search engines
  • DNA sequence analysis
  • Plagiarism detection
  • Compiler design

3๏ธโƒฃ Types of String Matching

  • Exact Matching โ†’ Pattern must match exactly
  • Approximate Matching โ†’ Allows mismatches

4๏ธโƒฃ Naive String Matching Algorithm

๐Ÿ”น Idea

Check the pattern at every possible position in the text.


๐Ÿ”น Algorithm (Naive Approach)

NaiveStringMatch(T, P):
   n = length(T)
   m = length(P)

   for i = 0 to n-m:
       j = 0
       while j < m and T[i+j] == P[j]:
           j++

       if j == m:
           print "Pattern found at index", i

๐Ÿ”น Example

Text:

T = "AABAACAADAABAABA"

Pattern:

P = "AABA"

Matches at indices:

0, 9, 12

๐Ÿ”น Time Complexity

CaseComplexity
Best CaseO(n)
Worst CaseO(nm)

5๏ธโƒฃ Efficient String Matching Algorithms


๐Ÿ”น 1. Knuth-Morris-Pratt (KMP) Algorithm

๐Ÿ”ธ Idea

  • Avoid unnecessary comparisons
  • Use Longest Prefix Suffix (LPS) array

๐Ÿ”ธ Steps

  1. Preprocess pattern โ†’ build LPS array
  2. Use LPS to skip comparisons

๐Ÿ”ธ Time Complexity

[
O(n + m)
]


๐Ÿ”น 2. Rabin-Karp Algorithm

๐Ÿ”ธ Idea

  • Use hashing to compare strings
  • Compare hash values instead of characters

๐Ÿ”ธ Steps

  1. Compute hash of pattern
  2. Compute hash of text substrings
  3. Compare hashes

๐Ÿ”ธ Time Complexity

  • Average: O(n + m)
  • Worst: O(nm) (due to collisions)

๐Ÿ”น 3. Boyer-Moore Algorithm

๐Ÿ”ธ Idea

  • Compare from right to left
  • Skip large portions using heuristics

๐Ÿ”ธ Techniques

  • Bad character rule
  • Good suffix rule

๐Ÿ”ธ Time Complexity

  • Best: O(n/m)
  • Worst: O(nm)

6๏ธโƒฃ Comparison of String Matching Algorithms

AlgorithmTime ComplexityTechnique
NaiveO(nm)Brute Force
KMPO(n + m)Prefix function
Rabin-KarpO(n + m) avgHashing
Boyer-MooreO(n/m) bestHeuristics

7๏ธโƒฃ Key Concepts

๐Ÿ”น Prefix

Beginning part of string
Example: “AB” is prefix of “ABCD”

๐Ÿ”น Suffix

Ending part of string
Example: “CD” is suffix of “ABCD”


8๏ธโƒฃ Advantages of Efficient Algorithms

โœ” Reduce redundant comparisons
โœ” Faster searching
โœ” Suitable for large texts


9๏ธโƒฃ Limitations

โœ– Complex implementation (KMP, BM)
โœ– Hash collisions (Rabin-Karp)


๐Ÿ”š Conclusion

String Matching is a fundamental problem in computer science, with multiple algorithms designed to improve efficiency over the naive approach.

Efficient algorithms like KMP and Rabin-Karp significantly reduce time complexity compared to brute force methods.


๐Ÿ“Œ Exam Tip

๐Ÿ‘‰ Always include:

  • Problem definition
  • Naive algorithm
  • One efficient algorithm (KMP preferred)
  • Time complexity comparison