Below is a clear, complete, exam-oriented explanation of the Boyer–Moore Algorithm, written like a DSA expert. This includes intuition, heuristics, complexity, and workflow.

📘 Boyer–Moore String Matching Algorithm

The Boyer–Moore algorithm is one of the fastest string-matching algorithms in practice. It works by comparing characters from right to left and uses two powerful heuristics to skip large sections of the text.

In many real-world cases (natural language text), Boyer–Moore runs in sublinear time, faster than KMP and Rabin–Karp.

📌 Key Ideas of Boyer–Moore

Pattern is matched from right to left.
When mismatch occurs, the pattern is shifted intelligently, not by just 1.
Two heuristics determine how far the pattern should be shifted:
- Bad Character Rule
- Good Suffix Rule

These allow big jumps → fewer comparisons → fast matching.

📘 Heuristic 1: Bad Character Rule

When mismatch occurs at pattern character P[j] and text character T[i]:

Shift pattern so that:
- The mismatched text character aligns with its last occurrence in the pattern
- OR skip past it if it doesn’t exist in the pattern

Example

Pattern: “ABCD”
Mismatch at ‘X’ (not in pattern)

Shift pattern entirely to the right because ‘X’ doesn’t exist.

📘 Heuristic 2: Good Suffix Rule

When mismatch happens after some suffix matched:

Shift pattern to align the next occurrence of that suffix in the pattern
OR
Shift so that the suffix aligns with a prefix of the pattern

This allows even larger skips.

📘 Why Boyer–Moore Is Fast

It skips many unnecessary character comparisons
In English text, average complexity becomes O(n / m)
Often faster than KMP and Rabin-Karp in practice
Most efficient for large alphabets (ASCII, Unicode, text documents)

📘 Time Complexity

Case	Time
Best Case	O(n/m) (almost constant time per shift)
Average Case	O(n)
Worst Case	O(n × m)

Worst case happens rarely (repetitive patterns like “AAAAAA”).

📘 Space Complexity

O(m) for storing bad character table
O(m) for good suffix table

📘 Boyer–Moore Matching Steps (High-Level)

Preprocess pattern to build:
- Bad character table
- Good suffix table
Align pattern to text
Compare pattern characters from right to left
On mismatch:
- Compute shift from bad character rule
- Compute shift from good suffix rule
- Choose the maximum shift
Repeat until pattern passes the text end
If match found, report position

📘 Simplified Python Implementation (Using Bad Character Rule Only)

(For exam demonstration. Full BM is longer.)

def bad_character_table(pattern):
    table = {}
    m = len(pattern)
    for i in range(m):
        table[pattern[i]] = i
    return table


def boyer_moore(text, pattern):
    n, m = len(text), len(pattern)
    if m == 0:
        return []

    bad_char = bad_character_table(pattern)
    positions = []

    shift = 0
    while shift <= n - m:
        j = m - 1

        # Compare pattern from right to left
        while j >= 0 and pattern[j] == text[shift + j]:
            j -= 1

        if j < 0:  # full match
            positions.append(shift)
            shift += m - bad_char.get(text[shift + m], -1) if shift + m < n else 1
        else:
            bad_char_shift = j - bad_char.get(text[shift + j], -1)
            shift += max(1, bad_char_shift)

    return positions


# Example usage
text = "ABAAABCD"
pattern = "ABC"
print("Pattern found at positions:", boyer_moore(text, pattern))

📘 Output

Pattern found at positions: [4]

📘 Comparison with Other Algorithms

Algorithm	Best	Average	Worst	Notes
Brute Force	O(n)	O(nm)	O(nm)	Very slow
KMP	O(n + m)	O(n + m)	O(n + m)	No rechecking
Rabin–Karp	O(n + m)	O(n + m)	O(nm)	Uses hashing
Boyer–Moore	O(n/m)	O(n)	O(nm)	Fastest in practice

📘 Summary (Exam Notes)

Boyer–Moore compares pattern from right to left.
Uses Bad Character and Good Suffix heuristics.
Gives sublinear running time in many cases.
Best-case performance: O(n / m) — extremely fast.
Worst-case: O(n × m) but rare.
Popular in text editors, document search, grep command, etc.