⭐ String — Detailed Explanation (Data Structure & Theory)

A String is a sequence of characters stored in contiguous memory.
Characters may include:

Letters (A–Z, a–z)
Digits (0–9)
Symbols (!,@,#,$, etc.)
Spaces, punctuation
Unicode characters (emoji, regional languages)

Strings are one of the most fundamental data structures in programming and computer science.

⭐ 1. Definition

A string is a one-dimensional array of characters terminated by a special character ('\0' in C) or managed by length (in languages like C++, Java, Python).

Example (C-style):

char s[] = "HELLO";

This is stored as:

'H' 'E' 'L' 'L' 'O' '\0'

⭐ 2. Characteristics of Strings

Strings are immutable in languages like Java, Python (once created, cannot be changed).
Strings are mutable in languages like C, C++ (character array can be modified).
Support indexing → s[i]
Support lexicographical comparison
Support substring extraction
Stored in continuous memory

⭐ 3. Operations on Strings

Common operations include:

Operation	Description	Time Complexity
Length	Count characters	O(n) or O(1) depending on language
Concatenation	Join two strings	O(n + m)
Copy	Copy contents	O(n)
Substring	Extract part of string	O(k)
Compare	Compare lexicographically	O(n)
Reverse	Reverse string	O(n)
Search	Find substring/pattern	O(n × m) naïve

⭐ 4. String Representation in Memory

A. C-style Strings (Null-terminated)

char s[] = {'C','O','D','E','\0'};

Terminated by null character \0
String length determined by scanning until \0

B. Java/Python Strings (Length-based)

Store explicit length, making length operation O(1).

⭐ 5. String Matching (Pattern Searching)

One of the most important applications of strings.

Given:

Text T of length n
Pattern P of length m

Find occurrences of P in T.

5.1 Naïve String Matching

Try matching P at every position of T.
Time: O(n × m)

5.2 Knuth–Morris–Pratt (KMP)

Efficient pattern matching using LPS (Longest Prefix Suffix) array.
Time: O(n + m)

5.3 Rabin–Karp Algorithm

Uses hashing to match patterns.
Average Time: O(n + m)
Worst Case: O(n × m)

5.4 Boyer–Moore Algorithm

Skips multiple characters by matching from right to left.
Best for large alphabets.
Time: O(n/m)

⭐ 6. Advanced String Data Structures

Used heavily in text processing, compilers, bioinformatics, search engines:

A. Trie (Prefix Tree)

Tree-like structure
Fast prefix search
Insert/Search → O(length of word)

Used in autocomplete systems.

B. Suffix Tree

Stores all suffixes of a string
substring search → O(m)
heavy memory usage

C. Suffix Array

Space-efficient alternative to suffix tree
Used in pattern matching, LCP arrays, DNA sequencing

D. Suffix Automaton

Minimal deterministic automaton for all substrings
Used in competitive programming and compilers

E. Aho–Corasick Automaton

Multiple-pattern matching
Used in virus scanners and spam filters

⭐ 7. String Applications

Strings are used in:

Text processing
Data compression (Huffman, LZW)
Compiler design (lexical analysis)
Natural language processing (NLP)
Pattern matching
Searching in documents
DNA and protein sequence analysis
Networking protocols
File path manipulation
Web URLs and parsing

⭐ 8. Example String Problems

Reverse a string
Check palindrome
Count vowels/consonants
Compare two strings
Pattern searching
Longest substring without repeating characters
Longest common prefix
Longest palindromic substring
Anagram detection

These questions often appear in exams and interviews.

⭐ Summary

A String is a sequence of characters stored in contiguous memory.
Major topics include:

Representation (null-terminated / length-based)
Common operations (concatenate, substring, compare)
Pattern matching algorithms (Naïve, KMP, Rabin-Karp, Boyer-Moore)
Advanced structures: Trie, Suffix Tree, Suffix Array
Heavy use in compilers, NLP, networking, search engines