Advanced Strings (KMP, Trie, Pattern Matching)
KMP Algorithm (String Matching, O(m+n))
Core Idea: Pre-process the pattern string to generate a failure function (next array). When a mismatch occurs during searching, use this array to skip redundant comparisons by shifting the pattern to its next possible prefix match.
Building the next Array (Prefix Function)
int[] buildNext(String pattern) {
int m = pattern.length();
// next[i] stores the length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i]
int[] next = new int[m];
next[0] = 0;
int j = 0; // pointer for prefix
for (int i = 1; i < m; i++) {
while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
j = next[j - 1]; // backtracking
}
if (pattern.charAt(i) == pattern.charAt(j)) j++;
next[i] = j;
}
return next;
}
KMP Search
List<Integer> kmpSearch(String text, String pattern) {
if (pattern.isEmpty()) return new ArrayList<>();
int[] next = buildNext(pattern);
List<Integer> positions = new ArrayList<>();
int j = 0; // pointer for pattern
for (int i = 0; i < text.length(); i++) {
while (j > 0 && text.charAt(i) != pattern.charAt(j)) {
j = next[j - 1];
}
if (text.charAt(i) == pattern.charAt(j)) j++;
if (j == pattern.length()) {
positions.add(i - j + 1);
j = next[j - 1]; // continue searching for next occurrence
}
}
return positions;
}
Trie (Prefix Tree)
Implementation (LeetCode 208)
class Trie {
private TrieNode root = new TrieNode();
class TrieNode {
TrieNode[] children = new TrieNode[26];
boolean isEnd;
}
void insert(String word) {
TrieNode cur = root;
for (char c : word.toCharArray()) {
int idx = c - 'a';
if (cur.children[idx] == null) cur.children[idx] = new TrieNode();
cur = cur.children[idx];
}
cur.isEnd = true;
}
boolean search(String word) {
TrieNode node = findNode(word);
return node != null && node.isEnd;
}
boolean startsWith(String prefix) {
return findNode(prefix) != null;
}
private TrieNode findNode(String word) {
TrieNode cur = root;
for (char c : word.toCharArray()) {
int idx = c - 'a';
if (cur.children[idx] == null) return null;
cur = cur.children[idx];
}
return cur;
}
}
Word Search II (LeetCode 212, Trie + Backtracking)
Insert all dictionary words into a Trie, then perform DFS on the grid starting from each cell. Follow the Trie pointers during DFS; if a node's isEnd is true, a word is found.
Pruning Tip: Set isEnd = false once a word is found to avoid duplicate results.
Wildcard Matching (LeetCode 44)
? matches any single character; * matches any sequence of characters (including empty).
boolean isMatch(String s, String p) {
int m = s.length(), n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
// Handle leading asterisks for empty s
for (int j = 1; j <= n; j++) {
dp[0][j] = dp[0][j-1] && p.charAt(j-1) == '*';
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
char si = s.charAt(i-1), pj = p.charAt(j-1);
if (pj == '*') {
dp[i][j] = dp[i-1][j] // * matches one or more characters (consume s[i])
|| dp[i][j-1]; // * matches zero characters (consume p[j])
} else {
dp[i][j] = dp[i-1][j-1] && (si == pj || pj == '?');
}
}
}
return dp[m][n];
}
Regular Expression Matching (LeetCode 10)
. matches any single character; * matches zero or more occurrences of the preceding element.
boolean isMatch(String s, String p) {
int m = s.length(), n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
// Initial states for cases like a* or a*b*
for (int j = 2; j <= n; j++) {
dp[0][j] = dp[0][j-2] && p.charAt(j-1) == '*';
}
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
char si = s.charAt(i-1), pj = p.charAt(j-1);
if (pj == '*') {
dp[i][j] = dp[i][j-2] // Zero occurrence of the preceding element
|| (dp[i-1][j] && (p.charAt(j-2) == '.' || p.charAt(j-2) == si)); // One or more occurrences
} else {
dp[i][j] = dp[i-1][j-1] && (si == pj || pj == '.');
}
}
}
return dp[m][n];
}
String Hashing (Rabin-Karp)
Enables O(1) comparison of substrings after O(n) pre-calculation of prefix hashes.
long MOD = 1_000_000_007L, BASE = 31;
long[] hashes = new long[n + 1];
long[] powers = new long[n + 1];
powers[0] = 1;
for (int i = 0; i < n; i++) {
hashes[i + 1] = (hashes[i] * BASE + (s.charAt(i) - 'a' + 1)) % MOD;
powers[i + 1] = (powers[i] * BASE) % MOD;
}
// Get hash of substring s[l..r]
long getHash(int l, int r) {
long h = (hashes[r + 1] - hashes[l] * powers[r - l + 1]) % MOD;
return (h + MOD) % MOD; // Handle negative result
}