字符串高级(KMP · 字典树 · 通配符匹配)
hardKMP字典树Trie字符串匹配
KMP 算法(字符串匹配,O(m+n))
核心思想:预处理模式串,生成失败函数(next 数组),匹配失败时跳过已匹配的前缀。
构建 next 数组
int[] buildNext(String pattern) {
int m = pattern.length();
int[] next = new int[m]; // next[i] = pattern[0..i] 的最长公共前缀后缀长度
next[0] = 0;
int j = 0; // 前缀指针
for (int i = 1; i < m; i++) {
while (j > 0 && pattern.charAt(i) != pattern.charAt(j)) {
j = next[j - 1]; // 回退
}
if (pattern.charAt(i) == pattern.charAt(j)) j++;
next[i] = j;
}
return next;
}
KMP 搜索
List<Integer> kmpSearch(String text, String pattern) {
int[] next = buildNext(pattern);
List<Integer> positions = new ArrayList<>();
int j = 0;
for (int i = 0; i < text.length(); i++) {
while (j > 0 && text.charAt(i) != pattern.charAt(j)) {
j = next[j - 1];
}
if (text.charAt(i) == pattern.charAt(j)) j++;
if (j == pattern.length()) {
positions.add(i - j + 1);
j = next[j - 1];
}
}
return positions;
}
字典树(Trie)
实现(LeetCode 208)
class Trie {
private TrieNode root = new TrieNode();
class TrieNode {
TrieNode[] children = new TrieNode[26];
boolean isEnd;
}
void insert(String word) {
TrieNode cur = root;
for (char c : word.toCharArray()) {
int idx = c - 'a';
if (cur.children[idx] == null) cur.children[idx] = new TrieNode();
cur = cur.children[idx];
}
cur.isEnd = true;
}
boolean search(String word) {
TrieNode node = findNode(word);
return node != null && node.isEnd;
}
boolean startsWith(String prefix) {
return findNode(prefix) != null;
}
private TrieNode findNode(String word) {
TrieNode cur = root;
for (char c : word.toCharArray()) {
int idx = c - 'a';
if (cur.children[idx] == null) return null;
cur = cur.children[idx];
}
return cur;
}
}
单词搜索 II(LeetCode 212,字典树 + 回溯)
构建 Trie,用 DFS 在网格中同步搜索:
// 将 words 全部插入 Trie
// DFS 时,跟随 Trie 节点指针,遇到 isEnd=true 即找到单词
// 找到后将 isEnd 置 false 避免重复(剪枝)
通配符匹配(LeetCode 44)
? 匹配单字符,* 匹配任意序列(包括空):
boolean isMatch(String s, String p) {
int m = s.length(), n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
// 处理 p 以 * 开头且 s 为空的情况
for (int j = 1; j <= n; j++) dp[0][j] = dp[0][j-1] && p.charAt(j-1) == '*';
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
char si = s.charAt(i-1), pj = p.charAt(j-1);
if (pj == '*') {
dp[i][j] = dp[i-1][j] // * 匹配一个字符(s 减一个)
|| dp[i][j-1]; // * 匹配空(p 减一个 *)
} else {
dp[i][j] = dp[i-1][j-1] && (si == pj || pj == '?');
}
}
}
return dp[m][n];
}
正则表达式匹配(LeetCode 10)
. 匹配单字符,* 匹配前一个字符零次或多次:
boolean isMatch(String s, String p) {
int m = s.length(), n = p.length();
boolean[][] dp = new boolean[m + 1][n + 1];
dp[0][0] = true;
for (int j = 2; j <= n; j++) dp[0][j] = dp[0][j-2] && p.charAt(j-1) == '*';
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
char si = s.charAt(i-1), pj = p.charAt(j-1);
if (pj == '*') {
dp[i][j] = dp[i][j-2] // * 消除前一个字符(零次)
|| (dp[i-1][j] && (p.charAt(j-2) == '.' || p.charAt(j-2) == si)); // 多次
} else {
dp[i][j] = dp[i-1][j-1] && (si == pj || pj == '.');
}
}
}
return dp[m][n];
}
字符串哈希(Rabin-Karp)
快速匹配/比较子串,O(n) 预处理,O(1) 查询区间哈希:
long MOD = 1_000_000_007L, BASE = 31;
long[] hash = new long[n + 1], pow = new long[n + 1];
pow[0] = 1;
for (int i = 0; i < n; i++) {
hash[i + 1] = (hash[i] * BASE + s.charAt(i) - 'a' + 1) % MOD;
pow[i + 1] = pow[i] * BASE % MOD;
}
// 区间 [l, r] 的哈希值
long getHash(int l, int r) {
return (hash[r + 1] - hash[l] * pow[r - l + 1] % MOD + MOD) % MOD;
}