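Introduction
The Sunday algorithm is a refinement of the Boyer-Moore algorithm and is often slightly more efficient in practice:
when a match attempt fails, it looks at the character in the text immediately after the last character of the current matching window;
if that character does not occur in the pattern, the window skips past it entirely, i.e. shift = pattern length + 1;
if that character does occur in the pattern, shift = distance from its rightmost occurrence in the pattern to the end of the pattern + 1;
with n the length of the text and m the length of the pattern, the average-case time complexity is O(n) and the worst case is O(n*m).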
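The shift rule above can be sketched as a small lookup table. This is only a minimal illustration: the helper name `BuildShiftTable` and the use of `std::array` and `std::string` are assumptions made for the sketch and are not part of the examples in this post.

```cpp
#include <array>
#include <string>

// Hypothetical helper (not from the original examples): builds the Sunday
// shift table for a single-byte pattern.
std::array<int, 256> BuildShiftTable(const std::string& pat) {
    const int pn = static_cast<int>(pat.size());
    std::array<int, 256> shift;
    // Characters absent from the pattern: skip past the whole window.
    shift.fill(pn + 1);
    // Characters present in the pattern: later occurrences overwrite earlier
    // ones, so the rightmost occurrence decides the shift.
    for (int i = 0; i < pn; ++i) {
        shift[static_cast<unsigned char>(pat[i])] = pn - i;
    }
    return shift;
}
```

For the pattern "abcab" (length 5), this table holds 2 for 'a' (rightmost occurrence at index 3), 1 for 'b', 3 for 'c', and 6 for every byte that does not appear in the pattern.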
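Example code
The following example matches single-byte characters. To match wide characters, the function declaration changes and so does the character set size (the size of the shift table).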
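```cpp
char* SundaySearchA(char* txt, int tn, char* pat, short pn)
{
    if (!txt || !pat || (pn <= 0) || (pn > tn))
        return 0;

    // shift[c]: how far to move the window when c is the character just past it.
    const int shift_size = 0x100;
    int* shift = new int[shift_size];
    for (int i = 0; i < shift_size; i++)
    {
        shift[i] = pn + 1;   // c is not in the pattern: skip the whole window
    }
    for (short i = 0; i < pn; i++)
    {
        // Cast to unsigned char so bytes >= 0x80 do not produce a negative index.
        shift[(unsigned char)pat[i]] = pn - i;
    }

    // "<=" so that a match ending exactly at the end of the text is found.
    for (int i = 0; i <= (tn - pn); )
    {
        short j;
        for (j = 0; j < pn; j++)
        {
            if (pat[j] != txt[i + j])
                break;
        }
        if (j == pn)
        {
            delete[] shift;
            return (txt + i);
        }
        if (i + pn >= tn)
            break;           // no character after the window to look up
        i += shift[(unsigned char)txt[i + pn]];
    }

    delete[] shift;
    return 0;
}
```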
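For the wide-character case mentioned above, a variant might look like the sketch below. It is not the author's code: the name `SundaySearch16`, the choice of `char16_t` (instead of a platform-dependent `wchar_t`), and the `std::vector` table are assumptions. The only structural change is the table size, 0x10000 entries instead of 0x100.

```cpp
#include <vector>

// Hypothetical 16-bit variant of the single-byte search (names are illustrative).
const char16_t* SundaySearch16(const char16_t* txt, int tn,
                               const char16_t* pat, int pn) {
    if (!txt || !pat || pn <= 0 || pn > tn) return nullptr;

    // One entry per possible 16-bit code unit; default is "not in pattern".
    std::vector<int> shift(0x10000, pn + 1);
    for (int i = 0; i < pn; ++i) {
        shift[pat[i] & 0xFFFF] = pn - i;   // rightmost occurrence wins
    }

    for (int i = 0; i <= tn - pn; ) {
        int j = 0;
        while (j < pn && pat[j] == txt[i + j]) ++j;
        if (j == pn) return txt + i;       // full match at offset i
        if (i + pn >= tn) break;           // no code unit after the window
        i += shift[txt[i + pn] & 0xFFFF];
    }
    return nullptr;
}
```

The larger table costs more to initialize on every call, so for short texts a caller may prefer to build it once and reuse it across searches with the same pattern.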
Byte pattern search example (wildcards not yet supported)
```cpp
unsigned char* SundaySearchByte(unsigned char* dat, int dn, unsigned char* pat, short pn)
{
    if (!dat || !pat || (pn <= 0) || (pn > dn))
        return 0;

    const int shift_size = 0x100;
    int* shift = new int[shift_size];

    // __try/__except is MSVC structured exception handling. Any structured
    // exception raised inside (for example an access violation while reading
    // dat) is treated as "not found".
    __try
    {
        for (int i = 0; i < shift_size; i++)
        {
            shift[i] = pn + 1;
        }
        for (short i = 0; i < pn; i++)
        {
            shift[pat[i]] = pn - i;
        }

        // "<=" so that a match ending exactly at the end of the data is found.
        for (int i = 0; i <= (dn - pn); )
        {
            short j;
            for (j = 0; j < pn; j++)
            {
                if (pat[j] != dat[i + j])
                    break;
            }
            if (j == pn)
            {
                delete[] shift;
                return (dat + i);
            }
            if (i + pn >= dn)
                break;       // no byte after the window to look up
            i += shift[dat[i + pn]];
        }
    }
    __except (1)             // 1 == EXCEPTION_EXECUTE_HANDLER
    {
    }

    delete[] shift;
    return 0;
}
```
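One possible way to add the wildcard support that the example above lacks is sketched here. This is an assumption about how it could be done, not the author's design: the name `SundaySearchMasked` and the `mask` parameter (where `false` marks a wildcard position) are hypothetical, and the portable version below omits the `__try`/`__except` guard. Because a wildcard can match any byte, no shift may be larger than the distance from the rightmost wildcard to the end of the pattern plus 1.

```cpp
#include <vector>

// Hypothetical wildcard-aware variant: mask[i] == false means pat[i] matches
// any byte (the value stored in pat[i] is then ignored).
const unsigned char* SundaySearchMasked(const unsigned char* dat, int dn,
                                        const unsigned char* pat,
                                        const bool* mask, int pn) {
    if (!dat || !pat || !mask || pn <= 0 || pn > dn) return nullptr;

    // The rightmost wildcard caps every shift, since any byte may align there.
    int max_shift = pn + 1;
    for (int i = 0; i < pn; ++i) {
        if (!mask[i]) max_shift = pn - i;
    }
    std::vector<int> shift(0x100, max_shift);
    for (int i = 0; i < pn; ++i) {
        // Fixed bytes may only lower the shift below the wildcard cap.
        if (mask[i] && pn - i < shift[pat[i]]) shift[pat[i]] = pn - i;
    }

    for (int i = 0; i <= dn - pn; ) {
        int j = 0;
        while (j < pn && (!mask[j] || pat[j] == dat[i + j])) ++j;
        if (j == pn) return dat + i;
        if (i + pn >= dn) break;           // no byte after the window
        i += shift[dat[i + pn]];
    }
    return nullptr;
}
```

A wildcard near the end of the pattern forces small shifts for every byte, so patterns with trailing wildcards lose most of the skipping benefit.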