Rabin-Karp algorithm: string matching

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/feng9exe/p/10003411.html

For string matching there is a well-known algorithm, KMP. Another algorithm that is used a lot in practice is Rabin-Karp, which matches strings by hashing. It works as follows.

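The Rabin-Karp algorithm, proposed by Rabin and Karp, is a string-matching algorithm that performs well in practice. Its preprocessing time is O(m), its worst-case running time is O((n-m+1)m), and its average running time is close to O(m+n). The main idea is to hash the strings so that the large number of substrings that differ from the pattern can be ruled out cheaply. Let the pattern P have length m and the text T have length n, and treat each character as a decimal digit. Using
Horner's rule, p = P[m] + 10(P[m-1] + 10(P[m-2] + ... + 10(P[2] + 10P[1])...)), we obtain the hash value p of the pattern. For the text, let t(s) be the hash value of the length-m substring T[s+1..s+m]. t(0) is computed the same way as p, and each later value follows from the previous one in constant time: t(s+1) = 10(t(s) - 10^(m-1)T[s+1]) + T[s+m+1]. Each t(s) is then compared with p. If they differ, the substring is certainly different from the pattern. If they are equal, a character-by-character comparison is still needed, because in practice the hash values are reduced modulo some number M and different strings can share a hash value. This is why the worst-case running time is O(mn).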
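
The hashing scheme described above can be sketched in Python roughly as follows. This is a minimal sketch rather than a reference implementation: the function name rabin_karp, the defaults base=256 and modulus=1_000_000_007, the 0-based indices, and the use of ord() as the character value are all assumptions made for illustration (the text above uses base 10 and 1-based indices).

```python
def rabin_karp(text: str, pattern: str,
               base: int = 256, modulus: int = 1_000_000_007) -> list[int]:
    """Return every 0-based index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        # Design choice for this sketch: an empty pattern yields no matches.
        return []

    # base^(m-1) mod modulus: the weight of the leading character of a window.
    high = pow(base, m - 1, modulus)

    # Horner's rule for the pattern hash p and the first window hash t.
    p = t = 0
    for i in range(m):
        p = (p * base + ord(pattern[i])) % modulus
        t = (t * base + ord(text[i])) % modulus

    matches = []
    for s in range(n - m + 1):
        # Equal hashes only suggest a match; confirm it character by character.
        if p == t and text[s:s + m] == pattern:
            matches.append(s)
        if s < n - m:
            # Rolling update: drop text[s], shift by one digit, append text[s + m].
            t = ((t - ord(text[s]) * high) * base + ord(text[s + m])) % modulus
    return matches
```

For example, rabin_karp("abracadabra", "abra") returns [0, 7]. Because every hash hit is verified, a poor choice of modulus only costs time and never produces a wrong answer.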

Rabin-Karp is a good example of a randomized algorithm (if we pick M in some random way). We get no guarantee that the algorithm runs in O(n+m) time, because we may get unlucky and have the hash values regularly collide with spurious matches. Still, the odds are heavily in our favor: if the hash function returns values uniformly from 0 to M-1, the probability of a false collision should be 1/M. This is quite reasonable: if M ≈ n, there should only be one false collision per string, and if M = n^k for k >= 2, the odds are great that we will never see any false collisions.
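
To get a feel for how the choice of M affects false collisions, the sketch below counts the positions where the hash values agree but the substring differs from the pattern. The function name count_spurious_hits, its parameters, and the sample digit string are illustrative assumptions, not part of the original text. The 1/M estimate itself assumes uniformly distributed hash values, which real inputs only approximate.

```python
def count_spurious_hits(text: str, pattern: str,
                        modulus: int, base: int = 256) -> int:
    """Count windows whose hash equals the pattern's hash but whose text differs."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0

    # Weight of the leading character of a window, reduced modulo `modulus`.
    high = pow(base, m - 1, modulus)

    # Hash the pattern and the first window with Horner's rule.
    p = t = 0
    for i in range(m):
        p = (p * base + ord(pattern[i])) % modulus
        t = (t * base + ord(text[i])) % modulus

    spurious = 0
    for s in range(n - m + 1):
        # A hash hit that fails verification is a false collision.
        if p == t and text[s:s + m] != pattern:
            spurious += 1
        if s < n - m:
            # Roll the window one position to the right.
            t = ((t - ord(text[s]) * high) * base + ord(text[s + m])) % modulus
    return spurious


digits = "2359023141526739921"
print(count_spurious_hits(digits, "31415", modulus=13, base=10))             # 1
print(count_spurious_hits(digits, "31415", modulus=1_000_000_007, base=10))  # 0
```

With the tiny modulus 13, the window "67399" hashes to the same value as "31415" and has to be rejected by direct comparison. With the large modulus, no false collision occurs on this input.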

Reposted from: https://www.cnblogs.com/feng9exe/p/10003411.html