The Rabinkarp Algorithm
The solution of working modulo q is not perfect, however, since t_{s} ≡ p (mod q) does not imply that t_{s} = p. On the other hand, if t_{s} ≢ p (mod q), then we definitely have that t_{s} ≠ p, so that shift s is invalid. We can thus use the test t_{s} ≡ p (mod q) as a fast heuristic test to rule out invalid shifts s. Any shift s for which t_{s} ≡ p (mod q) must be tested further to see if s is really valid or we just have a spurious hit. This testing can be done by explicitly checking the condition P[1 ‥ m] = T [s 1 ‥ s m]. If q is large enough, then we can hope that spurious hits occur infrequently enough that the cost of the extra checking is low.
The following procedure makes these ideas precise. The inputs to the procedure are the text T , the pattern P, the radix d to use (which is typically taken to be Σ), and the prime q to use.
RABINKARPMATCHER(T, P, d, q) 1 n ← length[T] 2 m ← length[P] 3 h ← dm1 mod q 4 p ← 0 5 t0 ← 0 6 for i ← 1 to m ▹ Preprocessing. 7 do p ← (dp P[i]) mod q 8 t0 ← (dt0 T[i]) mod q 9 for s ← 0 to n  m ▹ Matching. 10 do if p = t_{s} 11 then if P[1 ‥ m] = T [s 1 ‥ s m] 12 then print "Pattern occurs with shift" s 13 if s < n  m 14 then ts 1 ← (d(t_{s}  T[s 1]h) T[s m 1]) mod q
The procedure RABINKARPMATCHER works as follows. All characters are interpreted as radixd digits. The subscripts on t are provided only for clarity; the program works correctly if all the subscripts are dropped. Line 3 initializes h to the value of the highorder digit position of an mdigit window. Lines 48 compute p as the value of P[1 ‥ m] mod q and t_{0} as the value of T [1 ‥ m] mod q. The for loop of lines 914 iterates through all possible shifts s, maintaining the following invariant:

Whenever line 10 is executed, t_{s} = T[s 1‥ s m] mod q.
If p = t_{s} in line 10 (a "hit"), then we check to see if P[1 ‥ m] = T [s 1 ‥ s m] in line 11 to rule out the possibility of a spurious hit. Any valid shifts found are printed out on line 12. If s < n  m (checked in line 13), then the for loop is to be executed at least one more time, and so line 14 is first executed to ensure that the loop invariant holds when line 10 is again reached. Line 14 computes the value of t_{s 1} mod q from the value of t_{s} mod q in constant time using equation (32.2) directly.
RABINKARPMATCHER takes Θ(m) preprocessing time, and its matching time is Θ((n  m 1)m) in the worst case, since (like the naive stringmatching algorithm) the RabinKarp algorithm explicitly verifies every valid shift. If P = a^{m} and T = a^{n}, then the verifications take time Θ((n  m 1)m), since each of the n  m 1 possible shifts is valid.
In many applications, we expect few valid shifts (perhaps some constant c of them); in these applications, the expected matching time of the algorithm is only O((n  m 1) cm) = O(n m), plus the time required to process spurious hits. We can base a heuristic analysis on the assumption that reducing values modulo q acts like a random mapping from Σ* to Z_{q}. (See the discussion on the use of division for hashing in division method. It is difficult to formalize and prove such an assumption, although one viable approach is to assume that q is chosen randomly from integers of the appropriate size. We shall not pursue this formalization here.) We can then expect that the number of spurious hits is O(n/q), since the chance that an arbitrary t_{s} will be equivalent to p, modulo q, can be estimated as 1/q. Since there are O(n) positions at which the test of line 10 fails and we spend O(m) time for each hit, the expected matching time taken by the RabinKarp algorithm is
O(n) O(m(v n/q)),
where v is the number of valid shifts. This running time is O(n) if v = O(1) and we choose q ≥ m. That is, if the expected number of valid shifts is small (O(1)) and the prime q is chosen to be larger than the length of the pattern, then we can expect the RabinKarp procedure to use only O(n m) matching time. Since m ≤ n, this expected matching time is O(n).