Brute-Force Algorithm

The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either
- a match is found, or
- all placements of the pattern have been tried

Brute-force pattern matching runs in time O(nm)

Example of worst case:
- T = aaa … ah
- P = aaah
- may occur in images and DNA sequences
- unlikely in English text

function BruteForceMatch(T, P, m, n)
  Input text T of size n and pattern P of size m
  Output starting index of a substring of T equal to P or -1 if no such substring exists

  for (i = 0; i <= n - m; i++) {   /* shifts past n - m would read beyond the end of T */
    /* test shift i of the pattern */
    j = 0;
    while (j < m && T[i + j] == P[j])
      j = j + 1;
    if (j == m)
      return i;   /* match at i */
  }
  return -1;   /* no match */
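The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the name `brute_force_match` and the returned comparison counter are assumptions added here so the O(nm) behaviour on the worst-case input can be observed.

```python
def brute_force_match(text, pattern):
    """Return (index, comparisons).

    index is the start of the first occurrence of pattern in text, or -1.
    comparisons counts character comparisons (added for illustration only).
    """
    n, m = len(text), len(pattern)
    comparisons = 0
    # Only shifts 0 .. n - m leave room for the whole pattern.
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            return i, comparisons  # match at shift i
    return -1, comparisons  # no match


if __name__ == "__main__":
    # Worst-case shape from the slide: a run of a's ending in h.
    print(brute_force_match("aaaaaaah", "aaah"))
```

On an input of this shape every shift compares all m characters before failing on the last one, which is where the nm factor comes from.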
Pattern Matching 3 Pattern Matching 4
Brute Force-Complexity (cont.)

Given a pattern M characters in length, and a text N characters in length...
Best case if pattern not found: always mismatch on the first character. For example, M = 5.
- Total number of comparisons: at most N
- Best case time complexity: O(N)

Boyer-Moore’s Algorithm (1)

The Boyer-Moore pattern matching algorithm is based on two heuristics
Looking-glass heuristic: Compare P with a subsequence of T moving backwards
Character-jump heuristic: When a mismatch occurs at T[i] = c
- If P contains c, shift P to align the last occurrence of c in P with T[i]
- Else, shift P to align P[0] with T[i + 1]

Example (figure): the text "a pattern matching algorithm" is searched for the pattern "rithm"; the figure numbers the character comparisons from 1 to 11.
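The two heuristics can be sketched in Python as below. This is a simplified illustration that uses only the looking-glass and character-jump rules described above; the names `last_occurrence` and `boyer_moore_match`, the dictionary used for the last-occurrence lookup, and the comparison counter are assumptions made for this sketch.

```python
def last_occurrence(pattern):
    """Map each character of pattern to the index of its last occurrence."""
    return {c: k for k, c in enumerate(pattern)}


def boyer_moore_match(text, pattern):
    """Return (index, comparisons) using only the two heuristics above.

    index is the start of the first match or -1; comparisons is an
    illustrative counter of character comparisons.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return 0, 0
    last = last_occurrence(pattern)
    comparisons = 0
    i = j = m - 1  # looking-glass: start at the end of the pattern
    while i < n:
        comparisons += 1
        if text[i] == pattern[j]:
            if j == 0:
                return i, comparisons  # whole pattern matched
            i -= 1
            j -= 1
        else:
            # character-jump: align the last occurrence of text[i] in the
            # pattern with text[i], or jump past text[i] if it is absent (-1)
            l = last.get(text[i], -1)
            i += m - min(j, 1 + l)
            j = m - 1
    return -1, comparisons


if __name__ == "__main__":
    print(boyer_moore_match("a pattern matching algorithm", "rithm"))
```

On the slide's example text the sketch finds the pattern at index 23 after 11 comparisons, which agrees with the numbering in the figure.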
Example

Figure: Boyer-Moore’s algorithm applied to the text "abacaabadcabacabaabb" and the pattern "abacab"; the figure numbers the character comparisons from 1 to 13.

Analysis

Boyer-Moore’s algorithm runs in time O(nm + s), where s is the size of the alphabet
Example of worst case:
- T = aaa … a
- P = baaa
The worst case may occur in images and DNA sequences but is unlikely in English text
Boyer-Moore’s algorithm is significantly faster than the brute-force algorithm on English text

Figure (worst case): the pattern "baaaaa" is shifted one position at a time along a text of a's; the figure numbers the character comparisons from 1 to 24, six per alignment, from right to left.
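The comparison count in the worst-case figure can be checked by hand. As a sketch, assume the text in the figure consists of nine a's (n = 9) and the pattern is baaaaa (m = 6). Every alignment matches the m − 1 trailing a's before failing on the leading b, and the character-jump heuristic then moves the pattern by only one position, so all n − m + 1 alignments are tried:

```latex
\[
\underbrace{(n - m + 1)}_{\text{alignments}} \times
\underbrace{m}_{\text{comparisons per alignment}}
= (9 - 6 + 1) \times 6 = 24
\]
```

This agrees with the last comparison number shown in the figure and illustrates the nm term in the running time.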
KMP’s Algorithm (1)

Knuth-Morris-Pratt’s algorithm preprocesses the pattern to find matches of prefixes of the pattern with the pattern itself
The failure function F(j) is defined as the size of the largest prefix of P[0..j] that is also a suffix of P[1..j]
Knuth-Morris-Pratt’s algorithm modifies the brute-force algorithm so that if a mismatch occurs at P[j] ≠ T[i] we set j ← F(j − 1)

j     0  1  2  3  4  5
P[j]  a  b  a  a  b  a
F(j)  0  0  1  1  2  3

Figure: the pattern a b a a b a placed under the text fragment . . a b a a b x . . . . . with a mismatch at position j; the pattern is then shifted so that matching resumes at position F(j − 1).

KMP’s Algorithm (2)

The failure function can be represented by an array and can be computed in O(m) time

function FailureFunction(P)
  /* m is the length of P */
  i = 1;
  j = 0;
  F[0] = 0;
  while (i < m) {
    if (P[i] == P[j]) {
      /* j + 1 characters of the prefix have matched */
      F[i] = j + 1;
      i++;
      j++;
    }
    else if (j > 0)
      /* fall back to the next shorter matching prefix */
      j = F[j - 1];
    else {
      /* no prefix matches here */
      F[i] = 0;
      i++;
    }
  }
  return F;
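The failure function and the mismatch rule j ← F(j − 1) can be sketched together in Python. `failure_function` follows the pseudocode above; `kmp_match` is not spelled out in the slides and is an assumed, conventional way of applying that rule, so both names and the exact loop structure are illustrative only.

```python
def failure_function(pattern):
    """Return the list F where F[j] is the failure function value at j."""
    m = len(pattern)
    f = [0] * m
    i, j = 1, 0
    while i < m:
        if pattern[i] == pattern[j]:
            f[i] = j + 1  # j + 1 characters of the prefix have matched
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]  # fall back to the next shorter matching prefix
        else:
            f[i] = 0  # no prefix matches here
            i += 1
    return f


def kmp_match(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    f = failure_function(pattern)
    i = j = 0
    while i < n:
        if text[i] == pattern[j]:
            if j == m - 1:
                return i - j  # match ends at i, so it starts at i - j
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]  # mismatch: reuse the prefix already matched
        else:
            i += 1  # mismatch at the first pattern character
    return -1


if __name__ == "__main__":
    print(failure_function("abaaba"))
    print(kmp_match("abacaabadcabacabaabb", "abacab"))
```

For the pattern abaaba the sketch reproduces the F(j) row of the table. Because i never moves backwards in `kmp_match`, each text character is examined without rescanning earlier text.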