### Exact String Matching: the brute force method

• The Exact String Matching problem

• Exact String Matching:

• Given:

 A text string T = t_0 t_1 t_2 ... t_{n-1}, with length(T) = n

 A pattern string P = p_0 p_1 p_2 ... p_{m-1}, with length(P) = m

• Problem:

 Determine whether P is a substring of T (and, if so, at which position in T it occurs)

• Example:

```
T = acgttagatactaggatgcca
P = gata

Solution:

    acgttagatactaggatgcca
          gata
```
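As a quick sanity check of the example, Java's standard `String.indexOf` solves the same problem. The sketch below only confirms the expected answer; it says nothing about how the search is carried out. The class name `SubstringCheck` and the method `firstMatch` are made up for this illustration.

```java
class SubstringCheck
{
   // Returns the first position of P in T, or -1 if P is not a substring of T
   // (hypothetical helper: a thin wrapper around the standard String.indexOf)
   static int firstMatch(String T, String P)
   {
      return T.indexOf(P);
   }

   public static void main(String[] args)
   {
      String T = "acgttagatactaggatgcca";
      String P = "gata";

      System.out.println( firstMatch(T, P) );     // prints 6
      System.out.println( T.contains("gatt") );   // prints false
   }
}
```

Positions are counted from 0, so the match starts at the seventh character of T.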

• The brute force approach

• Brute Force:

 Generate every possible outcome

 Test if each outcome is a correct solution

• Pseudo code for any brute force algorithm:

```
for ( every possible outcome i ) do
{
   if ( i is a correct solution )
   {
      print solution i;
      (exit if you only need one solution)
   }
}
```
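The generate-and-test pattern above can be sketched in Java for the simple case where the candidate outcomes are the integers 0, 1, ..., count-1. This is only an illustration: the class `GenerateAndTest`, the method `firstSolution` and the toy problem are assumptions made for this sketch, not part of the course code.

```java
import java.util.function.IntPredicate;

class GenerateAndTest
{
   // Tries the candidates 0, 1, ..., count-1 in order and returns the
   // first one accepted by the test, or -1 if none is accepted
   static int firstSolution(int count, IntPredicate isSolution)
   {
      for ( int i = 0; i < count; i++ )
      {
         if ( isSolution.test(i) )
         {
            return i;        // exit: we only need one solution
         }
      }
      return -1;
   }

   public static void main(String[] args)
   {
      // Toy problem: find a non-negative integer i < 100 with i*i == 49
      System.out.println( firstSolution(100, i -> i*i == 49) );   // prints 7
   }
}
```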

• Brute force exact string matching

• Try every spot in the text string T to find P:

```
T = acgttagatactaggatgcca
P = gata

Try:

    acgttagatactaggatgcca
    gata

    acgttagatactaggatgcca
     gata

    acgttagatactaggatgcca
      gata

    acgttagatactaggatgcca
       gata

    and so on...
```

• Pseudo code:

```
for ( pos = 0; pos <= n-m; pos++ )     // n-m is the last position where P still fits
{
   if ( T[pos..pos+(m-1)] == P )
   {
      return(pos);       // Found match....
   }
}
```

• Java code:

```
public static int BruteForce(String T, String P)
{
   int n = T.length();
   int m = P.length();

   // Try every start position i where P still fits inside T
   // (the last such position is n-m, so the bound must be <=)
   for ( int i = 0; i <= n-m; i++ )
   {
      if ( P.equals( T.substring(i, i+m) ) )
      {
         return( i );     // Found P at position i in T
      }
   }

   return(-1);            // P does not occur in T
}
```
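The generic brute force pseudo code exits early only "if you only need one solution". A variant that keeps going and collects every match could look like the sketch below. The class `BruteForceAll` and the method `findAll` are hypothetical names for this illustration. It uses the standard `String.regionMatches` instead of `substring`, and the loop runs up to and including position n-m, the last position where P still fits in T.

```java
import java.util.ArrayList;
import java.util.List;

class BruteForceAll
{
   // Returns every position i where P occurs in T (overlapping matches included)
   static List<Integer> findAll(String T, String P)
   {
      List<Integer> positions = new ArrayList<>();
      int n = T.length();
      int m = P.length();

      for ( int i = 0; i <= n-m; i++ )
      {
         // Compare T[i..i+(m-1)] with P without creating a new string
         if ( T.regionMatches(i, P, 0, m) )
         {
            positions.add(i);
         }
      }
      return positions;
   }

   public static void main(String[] args)
   {
      System.out.println( findAll("acgttagatactaggatgcca", "ta") );   // prints [4, 8, 11]
   }
}
```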

• Example Program: (Demo above code)

How to run the program:

 Right click on link(s) and save in a scratch directory

 To compile: javac BruteForce.java

 To run: java BruteForce

• Performance of the Brute Force Exact String matching

• Simple loop counting will give:

 Running time = (n-m+1)*m compare operations in the worst case

 (There are n-m+1 positions to try, and you need up to m compare operations to test: T[pos..pos+(m-1)] == P)

 Running time = O(n*m)
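The loop count can be checked by counting character comparisons directly. The sketch below is an assumption-laden illustration, not the course's demo program: the class `CompareCount` and the method `countCompares` are made-up names, and the comparison is done character by character, stopping at the first mismatch. With that early exit, (n-m+1)*m is an upper bound, reached when every position matches the first m-1 characters of P and fails on the last one.

```java
class CompareCount
{
   // Number of character comparisons made by a brute force search that
   // checks every start position and does not stop at the first match
   static long countCompares(String T, String P)
   {
      int n = T.length();
      int m = P.length();
      long compares = 0;

      for ( int pos = 0; pos <= n-m; pos++ )
      {
         for ( int j = 0; j < m; j++ )
         {
            compares++;
            if ( T.charAt(pos+j) != P.charAt(j) )
            {
               break;       // mismatch: give up on this position
            }
         }
      }
      return compares;
   }

   public static void main(String[] args)
   {
      // Worst case: every position matches m-1 characters, then fails
      String T = "aaaaaaaaaa";      // n = 10
      String P = "aab";             // m = 3

      System.out.println( countCompares(T, P) );   // prints 24 = (10-3+1)*3
   }
}
```

On inputs where most positions fail on the first character, the count is close to n-m+1 instead, which is why the bound is stated for the worst case.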

• Can we do better?

 Yes, but it will cost you....

• Another way to say this:

 "Do it" is easy, "do it well" is hard

• My college professor (Herman Bavinck) at TH Delft has a funny way to express this.

He has an interesting law of conservation, like the laws of conservation of matter and energy in physics.

His law is:

• The law of conservation of misery

 No matter which route you take to solve a problem (or prove a theorem - he was a mathematician), it will be equally miserable

Herman Bavinck

• In terms of algorithms:

• If you write a simple but inefficient algorithm, then:

 Your level of misery in writing the code is lower, but your level of misery in waiting for the code to complete is higher.

• If you write a harder but more efficient algorithm, then:

 Your level of misery in writing the code is higher, but your level of misery in waiting for the code to complete is lower.