Wednesday, September 12, 2012

Brute Force(Naive) String Matching Algorithm

           When we talk about a string matching algorithm, every one can get a simple string matching technique. That is starting from first letters of the text and first letter of the pattern check whether these two letters are equal. if it is, then check second letters of the text and pattern. If it is not equal, then move first letter of the pattern to the second letter of the text. then check these two letters. this is the simple technique everyone can thought.
          Brute Force string matching algorithm is also like that. Therefore we call that as Naive string matching algorithm. Naive means basic. Lets learn this method using an example.

Let our text (T) as,
                    THIS IS A SIMPLE EXAMPLE
and our pattern (P) as,
                     SIMPLE

Red Boxes-Mismatch                          Green Boxes-Match
 In above red boxes says mismatch letters against letters of the text and green boxes says match letters against letters of the text. According to the above
      In first raw we check whether first letter of the pattern is matched with the first letter of the text. It is mismatched, because "S" is the first letter of pattern and "T" is the first letter of text. Then we move the pattern by one position.  Shown in second raw.
    Then check first letter of the pattern with the second letter of text. It is also mismatched. Likewise we continue the checking and moving process. In fourth raw we can see first letter of the pattern matched with text. Then we do not do any moving but we increase testing letter of the pattern. We only move the position of pattern by one when we find mismatches. Also in last raw, we can see all the letters of the pattern matched with the some letters of the text continuously.
                   
You can find C Source Code for Brute Force String Matching Algorithm from following link.
                               Brute Force String Matching Algorithm

Running Time Analysis Of  Brute Force String Matching Algorithm

     let length of the text as n (|text| = n) and length of the pattern as m (|pattern| = m).

In worst case, in each position  we should have to make m comparisons.
Also we should have to make these m comparisons in (n-m+1) positions.
Therefore all the comparisons in total string matching process is {m*(n-m+1)}.
Therefore running time of Brute Force String Matching Algorithm is O(m(n-m+1)). Simply we say that as O(nm) if n<<<m.

Fro the above example, |text|= 24. but we do comparisons only for first 16 letters. Then we can get |text|=  16 and |pattern|=6.
               In worst case we should have to compare these 6 letters of the pattern with 6 letters of the text in each position. Also we can see there are 11 positions for first 16 letters of text. that is (|text|-|pattern|+1) = 11. Positions we can find using number of raw's in above example.
       Therefore, all the comparisons for above example is {|pattern|*[|text|-|pattern|+1]}. This means, to match our pattern with first 16 letters of text we should have to make 6*11 comparisons. But if |text|<<<|pattern|, we simply get all the comparison by |text|*|pattern|. Therefore running time should be O(|pattern|*|text|).

Advantages

  1. Very simple technique and also that does not require any preprocessing. Therefore total running time is the same as its matching time.

Disadvantages

  1. Very inefficient method. Because this method takes only one position movement in each time.

4 comments: