Sunday, October 21, 2012

Rabin-Karp String Matching Algorithm

     The Rabin-Karp algorithm is a string matching algorithm and an improved version of the naive, or brute force, string matching algorithm. Brute force matching is the most basic substring matching algorithm, and it has some good points. For example, it doesn't require preprocessing of the text or the pattern. The problem is that it's very slow. That is why in many cases brute force matching isn't very useful.
     In 1987 Michael O. Rabin and Richard M. Karp came up with the idea of hashing the pattern and checking it against a hashed substring from the text. The Rabin-Karp algorithm avoids comparing every character of the pattern with the characters of the text at each position. To do that, it uses hashing. Hashing is the process of converting data into a numerical value. To do this, we assign a numerical value to each character in the alphabet, and then any hash function can be used to calculate the hash value of a pattern.
As an example, take the pattern P = AABCB and use the ASCII values of its characters.
The ASCII values are A = 65, B = 66, C = 67, and we can get a hash value for the pattern using any hash function.
Hash function 1 (sum of the character values)
     hash value h(P) = 65+65+66+67+66 = 329
Hash function 2 (each character value multiplied by its position)
     hash value h(P) = 65*1+65*2+66*3+67*4+66*5 = 991
As shown above, you can use any hash function to calculate hash values.
If two strings are equal, their hash values must be the same. But if the hash values of two strings are equal, we can't say that the two strings are equal: they may or may not be. This is the basic idea of the Rabin-Karp algorithm.
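The two hash functions above can be sketched in C as follows. This is only an illustration: the function names hash_sum and hash_weighted are made up for this example, and the character values are assumed to be ASCII codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hash function 1: plain sum of the character codes. */
unsigned long hash_sum(const char *s)
{
    assert(s != NULL);
    unsigned long h = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        h += (unsigned char)s[i];
    return h;
}

/* Hash function 2: each character code multiplied by its 1-based position. */
unsigned long hash_weighted(const char *s)
{
    assert(s != NULL);
    unsigned long h = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        h += (unsigned long)(unsigned char)s[i] * (i + 1);
    return h;
}
```

With these, hash_sum("AABCB") gives 329 and hash_weighted("AABCB") gives 991, as computed above. The strings "AB" and "BA" show why equal hash values do not prove equal strings: hash_sum gives 131 for both, while hash_weighted gives 197 and 196.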

Algorithm
    Compute hash value of pattern [h(P)]
    Compute hash value of substring of text [h(t)]
    If h(P) = h(t)
      Compare pattern and substring character by character
      If mismatch found
         move by one position and go to second step
      Else
         matching substring found in the text
    Else
      move by one position and go to second step
Let's learn the algorithm using an example.
     Alphabet S = {A,B,C,D,E,F,G,H}
     Text T = AABDCFFACGABBCABHGADEAADG
     Pattern P = BBCABHG

Here I am going to assign each character its position in the alphabet instead of its ASCII value.
Value(A) = 1
Value(B) = 2
Value(C) = 3
Value(D) = 4
Value(E) = 5
Value(F) = 6
Value(G) = 7
Value(H) = 8
     Then the hash value of the pattern is h(P) = 2+2+3+1+2+8+7 = 25




Hash value of [AABDCFF] is 23. Hash values are not equal. Substring is not matched. Move by one position.
Hash value of [ABDCFFA] is 23. Hash values are not equal. Substring is not matched. Move by one position.

Hash value of [BDCFFAC] is 25. Hash values are equal. Substring may be matched. Compare the pattern with the substring in the text.
The first letter B matches, but the second letter does not (D in the text, B in the pattern). Move by one position.
Hash value of [DCFFACG] is 30. Hash values are not equal. Substring is not matched. Move by one position.
Hash value of [CFFACGA] is 27. Hash values are not equal. Substring is not matched. Move by one position.
Hash value of [FFACGAB] is 26. Hash values are not equal. Substring is not matched. Move by one position.
Continuing like this, you find the matching substring [BBCABHG], whose hash value is 25, starting at the 12th character of the text.
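The window hash does not have to be recomputed from scratch at every position. With the additive hash used in this example, the next value can be obtained by subtracting the letter that leaves the window and adding the letter that enters it. The sketch below assumes the A=1 ... H=8 values from the example; the names letter_value and window_hashes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Letter values from the example: A=1, B=2, ..., H=8. */
static int letter_value(char c)
{
    return c - 'A' + 1;
}

/*
 * Fill out[0..n-m] with the additive hash of every length-m window of text.
 * Returns the number of windows written (0 if m is longer than the text).
 * The caller must provide room for n-m+1 entries in out.
 */
size_t window_hashes(const char *text, size_t m, int *out)
{
    assert(text != NULL && out != NULL && m > 0);
    size_t n = strlen(text);
    if (m > n)
        return 0;

    int h = 0;
    for (size_t i = 0; i < m; i++)           /* first window, computed in full */
        h += letter_value(text[i]);
    out[0] = h;

    for (size_t s = 1; s + m <= n; s++) {
        h -= letter_value(text[s - 1]);      /* letter leaving the window */
        h += letter_value(text[s + m - 1]);  /* letter entering the window */
        out[s] = h;
    }
    return n - m + 1;
}
```

For the text above and a window length of 7, the first three entries are 23, 23 and 25, and the entry at offset 11 (the 12th character), where [BBCABHG] starts, is 25.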
C implementation of Rabin-Karp Algorithm
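A minimal sketch of the steps in the Algorithm section could look like this. It assumes the simple additive hash (sum of character codes) from the examples and updates the window hash incrementally as the window slides; the names hash_prefix and rabin_karp_search are hypothetical. A more robust version would normally use a stronger hash, such as a polynomial hash modulo a prime, to reduce false hash matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Additive hash of the first len characters of s (sum of character codes). */
static unsigned long hash_prefix(const char *s, size_t len)
{
    unsigned long h = 0;
    for (size_t i = 0; i < len; i++)
        h += (unsigned char)s[i];
    return h;
}

/*
 * Return the 0-based index of the first occurrence of pattern in text,
 * or -1 if there is none.
 */
long rabin_karp_search(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    size_t n = strlen(text);
    size_t m = strlen(pattern);
    if (m == 0)
        return 0;
    if (m > n)
        return -1;

    unsigned long hp = hash_prefix(pattern, m);  /* h(P) */
    unsigned long ht = hash_prefix(text, m);     /* h(t) of the first window */

    for (size_t s = 0; ; s++) {
        /* Equal hashes only mean a possible match, so verify the characters. */
        if (hp == ht && memcmp(text + s, pattern, m) == 0)
            return (long)s;
        if (s + m >= n)
            break;                               /* that was the last window */
        /* Move by one position: drop text[s], add text[s + m]. */
        ht -= (unsigned char)text[s];
        ht += (unsigned char)text[s + m];
    }
    return -1;
}
```

For the example above, rabin_karp_search("AABDCFFACGABBCABHGADEAADG", "BBCABHG") returns 11, the 0-based offset of the match.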

Complexity

The Rabin-Karp algorithm has a worst-case complexity of O(nm), where n is the length of the text and m is the length of the pattern. So how does it compare to brute force matching? Well, brute force matching complexity is also O(nm), so it seems there is not much gain in performance. However, Rabin-Karp's expected complexity in practice is O(n+m), because with a good hash function false hash matches are rare, and that makes it faster, as shown on the chart below.
Rabin-Karp's worst-case complexity is O(nm), but in practice it's O(n+m)!
Note that the Rabin-Karp algorithm also needs O(m) preprocessing time to hash the pattern.
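One rough way to see the difference between the worst case and the usual case is to count the character comparisons made when hash values are equal. The sketch below does that over all windows of the text, assuming the additive hash from the examples; verification_comparisons is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the character comparisons made while verifying hash hits
 * over every window of text (a mismatching comparison is counted too).
 */
size_t verification_comparisons(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    size_t n = strlen(text);
    size_t m = strlen(pattern);
    if (m == 0 || m > n)
        return 0;

    unsigned long hp = 0, ht = 0;
    for (size_t i = 0; i < m; i++) {
        hp += (unsigned char)pattern[i];
        ht += (unsigned char)text[i];
    }

    size_t comparisons = 0;
    for (size_t s = 0; ; s++) {
        if (hp == ht) {
            /* Hash hit: compare until the first mismatch or the pattern end. */
            for (size_t j = 0; j < m; j++) {
                comparisons++;
                if (text[s + j] != pattern[j])
                    break;
            }
        }
        if (s + m >= n)
            break;
        ht -= (unsigned char)text[s];      /* letter leaving the window */
        ht += (unsigned char)text[s + m];  /* letter entering the window */
    }
    return comparisons;
}
```

On the example text and pattern it returns 9: 2 comparisons for the false hit at [BDCFFAC] plus 7 for the real match. On a text of ten A characters with the pattern AAA every window is a hit, so it returns 8 * 3 = 24, which is the (n-m+1)*m worst case.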

Advantages

  1. Not faster than brute force matching in theory, but in practice its complexity is O(n+m)
  2. With a good hashing function it can be quite effective, and it's easy to implement!
  3. Multiple pattern matching support
  4. Good for plagiarism detection, because it can deal with multiple pattern matching!
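Multiple pattern matching can be sketched by hashing every pattern once and checking each window hash against all of the pattern hashes during a single pass over the text. The version below is a simplified illustration under several assumptions: all patterns have the same length m, the additive hash from the examples is used, and the pattern hashes are scanned linearly (a larger pattern set would usually go into a hash table). MAX_PATTERNS, sum_hash and count_multi_matches are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATTERNS 16  /* arbitrary limit for this sketch */

/* Additive hash of the first len characters of s. */
static unsigned long sum_hash(const char *s, size_t len)
{
    unsigned long h = 0;
    for (size_t i = 0; i < len; i++)
        h += (unsigned char)s[i];
    return h;
}

/*
 * Scan text once and return how many windows match any of the k patterns.
 * Every pattern must have length m.
 */
size_t count_multi_matches(const char *text, const char *const patterns[],
                           size_t k, size_t m)
{
    assert(text != NULL && patterns != NULL);
    assert(k <= MAX_PATTERNS && m > 0);
    size_t n = strlen(text);
    if (m > n)
        return 0;

    unsigned long hp[MAX_PATTERNS];
    for (size_t p = 0; p < k; p++) {
        assert(strlen(patterns[p]) == m);
        hp[p] = sum_hash(patterns[p], m);   /* hash each pattern once */
    }

    unsigned long ht = sum_hash(text, m);
    size_t matches = 0;
    for (size_t s = 0; ; s++) {
        for (size_t p = 0; p < k; p++) {
            /* Verify characters only when the hashes agree. */
            if (hp[p] == ht && memcmp(text + s, patterns[p], m) == 0) {
                matches++;
                break;                      /* count each window once */
            }
        }
        if (s + m >= n)
            break;
        ht -= (unsigned char)text[s];
        ht += (unsigned char)text[s + m];
    }
    return matches;
}
```

For the example text AABDCFFACGABBCABHGADEAADG and the three patterns ABH, ADE and FFA (m = 3), the function returns 3, since each pattern occurs exactly once.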

Disadvantages

  1. There are lots of string matching algorithms that are faster than O(n+m)
  2. In the worst case it's as slow as brute force matching, and it requires additional space
Rabin-Karp is a great algorithm for one simple reason: it can be used to match against multiple patterns. This makes it well suited to detecting plagiarism, even for larger phrases.

Friday, October 19, 2012

Computer Boot Up Process

What is computer booting?
        The process of bringing up the operating system for users is called computer booting.
This is a simple definition. But behind the term computer boot, there is a sequence of activities carried out by your computer.
Following is the booting process of Linux operating systems.
What happens when you press the power button of your computer?
        After you press the power button, power first goes to the power unit of your machine. The power unit then provides the amount of power needed by each device on the motherboard and sends a signal to the BIOS (Basic Input Output System) chip on the motherboard. The BIOS chip then starts a process called POST (Power On Self Test). During POST, the BIOS chip:
  1. Checks whether all the devices receive the right amount of power and whether memory is corrupted.
  2. Then checks the boot order and, according to the boot order, finds the boot loader.
The boot loader is a small program that sits in the first sector, that is the first 512 bytes, of your external memory. That may be a floppy drive, CD-ROM or hard disk. Here is a simple boot loader in assembly. visit
If the BIOS finds a boot loader in external memory, the first sector (the first 512 bytes) of that device is loaded into main memory (RAM) so that the first instruction of the boot loader is at memory address 0x7C00 in RAM.
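One detail not covered above: on a legacy PC BIOS, a sector is conventionally treated as bootable only if its last two bytes are 0x55 and 0xAA. The following sketch checks a 512-byte buffer for that signature; has_boot_signature and SECTOR_SIZE are names chosen for this illustration, and reading the sector from a real device is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512  /* size of the first sector, as described above */

/* Return 1 if the 512-byte sector ends with the boot signature 0x55 0xAA. */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    assert(sector != NULL);
    return sector[SECTOR_SIZE - 2] == 0x55 && sector[SECTOR_SIZE - 1] == 0xAA;
}
```

A buffer of 512 zero bytes fails the check, and it passes once bytes 510 and 511 are set to 0x55 and 0xAA.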
After the boot loader is loaded into RAM, the address of its first instruction (0x7C00) is sent to the PC (Program Counter). The CPU then executes the boot loader. While the boot loader is running, it searches for the kernel on the hard disk. If the kernel is found, it is loaded into RAM and the address of the first instruction of the kernel is sent to the PC. The CPU then starts executing the kernel.
Then the kernel loads a user program called "INIT" and hands over the machine to the user. INIT is the parent process of all the other processes.
INIT's first job is to make sure your disks are working normally. Then INIT starts several daemons. Daemons are programs that run in the background. As a first daemon, INIT starts a program called "GETTY". This program prepares the machine for users. Nowadays INIT starts several copies of "GETTY" (7 or 8), which is why you have several virtual consoles. Then INIT starts several more daemons, such as networking and other services.
The most important daemon is the X server. The X server program manages your display, keyboard and mouse. Its main job is to produce the color pixel graphics that you normally see on your screen. The last part of the boot process is the graphical login, which is produced by a program called "Display Manager".