Finding Z value of a string is requires for Z algorithm. Z algorithm is one of the linear string matching algorithm. As
pre-processing of Z algorithm we find Z value of given string. We will denote string by S an arbitrary string. Let S be a string and i >1 one of its positions.
Starting at i, we consider all the substrings S[i..j], where i ≤ j ≤
|S|, so that S[i..j] matches a prefix of S itself. Among all the
possible values j may take, select the greatest one, max(j) and denote
the number [max(j) − i + 1] by Zi(S). If S[i] is different from S[1],
then such a j does not exist and Zi(S) is set to 0.
I
think it is very hard to understand. Lets consider finding Z value
using an example. Finding Z value starts from 2 to length of the string.
That means if our string is "ABCDEF", we only find Z2 to Z6 except Z1.
Because we can not find value for Z1.Lets get our string as S = AABCDAABCXYAABCDAABCDX
Zi is equal to the length of the longest common prefix of S string and S[i to |S|] string.
Finding Z2
Devide string in to two sub-strings (first letter as a sub-string and rest as another sub-string)
S1 = AABCDAABCXYAABCDAABCDX
S2 = ABCDAABCXYAABCDAABCDX
Then find length of the longest common prefixes of these two substrings. Look How to find prefixes of a string if you are not well with prefixes.
Prefix of S1 sub-string - A, AA , AAB , AABC , AABCD , AABCDA , AABCDAA , AABCDAAB , AABCDAABC , AABCDAABCX , AABCDAABCXY , AABCDAABCXYA , AABCDAABCXYAA , AABCDAABCXYAAB , AABCDAABCXYAABC , AABCDAABCXYAABCD , AABCDAABCXYAABCDA , AABCDAABCXYAABCDAA , AABCDAABCXYAABCDAAB , AABCDAABCXYAABCDAABC , AABCDAABCXYAABCDAABCD , AABCDAABCXYAABCDAABCDX
Prefix of S2 substring - A , AB , ABC , ABCD , ABCDA , ABCDAA , ABCDAAB , ABCDAABC , ABCDAABCX , ABCDAABCXY , ABCDAABCXYA , ABCDAABCXYAA , ABCDAABCXYAAB , ABCDAABCXYAABC , ABCDAABCXYAABCD , ABCDAABCXYAABCDA , ABCDAABCXYAABCDAA , ABCDAABCXYAABCDAAB , ABCDAABCXYAABCDAABC , ABCDAABCXYAABCDAABCD , ABCDAABCXYAABCDAABCDX
Highlighted by green are the common prefixes of these two sub-strings. Length of the longest common prefix become as Z value of that position. Therefore, Z2 = 1
Finding Z3
Devide string in to two sub-strings (first two letters as a sub-string and rest as another sub-string)
S1 = AABCDAABCXYAABCDAABCDX
S2 = BCDAABCXYAABCDAABCDX
Then find length of the longest common prefixes of these two substrings.
Prefix of S1 sub-string - A, AA , AAB , AABC , AABCD , AABCDA , AABCDAA , AABCDAAB , AABCDAABC , AABCDAABCX , AABCDAABCXY , AABCDAABCXYA , AABCDAABCXYAA , AABCDAABCXYAAB , AABCDAABCXYAABC , AABCDAABCXYAABCD , AABCDAABCXYAABCDA , AABCDAABCXYAABCDAA , AABCDAABCXYAABCDAAB , AABCDAABCXYAABCDAABC , AABCDAABCXYAABCDAABCD , AABCDAABCXYAABCDAABCDX
Prefix of second substring - B , BC ,BCD , BCDA , BCDAA , BCDAAB , BCDAABC , BCDAABCX , BCDAABCXY , BCDAABCXYA , BCDAABCXYAA , BCDAABCXYAAB , BCDAABCXYAABC , BCDAABCXYAABCD , BCDAABCXYAABCDA , BCDAABCXYAABCDAA , BCDAABCXYAABCDAAB , BCDAABCXYAABCDAABC , BCDAABCXYAABCDAABCD , BCDAABCXYAABCDAABCDX
There are no any common prefixes in these two sub-strings. Therefore, Z3 = 0
Likewise we can find other Z values.
Finding Z4
S1 = AABCDAABCXYAABCDAABCDX
S2 = CDAABCXYAABCDAABCDX Z4=0
Finding Z5
S1 = AABCDAABCXYAABCDAABCDX
S2 = DAABCXYAABCDAABCDX Z5=0
Finding Z6
S1 = AABCDAABCXYAABCDAABCDX
S2 = AABCXYAABCDAABCDX Z6=4
Finding Z7
S1 = AABCDAABCXYAABCDAABCDX
S2 = ABCXYAABCDAABCDX Z7=1
Finding Z8
S1 = AABCDAABCXYAABCDAABCDX
S2 = BCXYAABCDAABCDX Z8=0
Finding Z9
S1 = AABCDAABCXYAABCDAABCDX
S2 = CXYAABCDAABCDX Z9=0
Finding Z10
S1 = AABCDAABCXYAABCDAABCDX
S2 = XYAABCDAABCDX Z10=0
Finding Z11
S1 = AABCDAABCXYAABCDAABCDX
S2 = YAABCDAABCDX Z11=0
Finding Z12
S1 = AABCDAABCXYAABCDAABCDX
S2 = AABCDAABCDX Z12=9
Finding Z13
S1 = AABCDAABCXYAABCDAABCDX
S2 = ABCDAABCDX Z13=1
Finding Z14
S1 = AABCDAABCXYAABCDAABCDX
S2 = BCDAABCDX Z14=0
Finding Z15
S1 = AABCDAABCXYAABCDAABCDX
S2 = CDAABCDX Z15=0
Finding Z16
S1 = AABCDAABCXYAABCDAABCDX
S2 = DAABCDX Z16=0
Finding Z17
S1 = AABCDAABCXYAABCDAABCDX
S2 = AABCDX Z17=5
Finding Z18
S1 = AABCDAABCXYAABCDAABCDX
S2 = ABCDX Z18=1
Finding Z19
S1 = AABCDAABCXYAABCDAABCDX
S2 = BCDX Z19=0
Finding Z20
S1 = AABCDAABCXYAABCDAABCDX
S2 = CDX Z20=0
Finding Z21
S1 = AABCDAABCXYAABCDAABCDX
S2 = DX Z21=0
Finding Z22
S1 = AABCDAABCXYAABCDAABCDX
S2 = X Z22=0
Very useful
ReplyDelete