[Bioperl-l] string comparison mismatches and matches

Roopa Raghuveer rtbio.2009 at gmail.com
Wed Feb 3 13:50:07 UTC 2010


Hello all,

I have a problem. I would like to compare two strings of equal length in
Perl and count the number of matches. I have implemented an algorithm that
counts the matches, but I have found it to be time-consuming. Could anyone
suggest a better algorithm for counting the matches and mismatches?

Example: let the two sequences be

input:- ACCTCCTCCTCGAGTATGTG

target:- TATCTTGCGCCGGAGATAAT

Comparing position by position, the number of matches is 5.
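
To pin down what is being counted, here is a minimal sketch of a plain
position-by-position comparison. The subroutine name count_matches is only
illustrative and is not taken from the program discussed in this message.

```perl
use strict;
use warnings;

# Count the positions at which two equal-length strings carry the same
# character. count_matches is a hypothetical name used for illustration.
sub count_matches {
    my ($input, $target) = @_;
    die "strings differ in length\n"
        unless length($input) == length($target);

    my $matches = 0;
    for my $i (0 .. length($input) - 1) {
        $matches++ if substr($input, $i, 1) eq substr($target, $i, 1);
    }
    return $matches;
}

my $input   = 'ACCTCCTCCTCGAGTATGTG';
my $target  = 'TATCTTGCGCCGGAGATAAT';
my $matches = count_matches($input, $target);

# The mismatches are simply the remaining positions.
printf "matches: %d, mismatches: %d\n", $matches, length($input) - $matches;
```

For the two sequences above this reports 5 matches and 15 mismatches.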

In my program, I use two indices that run along the string, one in the
forward direction and one in the reverse direction. At each step I extract
two characters from the input, one at each index, and compare them with the
characters at the same positions in the target sequence, i.e.,

input:- ACCTCCTCCTCGAGTATGTG

I have extracted 'A' in the forward direction and 'G' in the reverse
direction from the input.

target:- TATCTTGCGCCGGAGATAAT

I have extracted 'T' in the forward direction and 'T' in the reverse
direction from the target, and compared 'A' with 'T' and 'G' with 'T'. In
this way, I proceed along the complete length of the string, counting the
matches. It works, but it seems to be time-consuming.
Can anybody suggest a better way?
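
For reference, the two-index scan described above might look roughly like
the first subroutine below. This is a reconstruction from the description,
not the actual program, and both subroutine names are hypothetical. The
second subroutine sketches one commonly used Perl alternative: XOR the two
strings and count the NUL bytes, since identical characters XOR to "\0". It
assumes plain byte strings and that the "bitwise" feature (which makes ^
numeric) is not enabled.

```perl
use strict;
use warnings;

# Two-index scan: one index walks forward, the other backward, and each
# step compares the input and target characters at both positions.
sub count_matches_two_index {
    my ($input, $target) = @_;
    die "strings differ in length\n"
        unless length($input) == length($target);

    my $matches = 0;
    my ($i, $j) = (0, length($input) - 1);
    while ($i < $j) {
        $matches++ if substr($input, $i, 1) eq substr($target, $i, 1);
        $matches++ if substr($input, $j, 1) eq substr($target, $j, 1);
        $i++;
        $j--;
    }
    # For odd lengths the two indices meet on one middle character.
    $matches++
        if $i == $j && substr($input, $i, 1) eq substr($target, $i, 1);
    return $matches;
}

# String XOR: equal characters produce "\0", so counting NUL bytes in the
# result gives the number of matching positions.
sub count_matches_xor {
    my ($input, $target) = @_;
    die "strings differ in length\n"
        unless length($input) == length($target);

    my $xor = $input ^ $target;
    return ($xor =~ tr/\0//);
}

my $input  = 'ACCTCCTCCTCGAGTATGTG';
my $target = 'TATCTTGCGCCGGAGATAAT';
printf "two-index: %d, xor: %d\n",
    count_matches_two_index($input, $target),
    count_matches_xor($input, $target);
```

The two-index version still performs one comparison per position, so it
does the same amount of work as a single forward pass. The XOR version
moves the per-character work out of the Perl-level loop, which is usually
where the time goes; the number of mismatches is then the string length
minus the returned count.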

Regards,
Roopa.


