[Bioperl-l] Allowing One error in Sequence matching

Abhishek Pratap abhishek.vit at gmail.com
Wed Sep 16 21:41:50 UTC 2009


Hi All

I am not able to think of a smart way to do sequence matching allowing
a user-defined number of mismatches.

For example:

Given sequence: AGCT will be considered a match to the reference if any
one base position (1, 2, 3 or 4) has a mismatch, i.e. any of [ACGTN]. So
the possible matches could be:

This is for position 1.
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each position.
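
The example above can be spelled out in a few lines of code. This is only
a rough sketch in Python, not something from the original post: the
function name one_mismatch_variants and the ALPHABET constant are
hypothetical, and N is treated as an ordinary fifth letter.

```python
ALPHABET = "ACGTN"  # assumed alphabet, taken from the [ACGTN] set in the post


def one_mismatch_variants(seq, alphabet=ALPHABET):
    """Return seq itself plus every string differing from it at exactly one position."""
    variants = {seq}
    for i, original in enumerate(seq):
        for base in alphabet:
            if base != original:
                # Substitute a single base at position i, keep the rest unchanged.
                variants.add(seq[:i] + base + seq[i + 1:])
    return variants


variants = one_mismatch_variants("AGCT")
# The position-1 variants are the ones that still end in "GCT".
position_1 = sorted(v for v in variants if v[1:] == "GCT")
print(position_1)
print(len(variants))
```

For a sequence of length L this gives 4*L + 1 strings, so one mismatch
is cheap to enumerate, but the count grows quickly once more than one
mismatch is allowed.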

Is there any nice regular expression for this? One way that I could think
of was to generate all the possible tags for a given sequence and then do
the matching. That will be computationally expensive for a long dataset.
Any neat method?
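
One possible alternative to generating every tag, sketched here in Python
purely as an illustration: slide a window along the reference and count
mismatches directly (a Hamming distance check), stopping as soon as the
allowed number is exceeded. The names hamming_within and find_approx are
hypothetical, the reference string is made up, only substitutions are
handled (no insertions or deletions), and N is compared as a literal
character rather than as a wildcard.

```python
def hamming_within(a, b, max_mismatches):
    """True if equal-length strings a and b differ at no more than max_mismatches positions."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False  # stop early, this window cannot match
    return True


def find_approx(query, reference, max_mismatches=1):
    """Return 0-based start offsets where query matches reference with at most max_mismatches substitutions."""
    k = len(query)
    return [
        i
        for i in range(len(reference) - k + 1)
        if hamming_within(query, reference[i:i + k], max_mismatches)
    ]


# Made-up reference, for illustration only.
hits = find_approx("AGCT", "TTAGCTAAGGCTNGCA", max_mismatches=1)
print(hits)
```

The cost is roughly (reference length) x (query length) character
comparisons in the worst case, and it does not depend on how many
variants the allowed mismatches would produce.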

Thanks,
-Abhi


