[Bioperl-l] Allowing One error in Sequence matching
Abhishek Pratap
abhishek.vit at gmail.com
Wed Sep 16 21:41:50 UTC 2009
Hi All,
I am not able to think of a smart way to do sequence matching while
allowing a user-defined number of mismatches.
For example:
Given the sequence AGCT, a read will be considered a match to the
reference if any one base position (1, 2, 3 or 4) has a mismatch,
where the substituted base can be any of [ACGTN]. For position 1, the
possible matches would be:
AGCT
GGCT
CGCT
TGCT
NGCT
and likewise for each position.
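To make the "likewise for each position" idea concrete, here is a rough sketch in Python (not BioPerl) that builds one regular expression with a wildcard class at each position in turn. The function name one_mismatch_pattern and the default alphabet are assumptions for illustration only.

```python
import re

def one_mismatch_pattern(seq, alphabet="ACGTN"):
    """Build a regex that matches seq with at most one substituted base.

    For each position i, the base at i is replaced by a character class
    over the alphabet; the per-position patterns are joined with '|'.
    """
    wildcard = "[" + alphabet + "]"
    alternatives = [seq[:i] + wildcard + seq[i + 1:] for i in range(len(seq))]
    return re.compile("|".join(alternatives))

pattern = one_mismatch_pattern("AGCT")
# pattern.pattern is "[ACGTN]GCT|A[ACGTN]CT|AG[ACGTN]T|AGC[ACGTN]"
```

This only covers a single mismatch. The number of alternatives grows quickly if more mismatches are allowed, so it is probably practical only for short tags.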
Is there any nice regular expression for this? One way I could think
of was to generate all the possible tags for a given sequence and then
do the matching, but that would be computationally expensive for a
long dataset.
Is there any neat method?
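One possible alternative to generating every variant, sketched here in Python rather than BioPerl, is to slide a window along the reference and count mismatches directly (a Hamming-distance check), stopping early once the limit is exceeded. The names hamming_within and find_approx are hypothetical, and the sketch handles substitutions only, not insertions or deletions.

```python
def hamming_within(a, b, max_mismatches):
    """Return True if equal-length strings a and b differ at no more
    than max_mismatches positions."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False  # stop early once the limit is exceeded
    return True

def find_approx(query, reference, max_mismatches=1):
    """Return 0-based start positions in reference where query matches
    with at most max_mismatches substitutions."""
    k = len(query)
    return [i for i in range(len(reference) - k + 1)
            if hamming_within(query, reference[i:i + k], max_mismatches)]

hits = find_approx("AGCT", "TTAGCTAAGGTCC", max_mismatches=1)
# hits == [2, 7]: an exact match at 2 and AGGT (one mismatch) at 7
```

The cost is roughly the reference length times the query length in the worst case, regardless of how many mismatches are allowed, so no list of variant tags has to be built up front.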
Thanks,
-Abhi