[Bioperl-l] Fuzzy matching sequences

Jurgen Pletinckx jurgen.pletinckx@algonomics.com
Fri, 16 Mar 2001 12:29:16 +0100


# From: ... Kris Boulez
# 
# ... if there are sequences containing the x-mer when allowing
# one point mutation (anywhere in the overlap). To make things even worse:
# some of these sequences contain N .
# 
# Do people know of a way to do this using perl regexes (I couldn't find
# anything in the Friedl book). The only solution I can think of is to
# create a blastable DB from the x-mers and blast all sequences against
# this DB. 

Not a regex solution, but this sounds like a job for String::Approx
(disclaimer: I haven't used it myself).

http://search.cpan.org/search?dist=String-Approx

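It seems you will have to fiddle a bit with the allowed edit distance
because of your ambiguous bases, though: an N in a sequence is just
another character to an approximate matcher, so each N counts as a
mismatch against the x-mer unless you allow for it.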
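
If only point mutations (substitutions) matter, a plain sliding-window
comparison may be enough. Below is a minimal sketch of that idea in
Python. It is not String::Approx and not a regex, just a mismatch count
per window. The function names, the max_mismatches parameter and the
choice to treat N as matching any base are my own assumptions for
illustration; insertions and deletions are not handled.

```python
def count_mismatches(kmer, window, wildcard="N"):
    """Count positions where two equal-length strings differ.

    Assumption: a position holding the wildcard (N) on either side
    is treated as a match, so it never adds to the count.
    """
    return sum(
        1
        for a, b in zip(kmer, window)
        if a != b and wildcard not in (a, b)
    )


def find_fuzzy(kmer, sequence, max_mismatches=1, wildcard="N"):
    """Return (start, mismatches) for every window of the sequence
    that matches the x-mer with at most max_mismatches substitutions."""
    kmer = kmer.upper()
    sequence = sequence.upper()
    k = len(kmer)
    hits = []
    # Slide a window of length k over the sequence, one base at a time.
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        mismatches = count_mismatches(kmer, window, wildcard)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    print(find_fuzzy("ACGT", "GGAGGTGG"))  # one substitution allowed
    print(find_fuzzy("ACGT", "TTACNTTT"))  # N treated as a wildcard
```

This costs roughly (sequence length) x (x-mer length) comparisons per
x-mer, so for a large set of x-mers an indexed approach such as the
BLAST database Kris mentions may still be the better fit.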

-- 
Jurgen Pletinckx
Algonomics NV