[Biopython] Get all alignments of a sequence against another

Kevin Rue kevin.rue at ucdconnect.ie
Fri Mar 14 10:57:36 UTC 2014


Cheers!

(Man, we're a team ;-) )

Kevin


On 14 March 2014 10:53, Tal Einat <taleinat at gmail.com> wrote:

> On Fri, Mar 14, 2014 at 11:16 AM, Kevin Rue <kevin.rue at ucdconnect.ie>
> wrote:
> > >>> import fuzzysearch
> > >>> fuzzysearch.find_near_matches_with_ngrams("GGGTTLTTSS", "XXXXXXXXXXXXXXXXXXXGGGTTVTTSSAAAAAAAAAAAAAGGGTTLTTSSAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBGGGTTLTTSS", 1)
> >
> > The call returns two matches:
> > Out[7]: [Match(start=89, end=99, dist=0), Match(start=89, end=99, dist=0)]
> >
> > BUG:
> > I noticed that the second match is reported twice. I assume this is a bug
> > where the first match was somehow replaced by the second, which is why I
> > copied Tal (the developer of this package) on this email.
> >
> > Another example, where I added your sequence (with a mismatch) a third time:
> >
> > >>> fuzzysearch.find_near_matches_with_ngrams("GGGTTLTTSS", "XXXXXXXXXXXXXXXXXXXGGGTTVTTSSAAAAAAAAAAAAAGGGTTVTTSSAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBGGGTTLTTSS", 1)
> >
> > returns
> > Out[9]:
> > [Match(start=42, end=52, dist=1),
> >  Match(start=99, end=109, dist=0),
> >  Match(start=99, end=109, dist=0)]
> >
> > You can see three matches. One of the mismatched sequences was detected
> > correctly (edit distance of 1), but the bug seems to duplicate the last
> > match and overwrite the match before it.
> >
> > Tal, can you fix that? I will add the issue to your repository :)
>
> Thanks for bringing this to my attention! Fixed.
>
> Upgrade to version 0.2.1 and your example will work as expected.
>
> (To upgrade, run: pip install --upgrade fuzzysearch)
>
> - Tal Einat
>
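
For anyone reading this thread later, here is a minimal sketch of what a
near-match search with at most one mismatch is expected to return. It is not
fuzzysearch's implementation: it only counts substitutions (Hamming distance)
over a sliding window, and both the Match tuple and the function name
find_near_matches_hamming are made up for illustration. The short test string
is also my own, not the one from the examples quoted above.

```python
from collections import namedtuple

# Hypothetical result type for this sketch; it only mirrors the fields shown
# in the quoted output and is not fuzzysearch's own Match class.
Match = namedtuple("Match", ["start", "end", "dist"])


def find_near_matches_hamming(pattern, text, max_dist):
    """Return every window of text within max_dist substitutions of pattern."""
    matches = []
    width = len(pattern)
    for start in range(len(text) - width + 1):
        window = text[start:start + width]
        # Count the positions where the window and the pattern disagree.
        dist = sum(1 for a, b in zip(pattern, window) if a != b)
        if dist <= max_dist:
            matches.append(Match(start, start + width, dist))
    return matches


# One occurrence with a single mismatch (V instead of L), then an exact one.
text = "XXGGGTTVTTSSAAGGGTTLTTSS"
for match in find_near_matches_hamming("GGGTTLTTSS", text, 1):
    print(match)
```

This prints Match(start=2, end=12, dist=1) followed by
Match(start=14, end=24, dist=0): each occurrence is reported exactly once,
with its own start and end.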



-- 
Kévin RUE-ALBRECHT
Wellcome Trust Computational Infection Biology PhD Programme
University College Dublin
Ireland
http://fr.linkedin.com/pub/k%C3%A9vin-rue/28/a45/149/en



