[Bioperl-l] StandAloneBlast or bl2seq quietly converts Ns to Ts

Jason Stajich jason.stajich at duke.edu
Mon May 9 15:53:09 EDT 2005


Turn low-complexity filtering off with the -F F command-line option.
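
As a rough sketch of that option on the command line, assuming the legacy NCBI bl2seq binary is installed: the file names below are made up for illustration, and the sequence is the fake example from the original post, written out twice so that it is compared against itself. This only shows where -F F goes; it does not claim anything about the resulting report.

```shell
# Hypothetical file names; the sequence is the fake example from the post.
printf '>seq_a\nctgactgannnnnnnctgatcgatcgtacgtacg\n' > seq_a.fa
printf '>seq_b\nctgactgannnnnnnctgatcgatcgtacgtacg\n' > seq_b.fa

# Legacy NCBI bl2seq: -i and -j name the two input sequences, -p the program,
# -o the report file, and -F F switches low-complexity filtering off.
if command -v bl2seq >/dev/null 2>&1; then
    bl2seq -i seq_a.fa -j seq_b.fa -p blastn -F F -o self_vs_self.out
else
    echo "bl2seq not found on PATH; skipping the run" >&2
fi
```

Running the same pair once with -F F and once without it should make it easy to check whether the filter accounts for the difference between the query and subject lines.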

-jason

On May 9, 2005, at 3:40 PM, Sam Kalat wrote:

> I'm not sure if this is a quirk of bl2seq or in bioperl.  My task is
> to compare sequences that came from the same trace file, but were
> processed differently: with different basecallers, trimmers, screens,
> and the like.  I take two sequences at a time that come from the same
> source, and BLAST them against each other using StandAloneBlast with
> bl2seq.  I noticed in testing that I could take a sequence and BLAST
> it against itself, and frequently such a comparison isn't perfect -
> the fraction of identical bases might be somewhere in the 90's.
>
> On examination I see stuff like this (fake data shown):
>
> Query 1: ctgactgannnnnnnctgatcgatcgtacgtacg
> Sbjct 1: ctgactgatttttttctgatcgatcgtacgtacg
>
> The query was supposed to be the same as the subject, but anything
> that was an N becomes a T in the subject, but not the query, so they
> don't match up perfectly.  I don't know why T was chosen, but it is
> always T.
>
> Anyone know if this is intentional behavior?  Ultimately it means that
> all Ns in sequences treated this way are mismatches.  It seems weird
> to me because the sequence in question didn't have a string of Ts, and
> now anything that does have a string of Ts will be more likely to
> match.
>
> Code available on request, but it doesn't try to do anything out of
> the ordinary, and it runs w/o errors.
>
> Thanks in advance
> Sam Kalat


