[Biopython] blastall - strange results

lueck at ipk-gatersleben.de lueck at ipk-gatersleben.de
Wed Jul 8 10:08:56 UTC 2009


Hi!

Sorry for the late reply, but here is an update:

I tried megablast, but it doesn't help. What I did find out, and what is
acceptable for the moment:

If the query sequence is >235 bp
   >>> use wordsize 21

If the query sequence is <235 bp
   >>> use wordsize 11
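
A minimal Python sketch of the rule above, for anyone who wants to apply it in a script. The function name choose_word_size is hypothetical, and the message does not say what to do with a query of exactly 235 bp, so using the smaller word size there is an assumption:

```python
# Threshold and word sizes as reported in this message. The behaviour at
# exactly 235 bp is not stated in the message; falling back to the smaller
# word size there is an assumption.
LENGTH_THRESHOLD = 235


def choose_word_size(query_length):
    """Return the BLAST word size to use for a query of this length (bp)."""
    if query_length > LENGTH_THRESHOLD:
        return 21
    return 11
```

The returned value would then be passed as the word size option of whichever BLAST call the script already makes.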

I don't know the reason for that, but at least I can work with it.
However, now and then BLAST still doesn't find all sequences (rarely),
and sooner or later I'll switch to a short read aligner or global alignment.

Kind regards
Stefanie

>>>
On Thu, May 28, 2009 at 1:02 PM, Brad Chapman <[EMAIL PROTECTED]> wrote:
> Hi Stefanie;
>
>> I get strange results with blast.
>> My aim is to blast a query sequence, split into 21-mers, against a database.
> [...]
>> Is this normal? I would expect to find all 21-mers. Why only some?

I would check that the filtering option is off (by default BLAST will mask
low complexity regions).
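
As a rough sketch, assuming the legacy NCBI blastall command line tool is being used, filtering can be switched off with "-F F". The helper name blastall_args and the file and database names in the usage note are hypothetical; the function only builds the argument list and does not run BLAST:

```python
def blastall_args(query_file, database, word_size, low_complexity_filter=False):
    """Build an argument list for a legacy blastall blastn search.

    Assumes the legacy blastall options: -p program, -i query file,
    -d database, -W word size, -F low complexity filter (T or F).
    """
    return [
        "blastall",
        "-p", "blastn",
        "-i", query_file,
        "-d", database,
        "-W", str(word_size),
        "-F", "T" if low_complexity_filter else "F",
    ]
```

For example, blastall_args("kmers.fasta", "mydb", 11) gives a list ending in "-F", "F", which could be handed to subprocess.run.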

> BLAST isn't the best tool for this sort of problem. For exhaustively
> aligning short sequences to a database of target sequences, you
> should think about using a short read aligner. This is a nice
> summary of available aligners:
>
> http://www.sanger.ac.uk/Users/lh3/NGSalign.shtml
>
> Personally, I have had good experiences using Mosaik and Bowtie.
>
> Hope this helps,
> Brad

Brad is probably right about normal BLAST not being the best tool.

However, if you haven't done so already you might want to try
megablast instead of blastn, as this is designed for very similar
matches. This should be a very small change to your existing Biopython
script, so it should be easy to try out.
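
To illustrate how small the change can be: assuming the search is run through the legacy blastall command line, where "-n T" requests a MegaBLAST search, it amounts to appending one option. The helper name with_megablast is hypothetical, and the original script may call BLAST differently:

```python
def with_megablast(args):
    """Return a copy of a blastall argument list with MegaBLAST switched on.

    Assumes legacy blastall, where "-n T" selects a MegaBLAST search.
    The input list is left unmodified.
    """
    return list(args) + ["-n", "T"]
```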

Peter
_______________________________________________
Biopython mailing list  -  [EMAIL PROTECTED]
http://lists.open-bio.org/mailman/listinfo/biopython



