[Biopython] blastall - strange results

Stefanie Lück lueck at ipk-gatersleben.de
Thu May 28 07:55:30 UTC 2009


Hi!

This question is not really about a Biopython problem, but I want to be 
sure that I am doing everything correctly.

I get strange results with blast.
My aim is to blast a query sequence, split into 21-mers, against a database. 
Since I need only 100 % matches of 21-mers, I set the word size parameter to 
21. As a positive control, I took one EST sequence and made a database 
from it. Then I took 100 bp of that sequence, split it into 21-mers and 
blasted each of them against my DB.

Now I expect full coverage, or more precisely 80 hits (100 - 21 + 1 
overlapping 21-mers, since I don't blast anything shorter than 21 bp), 
because the sequence is fully present in the DB. Unfortunately blast finds 
far fewer (60-80 %, depending on the sequence).
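
To make the expected number concrete, here is a minimal sketch of the 
splitting step. It assumes overlapping windows with a step of 1, which is 
what gives 80 21-mers from 100 bp. The function name split_kmers and the 
example query are only placeholders, not my real script or data.

```python
def split_kmers(seq, k=21):
    """Return every overlapping k-mer of seq (sliding window, step 1)."""
    # A sequence shorter than k yields an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical 100 bp query; only its length matters for the count.
query = "ACGT" * 25
kmers = split_kmers(query, 21)

# Number of windows is len(query) - k + 1 = 100 - 21 + 1
print(len(kmers))
```

Each element of kmers would then be one blast query.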

Is this normal? I would expect to find all 21-mers. Why only some?

If I blast without changing the word size parameter, it finds all hits. But 
I would like to use this parameter because the blast is much faster and I 
don't need to worry about gaps etc., since I really need only 100 % 21-mer 
matches.
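
As a check that does not depend on blast, the exact 21-mer matches can also 
be found with a plain substring search. This is only a sketch under some 
assumptions: the subject sequence below is made up, exact_hits is a 
placeholder name, and only the forward strand is searched (blast would also 
look at the reverse complement).

```python
def exact_hits(kmers, subject):
    """Map each k-mer index to the start positions of its exact matches."""
    hits = {}
    for idx, kmer in enumerate(kmers):
        positions = []
        start = subject.find(kmer)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping matches are found too.
            start = subject.find(kmer, start + 1)
        hits[idx] = positions
    return hits


# Hypothetical subject sequence standing in for the EST in the database.
subject = "TTGACCA" + "ACGGTCATTGCAAGCTTAGGCTAACGT" + "GGA"
# Positive control: the query is a piece cut out of the subject itself.
query = subject[5:30]
kmers = [query[i:i + 21] for i in range(len(query) - 21 + 1)]

hits = exact_hits(kmers, subject)
missing = [idx for idx, positions in hits.items() if not positions]
print(len(kmers), "21-mers,", len(missing), "without an exact match")
```

For a query cut out of the subject, the list of missing 21-mers should be 
empty, so any 21-mer that blast does not report can be compared against it.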

Does anyone have an idea what the problem could be?

Thanks in advance!
Stefanie



