[Biopython] Parsing large blast files

Peter Cock p.j.a.cock at googlemail.com
Tue Apr 28 06:16:53 EDT 2009


On Tue, Apr 28, 2009 at 11:05 AM, Stefanie Lück
<lueck at ipk-gatersleben.de> wrote:
>> Only a 50% time speed up? i.e. It took half the time?  Not bad,
>> although I expected more.  It will probably depend on the number of
>> queries, their sizes, and the database - probably the speed up would
>> be more for a larger database like NR.
>
> I blast ~3000 queries against the tigr barley v9 DB (50500 subjects). It
> takes about 35 seconds with XP, E8400 (3GHZ), 4 GB RAM. Hope this is
> normal...

35s sounds good :)

I normally deal with much slower searches (e.g. protein against NR, or
with RPS-BLAST against CDD), measured in minutes or, when querying
whole genomes, maybe hours.  On this sort of problem I would expect
doing individual searches for each query to be much, much slower.
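
As a minimal sketch of the batching idea above (one search over all the
queries rather than one search per query), the following writes every
query into a single multi-record FASTA file and builds one command line
for it. It assumes the BLAST+ "blastn" program; with the legacy
"blastall" tool the roughly equivalent options would be -p blastn,
-i, -d, -o and -m 7 for XML output. The helper names, the example
sequences and the database name "barley_db" are hypothetical, and the
command is only constructed here, not executed.

```python
import tempfile
from pathlib import Path


def write_batch_fasta(queries, path):
    """Write all (name, sequence) pairs into one multi-record FASTA file."""
    with open(path, "w") as handle:
        for name, seq in queries:
            handle.write(f">{name}\n{seq}\n")
    return path


def blast_command(query_file, database, out_file):
    """Build a single BLAST+ command covering every query in query_file.

    -outfmt 5 requests XML output. The database name passed in is a
    placeholder for whatever local database has been formatted.
    """
    return [
        "blastn",
        "-query", str(query_file),
        "-db", database,
        "-outfmt", "5",
        "-out", str(out_file),
    ]


# Hypothetical example data: two short nucleotide queries.
queries = [("query1", "ACGTACGTAC"), ("query2", "TTGACCGGTA")]

workdir = Path(tempfile.mkdtemp())
fasta = write_batch_fasta(queries, workdir / "queries.fasta")
cmd = blast_command(fasta, "barley_db", workdir / "results.xml")

# One command for the whole batch, instead of one per query.
print(" ".join(cmd))
```

The resulting XML file then holds one result record per query, which
Biopython can read back one at a time with Bio.Blast.NCBIXML.parse
rather than loading everything into memory at once.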

You are dealing with a much smaller database, and with shorter
queries, so it will in general be faster.

Peter


