[Biopython] Parsing large blast files
Stefanie Lück
lueck at ipk-gatersleben.de
Tue Apr 28 08:23:02 UTC 2009
Thanks Peter!
>You could set the expectation threshold (I don't think there is an
>identity threshold which would be ideal for your example).
I can't say in advance what the expectation threshold should be, so this won't work.
>If you only want the single BEST hit for a query, set the number of
>alignments and/or descriptions to show to just one (these do different
>things in the plain text output - maybe for XML output you only need
>to limit the number of alignments). This should give a much smaller
>file, which will be fast to parse.
This is too risky. There might be several 100% hits, and I need all of them.
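
For keeping every 100% hit without limiting the number of alignments, one option is to filter while streaming the XML output. The following is only a rough sketch using the standard library's `xml.etree.ElementTree.iterparse` rather than Biopython's own parser. It assumes the standard NCBI BLAST XML element names (`Iteration`, `Hit`, `Hsp_identity`, `Hsp_align-len`) and assumes that a "100% hit" means an HSP whose identity count equals its alignment length. The function name `perfect_hits` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def perfect_hits(xml_source):
    """Yield (query, hit, alignment length) for each HSP with 100% identity.

    xml_source is a file name or a binary file object holding BLAST XML.
    Assumption: "100%" means Hsp_identity == Hsp_align-len; it does not
    check that the alignment covers the whole query.
    """
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        # One Iteration element holds all results for one query sequence.
        query = elem.findtext("Iteration_query-def")
        for hit in elem.iter("Hit"):
            hit_def = hit.findtext("Hit_def")
            for hsp in hit.iter("Hsp"):
                identity = int(hsp.findtext("Hsp_identity"))
                align_len = int(hsp.findtext("Hsp_align-len"))
                if identity == align_len:
                    yield query, hit_def, align_len
        # Drop the finished query's subtree so memory use stays small.
        elem.clear()
```

Because each query's results are discarded once they have been examined, the whole file never has to sit in memory at the same time, and several perfect hits per query are all reported.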
>Finally, and perhaps most importantly - don't do an individual BLAST
>query for each record. Instead, prepare a FASTA file of ALL your
>queries, and use that as the input to BLAST. This way there is only
>one command line call, and the BLAST database is only loaded into
>memory once.
Cool, I didn't know that this would work! Great, that's very nice! That's a
50% speed-up!
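
The single-file approach Peter describes could look roughly like the sketch below. It writes every query to one FASTA file and builds one command line for the whole batch. The helper names `write_queries` and `blast_command` are invented for this example, the program (`blastn`) and all file names are placeholders, and the command assumes the legacy `blastall` tool, where `-m 7` requests XML output. The command is only built here, not run.

```python
def write_queries(records, fasta_path):
    """Write (name, sequence) pairs to one FASTA file; return the count."""
    count = 0
    with open(fasta_path, "w") as handle:
        for name, seq in records:
            handle.write(">%s\n%s\n" % (name, seq))
            count += 1
    return count


def blast_command(fasta_path, database, xml_path):
    """Build one legacy blastall call covering every query in the file.

    Assumption: nucleotide search with blastn; change -p for other programs.
    """
    return ["blastall", "-p", "blastn", "-d", database,
            "-i", fasta_path, "-m", "7", "-o", xml_path]
```

The returned list can be passed to `subprocess.call`, so the database is loaded once for all queries instead of once per query.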
Thanks Peter and have a nice day!
Stefanie