[Biopython] query upper limit for NCBIWWW.qblast?

Matthias Schade matthiasschade.de at googlemail.com
Thu Apr 11 09:20:31 UTC 2013


Hello everyone,

Is there an upper limit to how many sequences I can query via 
NCBIWWW.qblast at once?

Sending up to 150 sequences, each a 24-mer, in a single string works 
fine. But now I have tried the same with a string containing about 900 
sequences. At good times, it takes the NCBI server about 5 min to send 
an answer. I save the answer and later open and parse the file with 
other functions in my code. However, even though I query the same 900 
sequences each time, the resulting output file varies in size (between 
10 MB and 20 MB) and always lacks at least the closing tag 
"</BlastOutput>", sometimes more (this does not happen when querying 
150 sequences or fewer).

My guess is that once the server has started sending its answer, 
NCBIWWW.qblast waits only a limited time for follow-up packets, so 
that, depending on the current server load, it simply stops waiting for 
incoming data after some time, which makes my BLAST output files vary 
in length. Could anyone confirm or correct this far-fetched hypothesis?

The core lines of my code are:

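from Bio.Blast import NCBIWWW

orgn = 'Mus Musculus'  # or any other organism
# fasta_seq_string holds all query sequences in FASTA format
result = NCBIWWW.qblast("blastn", "nt", fasta_seq_string, expect=100,
                        entrez_query=orgn + "[orgn]")
# write the raw XML answer to disk; "with" closes the file afterwards
with open('myblast_result.xml', "w") as save_file:
    save_file.write(result.read())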
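
Whatever the cause turns out to be, one possible workaround is to split the 900 sequences into batches of a size that is known to work and to check each answer for the closing tag before saving it. The following is only a sketch under assumptions: the helper names (split_fasta, looks_complete, blast_in_batches), the batch size of 100 and the numbered output file names are made up for illustration, and the answer of NCBIWWW.qblast is assumed to be a text handle, as in the lines above.

```python
def split_fasta(fasta_text, batch_size=100):
    """Split a multi-record FASTA string into strings of at most batch_size records."""
    records = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line])        # a header line starts a new record
        elif records:
            records[-1].append(line)      # sequence line of the current record
    batches = []
    for start in range(0, len(records), batch_size):
        chunk = records[start:start + batch_size]
        batches.append("\n".join("\n".join(rec) for rec in chunk) + "\n")
    return batches


def looks_complete(xml_text):
    """Return True if the BLAST XML ends with the closing root tag."""
    return xml_text.rstrip().endswith("</BlastOutput>")


def blast_in_batches(fasta_text, organism="Mus Musculus", batch_size=100):
    """Hypothetical driver: one qblast call and one output file per batch."""
    # Imported here so the helpers above also work without Biopython installed.
    from Bio.Blast import NCBIWWW

    for number, batch in enumerate(split_fasta(fasta_text, batch_size), start=1):
        handle = NCBIWWW.qblast("blastn", "nt", batch, expect=100,
                                entrez_query=organism + "[orgn]")
        xml_text = handle.read()
        handle.close()
        if not looks_complete(xml_text):
            # Truncated answer: stop instead of saving a broken file.
            raise RuntimeError("batch %d came back incomplete" % number)
        with open("myblast_result_%03d.xml" % number, "w") as save_file:
            save_file.write(xml_text)
```

Calling blast_in_batches(fasta_seq_string) would then write one XML file per batch. This does not explain why the large answer is cut off; it only keeps each request in the size range that has worked so far and makes a truncated answer visible immediately.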

Best regards,
Matthias


