[Biopython] help with ncbiWWW

Jocelyne jocelyne at gmail.com
Wed Jul 26 21:02:27 UTC 2017


I had the issue on Linux, but it was indeed Anaconda. The issue was
frequent enough that I had to write a try/except retry loop, so it should
be reproducible if you are willing to submit enough requests (unless the
NCBI server suddenly decides to cooperate).

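But indeed, it would need to be changed within the Biopython library. You
are in the correct place. It hung on urlopen. The code went something like
this (request, timeout and max_tries were set earlier):

import socket
import urllib2  # Python 2; on Python 3 use urllib.request instead

try_counter = 0
while try_counter < max_tries:
    try:
        response = urllib2.urlopen(request, timeout=timeout)
    except (socket.timeout, urllib2.URLError):
        try_counter += 1  # timed out or could not connect; try again
    else:
        break  # success!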
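
For anyone on Python 3, here is a minimal sketch of the same retry idea as a
function. The name urlopen_with_retries and its opener argument are
hypothetical (they are not part of Biopython); the opener is injectable only
so the loop can be tried without hitting the NCBI. Unlike the loop above, it
raises once the tries are used up rather than falling through silently.

```python
import socket
import urllib.error
import urllib.request


def urlopen_with_retries(request, timeout=30, max_tries=5,
                         opener=urllib.request.urlopen):
    """Call opener(request, timeout=timeout), retrying on timeouts.

    Hypothetical helper, not part of Biopython.
    """
    last_error = None
    for _ in range(max_tries):
        try:
            return opener(request, timeout=timeout)
        except socket.timeout as err:
            # Timed out while reading the response; try again.
            last_error = err
        except urllib.error.URLError as err:
            # Timeouts while connecting arrive wrapped in URLError;
            # anything else (e.g. an HTTP error) is re-raised unchanged.
            if not isinstance(err.reason, socket.timeout):
                raise
            last_error = err
    raise RuntimeError("gave up after %d tries" % max_tries) from last_error
```

Usage would be along the lines of
response = urlopen_with_retries(request, timeout=30, max_tries=5).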


On Jul 26, 2017 13:42, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

That does help, thank you.

First of all that tells me you are using Windows and your Python is
from Anaconda (probably not important here).

Now, I had been guessing the code was getting stuck while actually
connecting to the NCBI and waiting for an update - which is where that
socket timeout would come into play.

I see now the problem is when Biopython checks for an update,
waits for a bit, checks for an update, waits for a bit, ... and never
gives up:

https://github.com/biopython/biopython/blob/biopython-170/Bio/Blast/NCBIWWW.py#L164

The code increases the wait interval to 120s (two minutes), but
currently has no (optional) maximum total waiting time. Adding
this as an option seems sensible (e.g. a maximum total waiting
time of say 5 or 10 mins).
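
To make the idea concrete, here is a rough sketch of a polling loop with a
cap on the total waiting time. This is not the code in NCBIWWW.py:
poll_until_ready, its parameters and the doubling of the delay are all
assumptions for illustration; only the 120s ceiling on the interval and the
5 to 10 minute total come from the discussion above.

```python
import time


def poll_until_ready(check, max_wait=600, first_delay=2, max_delay=120,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or max_wait seconds have passed.

    Hypothetical helper. sleep and clock are injectable for testing.
    """
    deadline = clock() + max_wait
    delay = first_delay
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError("no result after %s seconds" % max_wait)
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        # Grow the interval up to the ceiling (doubling is an assumption).
        delay = min(delay * 2, max_delay)
```

Here check would be whatever asks the server whether the results are ready;
the caller can catch TimeoutError and decide whether to resubmit the job.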

Also, it would be good to check if the NCBI is returning some
clue or error message which our code does not understand...
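
As a sketch of what such a check might look like: as far as I know, the
polling response carries a line of the form Status=WAITING or Status=READY,
and the NCBI documentation also lists FAILED and UNKNOWN. The helper names
and the sample text below are made up for illustration, and the exact
response layout should be treated as an assumption until checked against a
real reply.

```python
import re

# Assumed format: a "Status=<WORD>" line somewhere in the returned page.
_STATUS_RE = re.compile(r"Status=(\w+)")


def qblast_status(page):
    """Return the status word in upper case, or None if there is none."""
    match = _STATUS_RE.search(page)
    return match.group(1).upper() if match else None


def should_keep_waiting(page):
    """True while the job is still running; raise on any unexpected status."""
    status = qblast_status(page)
    if status == "WAITING":
        return True
    if status in (None, "READY"):
        # No status line is treated as "results have arrived" here.
        return False
    raise ValueError("unexpected status from server: %s" % status)


# Made-up sample text, only to show the calls:
sample = "QBlastInfoBegin\n\tStatus=WAITING\nQBlastInfoEnd\n"
print(qblast_status(sample))
```

Raising on anything other than WAITING or READY would at least turn a
silent endless wait into a visible error.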

From your initial description it sounds like you have not found
any single example which fails - so this is going to be hard to
test.

Peter

On Wed, Jul 26, 2017 at 3:04 PM, Pejvak Moghimi
<pejvak.moghimi at york.ac.uk> wrote:
> Hi Peter,
>
> Here it is:
>
> Traceback (most recent call last):
>
>   File "<ipython-input-107-561cd74d2097>", line 1, in <module>
>     runfile('D:/Dropbox/Pejvak Moghimi/DMT_project/blast_for_clav_seqs/blastScript(altered).py', wdir='D:/Dropbox/Pejvak Moghimi/DMT_project/blast_for_clav_seqs')
>
>   File "C:\Users\pezhv\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
>     execfile(filename, namespace)
>
>   File "C:\Users\pezhv\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
>     exec(compile(f.read(), filename, 'exec'), namespace)
>
>   File "D:/Dropbox/Pejvak Moghimi/DMT_project/blast_for_clav_seqs/blastScript(altered).py", line 116, in <module>
>     result_handle = NCBIWWW.qblast("blastp", "nr", sequence, hitlist_size=500, entrez_query = orgn_specified)
>
>   File "C:\Users\pezhv\Anaconda3\lib\site-packages\Bio\Blast\NCBIWWW.py", line 164, in qblast
>     time.sleep(wait)
>
>
> Cheers,
> Pej.
>
>
> On 26 July 2017 at 14:57, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>
>> Hi Pej.
>>
>> Hmm. Maybe setting the timeout is not going to solve your
>> problem. I was hoping that would be a neat solution.
>>
>> Can you show us the stack trace when you had to stop a job
>> please?
>>
>> I assume you are using control+c to do this, in which case
>> Python ought to stop with the exception KeyboardInterrupt.
>> What I am interested in here is where in the code Python
>> is getting stuck. That would be a good clue.
>>
>> Peter
>>
>> On Wed, Jul 26, 2017 at 2:47 PM, Pejvak Moghimi
>> <pejvak.moghimi at york.ac.uk> wrote:
>> > Hi Peter,
>> >
>> > That solution, so far, does not seem to have worked, with either the
>> > 10 or the 30 second option.
>> >
>> > Cheers,
>> > Pej.
>> >
>> > On 26 July 2017 at 13:29, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> >>
>> >> I am hoping that putting this near the start of your script will
>> >> apply the default timeout to all your BLAST calls (or other
>> >> network calls, e.g. NCBI Entrez):
>> >>
>> >> import socket
>> >> socket.setdefaulttimeout(30)  # timeout in seconds
>> >>
>> >> Peter
>
>