[Biopython] help with ncbiWWW

Pejvak Moghimi pejvak.moghimi at york.ac.uk
Wed Jul 26 16:07:09 UTC 2017


I thought about adding a timeout parameter to qblast too, and to me it
seems like the most straightforward solution.
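For what it's worth, here is a minimal sketch of what forwarding such a timeout to urlopen could look like. This is not Biopython code: fetch_with_timeout and the slow local server are hypothetical stand-ins, the server standing in for a stalled NCBI endpoint.

```python
# Sketch (not the actual Biopython patch): a helper whose `timeout` is
# forwarded to urlopen, plus a deliberately slow local server to show
# the timeout firing. All names here are hypothetical stand-ins.
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import URLError
from urllib.request import urlopen


def fetch_with_timeout(url, timeout=10):
    """Open `url` and return the body; raises if no data within `timeout` s."""
    with urlopen(url, timeout=timeout) as handle:
        return handle.read()


class SlowHandler(BaseHTTPRequestHandler):
    """Stand-in for a slow NCBI endpoint: responds after 2 seconds."""

    def do_GET(self):
        time.sleep(2)  # respond slower than the client's timeout
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        except (BrokenPipeError, ConnectionError):
            pass  # client already gave up

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

try:
    fetch_with_timeout(url, timeout=0.5)
    timed_out = False
except (socket.timeout, URLError):
    # a read timeout surfaces as socket.timeout, possibly wrapped in URLError
    timed_out = True
print(timed_out)
```

Catching both socket.timeout and URLError covers the two ways urllib can surface a timeout, depending on whether it strikes during connect or during the response read.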

On 26 July 2017 at 16:58, Nabeel Ahmed <chaudhrynabeelahmed at gmail.com>
wrote:

>
>> > Suggestion 3: Make direct API calls using the 'requests' package.
>> > In case the API calls are simple (you can easily do so), use requests to
>> > make a call with a timeout flag; once the HTTP request times out it will
>> > raise a Timeout exception, which you can catch, and in that block make
>> > the second call (which, as per you, works perfectly fine)
>
>
>> This is essentially the idea I was initially suggesting, but the problem
>> isn't actually in the online request (currently done by urlopen).
>> With the NCBI BLAST you typically submit a query, wait, check for
>> progress, wait (repeat), and then download the results. This loop in
>> Biopython has no timeout - it relies on the NCBI returning results
>> eventually - or giving an error.
>
>
>  Yeah, what you're saying is that the response time of a *qblast* call
> depends on a lot of factors and varies from job to job, e.g.
> data size, NCBI servers' response time, etc.
> Given this, there isn't much point in having a timeout param for this call.
>
> But, in case it's required, we can patch *qblast* with a new param
> '*timeout*'. I have looked into the source code (for Python 2.x): it uses
> *urlopen* from the urllib2 package
> <http://biopython.org/DIST/docs/api/Bio._py3k-pysrc.html> (line 172).
> This *urlopen* accepts *timeout*
> <https://docs.python.org/2/library/urllib2.html#urllib2.urlopen> as an
> argument.
> We can give qblast an optional param - timeout=None - and pass it through
> to urlopen:
>
> handle = _urlopen(request, timeout=timeout)  - lines 132 and 176
> <http://biopython.org/DIST/docs/api/Bio.Blast.NCBIWWW-pysrc.html>
>
>
> On Wed, Jul 26, 2017 at 8:18 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>
>> On Wed, Jul 26, 2017 at 1:37 PM, Nabeel Ahmed
>> <chaudhrynabeelahmed at gmail.com> wrote:
>> > Hi,
>> >
>> > Disclaimer: I haven't used the ncbiWWW module.
>> >
>> > Suggestion 1: if you're using a *NIX system, you can make use of
>> > signals. Wrap your call with the signal and define a signal handler:
>>
>> I think that approach would work here - thanks!
>>
>> > Suggestion 2: use multiprocessing or multithreading - for this, kindly
>> > share your script/snippet.
>>
>> Again, that would likely work, but would be more complicated.
>>
>> > Suggestion 3: Make direct API calls using the 'requests' package.
>> > In case the API calls are simple (you can easily do so), use requests to
>> > make a call with a timeout flag; once the HTTP request times out it will
>> > raise a Timeout exception, which you can catch, and in that block make
>> > the second call (which, as per you, works perfectly fine):
>>
>> This is essentially the idea I was initially suggesting, but the problem
>> isn't actually in the online request (currently done by urlopen).
>>
>> With the NCBI BLAST you typically submit a query, wait, check for
>> progress, wait (repeat), and then download the results. This loop in
>> Biopython has no timeout - it relies on the NCBI returning results
>> eventually - or giving an error.
>>
>> Peter
>>
>
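A minimal sketch of the signal-based approach from Suggestion 1, assuming a POSIX system (SIGALRM is unavailable on Windows). Here long_running_blast is a hypothetical stand-in for the stalled qblast call, not Biopython code.

```python
# Sketch of Suggestion 1 (POSIX only): interrupt a long-running call
# with SIGALRM. `long_running_blast` is a hypothetical stand-in.
import signal
import time


class TimeoutExpired(Exception):
    pass


def _alarm_handler(signum, frame):
    raise TimeoutExpired("call exceeded the allotted time")


def run_with_alarm(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs); raise TimeoutExpired after `seconds`."""
    old_handler = signal.signal(signal.SIGALRM, _alarm_handler)
    signal.alarm(seconds)  # deliver SIGALRM after `seconds` seconds
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)


def long_running_blast():
    time.sleep(5)  # stands in for a qblast call that never returns
    return "result"


try:
    run_with_alarm(long_running_blast, 1)
    timed_out = False
except TimeoutExpired:
    timed_out = True
print(timed_out)
```

Note that signal.alarm only works in the main thread, which is one reason the multiprocessing route in Suggestion 2 may still be worth considering.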
>
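Since the real gap is in the submit/poll/download loop Peter describes, here is a sketch of how that loop could carry an overall deadline. The names submit_query, check_ready and fetch_results are hypothetical stand-ins for the steps inside qblast, and the toy callables below only simulate an NCBI job becoming ready on the third poll.

```python
# Sketch: the submit -> poll -> download loop with an overall deadline.
# `submit_query`, `check_ready`, `fetch_results` are hypothetical
# stand-ins for the NCBI steps inside qblast.
import time


def blast_with_deadline(submit_query, check_ready, fetch_results,
                        deadline=300, poll_interval=2):
    """Submit, then poll until results are ready or `deadline` seconds pass."""
    rid = submit_query()
    start = time.monotonic()
    while True:
        if check_ready(rid):
            return fetch_results(rid)
        if time.monotonic() - start > deadline:
            raise TimeoutError(
                "NCBI did not return results within %s seconds" % deadline)
        time.sleep(poll_interval)


# Toy demonstration: the job becomes "ready" on the third poll.
state = {"polls": 0}
result = blast_with_deadline(
    submit_query=lambda: "RID123",
    check_ready=lambda rid: (
        state.__setitem__("polls", state["polls"] + 1)
        or state["polls"] >= 3),
    fetch_results=lambda rid: "alignments",
    deadline=30,
    poll_interval=0.01,
)
print(result)
```

Using time.monotonic for the deadline keeps the loop immune to wall-clock adjustments while waiting on the NCBI.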

