[Biopython] help with ncbiWWW

Nabeel Ahmed chaudhrynabeelahmed at gmail.com
Wed Jul 26 15:58:21 UTC 2017


> > Suggestion 3: Make direct API calls using 'requests' package.
> > In case the API calls are simple (you can easily do so) use request to
> make
> > a call, with timeout flag, once the HTTP request will timeout it'll raise
> > Timeout exception, which you can catch and in that block make the second
> > call (which as per you, works perfectly fine)


> This is essentially the idea I was initially suggesting, but the problem
> isn't actually in the online request (currently done by urlopen).
> With the NCBI BLAST you typically submit a query, wait, check for
> progress, wait (repeat), and then download the results. This loop in
> Biopython has no timeout - it relies on the NCBI returning results
> eventually - or giving an error.


Yeah, what you're saying is that the response time of a *qblast* call
depends on a lot of factors and varies from job to job, i.e.
data size, NCBI servers' response time, etc.
Given this, there isn't much point in having a timeout param for this call.

But in case it's required, we can patch *qblast* with a new param
*timeout*. I have looked into the source code (for Python 2.x): it's using
*urlopen* from the urllib2 package
<http://biopython.org/DIST/docs/api/Bio._py3k-pysrc.html> (line 172).
This *urlopen* accepts *timeout*
<https://docs.python.org/2/library/urllib2.html#urllib2.urlopen> as an
argument.
We can add an optional param - timeout=None - to qblast and pass it to urlopen:

handle = _urlopen(request, timeout=timeout)  - lines 132 and 176
<http://biopython.org/DIST/docs/api/Bio.Blast.NCBIWWW-pysrc.html>
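
A rough, self-contained sketch of the timeout-then-retry idea (hypothetical helper name `fetch_with_retry`; plain Python 3 standard library, not Biopython's actual internals - the demo server just simulates a slow first response):

```python
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.error import URLError
from urllib.request import urlopen

def fetch_with_retry(url, timeout, retries=1):
    """Open a URL with a per-attempt timeout; retry once (or more) on timeout."""
    last_exc = None
    for _ in range(retries + 1):
        try:
            with urlopen(url, timeout=timeout) as handle:
                return handle.read()
        except (socket.timeout, TimeoutError) as exc:
            last_exc = exc          # read timed out; try again
        except URLError as exc:
            if isinstance(exc.reason, (socket.timeout, TimeoutError)):
                last_exc = exc      # connect timed out; try again
            else:
                raise
    raise last_exc

# Demo server: the first request stalls past the client timeout,
# the second answers immediately, so the retry succeeds.
class SlowThenFast(BaseHTTPRequestHandler):
    calls = 0
    lock = threading.Lock()

    def do_GET(self):
        with type(self).lock:
            type(self).calls += 1
            first = type(self).calls == 1
        if first:
            time.sleep(1.0)  # longer than the client's timeout
        try:
            self.send_response(200)
            self.send_header("Content-Length", "2")
            self.end_headers()
            self.wfile.write(b"ok")
        except (BrokenPipeError, ConnectionResetError):
            pass  # client already gave up on this attempt

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), SlowThenFast)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

result = fetch_with_retry(url, timeout=0.3)  # first attempt times out, retry succeeds
print(result)
server.shutdown()
```

Catching both bare socket.timeout/TimeoutError and URLError is deliberate: urllib raises the former for read timeouts and wraps connect timeouts in the latter.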


On Wed, Jul 26, 2017 at 8:18 PM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> On Wed, Jul 26, 2017 at 1:37 PM, Nabeel Ahmed
> <chaudhrynabeelahmed at gmail.com> wrote:
> > Hi,
> >
> > Disclaimer: I haven't used ncbiWWW module.
> >
> > Suggestion 1: if you're using a *NIX system. Can make use of Signals.
> Wrap
> > your call with the signal. Define the signal handler:
>
> I think that approach would work here - thanks!
>
> > Suggestion 2: using Multiprocessing or multithreading - for it, kindly
> share
> > your script/snippet.
>
> Again that would likely work, but will be more complicated.
>
> > Suggestion 3: Make direct API calls using 'requests' package.
> > In case the API calls are simple (you can easily do so) use request to
> make
> > a call, with timeout flag, once the HTTP request will timeout it'll raise
> > Timeout exception, which you can catch and in that block make the second
> > call (which as per you, works perfectly fine):
>
> This is essentially the idea I was initially suggesting, but the problem
> isn't actually in the online request (currently done by urlopen).
>
> With the NCBI BLAST you typically submit a query, wait, check for
> progress, wait (repeat), and then download the results. This loop in
> Biopython has no timeout - it relies on the NCBI returning results
> eventually - or giving an error.
>
> Peter
>
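
Regarding Suggestion 1, a minimal sketch of the SIGALRM idea (POSIX-only; signal handlers only fire in the main thread; `call_with_alarm` and `TimeoutExpired` are hypothetical names, not Biopython API):

```python
import signal
import time

class TimeoutExpired(Exception):
    pass

def _handler(signum, frame):
    raise TimeoutExpired("call exceeded the time limit")

def call_with_alarm(func, args=(), kwargs=None, seconds=5):
    """Run func(*args, **kwargs), raising TimeoutExpired if it blocks
    longer than `seconds`. POSIX-only; must be called from the main thread."""
    kwargs = kwargs or {}
    old = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(seconds)           # schedule SIGALRM
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)             # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)

# Example: a blocking call that takes too long gets interrupted.
try:
    call_with_alarm(time.sleep, args=(10,), seconds=1)
except TimeoutExpired:
    print("timed out")
```

The finally block restores the previous handler and cancels the alarm, so a fast call doesn't leave a stray SIGALRM behind.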