[Biopython] help with ncbiWWW

Nabeel Ahmed chaudhrynabeelahmed at gmail.com
Wed Jul 26 12:37:14 UTC 2017


*Disclaimer*: I haven't used the ncbiWWW module.

*Suggestion 1*: if you're on a *NIX system, you can make use of signals.
Wrap your call with a signal-based timeout. Define the signal handler:

import signal

class timeout:
    """Context manager that raises TimeoutError after `seconds` (POSIX only)."""
    def __init__(self, seconds=1, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message
    def handle_timeout(self, signum, frame):
        raise TimeoutError(self.error_message)
    def __enter__(self):
        # schedule SIGALRM to fire after `seconds`
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.alarm(self.seconds)
    def __exit__(self, type, value, traceback):
        # cancel the pending alarm on a clean exit
        signal.alarm(0)

The above class will raise *TimeoutError* after the specified time
(seconds). To wrap your call:

    try:
        with timeout(seconds=10):  # the time you expect a normal call to take
            <your ncbiwww.qblast() call>
    except TimeoutError as exc:
        <your second ncbiwww.qblast() call>
        <optional block>

*Suggestion 2*: use multiprocessing or multithreading - to sketch that out,
kindly share your script/snippet.
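As a rough illustration of the multiprocessing route, the helper below runs a
function in a child process and kills it if it exceeds the deadline.
`run_with_timeout`, `slow_task`, and `fast_task` are hypothetical names, not
Biopython API; to get the qblast() result back out of the child you would
additionally need a `multiprocessing.Queue` or `Pipe`:

```python
import multiprocessing
import time

def run_with_timeout(target, args=(), seconds=10):
    """Run `target` in a child process; terminate it if it exceeds `seconds`.

    Returns True if the call finished in time, False if it was killed.
    """
    proc = multiprocessing.Process(target=target, args=args)
    proc.start()
    proc.join(seconds)        # wait at most `seconds`
    if proc.is_alive():       # still running -> deadline exceeded
        proc.terminate()
        proc.join()
        return False
    return True

def slow_task():
    time.sleep(5)  # stand-in for a qblast() call that hangs

def fast_task():
    pass           # stand-in for a qblast() call that returns promptly

if __name__ == '__main__':
    print(run_with_timeout(slow_task, seconds=1))  # False
```

The upside over signals is that this works regardless of what the blocked call
is doing; the downside is the extra plumbing needed to return results.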

*Suggestion 3*: Make direct API calls using the '*requests*' package.
If the API calls are simple (you can easily do so), use requests to make
the call with the timeout argument; once the HTTP request times out it
raises a Timeout exception, which you can catch, and in that block make the
second call (which, as per you, works perfectly fine):

    try:
        resp = requests.get(blast_query_url, timeout=10.0).content
    except requests.exceptions.Timeout:
        resp = requests.get(blast_query_url, timeout=10.0).content

After 10 seconds the first call raises Timeout, which is caught, and the
call is made again. Two caveats:

   - Not very elegant - some coding overhead.
   - What if the second request also takes longer than the timeout?
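To address that second bullet, a bounded retry loop is less ad hoc than
nesting except blocks. `call_with_retries` is a hypothetical helper (not part
of requests or Biopython), shown here under the assumption that you simply
want to re-issue the same call a fixed number of times:

```python
def call_with_retries(func, attempts=3, exceptions=(Exception,)):
    """Call func(), retrying on the given exceptions up to `attempts` times.

    Returns the first successful result; re-raises the last exception if
    every attempt fails.
    """
    last_exc = None
    for _ in range(attempts):
        try:
            return func()
        except exceptions as exc:
            last_exc = exc  # remember the failure and try again
    raise last_exc
```

With requests this would look something like
`call_with_retries(lambda: requests.get(blast_query_url, timeout=10.0).content,
attempts=3, exceptions=(requests.exceptions.Timeout,))` - and NCBI's usage
guidelines suggest pausing between attempts rather than hammering the server.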


On Wed, Jul 26, 2017 at 4:26 PM, Pejvak Moghimi <pejvak.moghimi at york.ac.uk>

> Hi all,
> I'm working on a script to do blast using the ncbiWWW module, but very
> often one query sequence takes way too long, if at all, to return with
> results. This is always, in my experience, immediately solved if I just
> stop the script and re-run it (for the same query sequence); I get the
> results as quickly as I did for the other query sequences.
> I think this hints that this is something to do with ncbi servers. So, in
> order to tackle it, I need to simply modify my while loop to stop and
> re-run the "ncbiwww.qblast(..." line, if it takes longer than a reasonable
> length of time (I do understand shorter than a certain waiting-time would
> not be allowed by ncbi).
> I have no idea how to tackle this, except by either multithreading (not so
> sure how to go on about this though) or changing the qblast script (locally
> of course).
> I would really appreciate any help. Please do let me know if you would
> like to have a look at the script.
> Cheers,
> Pej.
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython