[Biopython-dev] NCBIWWW.qblast: Question about expected run time and time outs

Mon Jun 22 14:49:34 UTC 2015

Hi Peter,

Unfortunately, I think that might not be an option for me. The software I'm
trying to write is meant to be an open-source tool that researchers can
just use without extensive setup. I'm afraid that I can't ask people to
install standalone BLAST and learn to use computer clusters without losing
the accessibility of the program.

Do you think that this is really the BLAST server being busy? As I said, I
didn't have any problems for a long time. The average time to get a BLAST
result back would be about 6-7 minutes. Now I just don't get through.
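
To make the time-out question concrete, here is a minimal sketch of a
time-out-and-retry wrapper around a blocking call. It does not use any
time-out support inside NCBIWWW.qblast itself; instead it runs the call in a
background thread and stops waiting after a fixed limit. The names
call_with_timeout and CallTimedOut are hypothetical helpers invented for this
sketch, not part of Biopython, and the retry count and pause are arbitrary
assumptions.

```python
import threading
import time


class CallTimedOut(Exception):
    """Raised when every attempt ran past the time limit (hypothetical name)."""


def call_with_timeout(func, timeout, retries=3, pause=0.0):
    """Call func() with a per-attempt time limit, retrying on time-out.

    Each attempt runs in its own daemon thread. If the thread has not
    finished after `timeout` seconds, the attempt is abandoned (the thread
    is NOT killed, it is just no longer waited for) and a new one starts.
    """
    for attempt in range(1, retries + 1):
        outcome = {}  # filled in by the worker thread of this attempt only

        def worker(outcome=outcome):
            try:
                outcome["value"] = func()
            except Exception as exc:
                # Hand real errors back to the caller instead of retrying.
                outcome["error"] = exc

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()
        thread.join(timeout)

        if thread.is_alive():
            # Still blocked: give up on this attempt and try again.
            time.sleep(pause)
            continue
        if "error" in outcome:
            raise outcome["error"]
        return outcome["value"]

    raise CallTimedOut(
        "no result after %d attempts of %s seconds each" % (retries, timeout)
    )
```

With Biopython this would presumably be used by passing in something like
lambda: NCBIWWW.qblast('blastp', 'nr', sequence, entrez_query='NOT Ciliata').read()
with a timeout of 600 seconds. Two caveats: an abandoned attempt keeps running
in the background until the program exits, and every retry submits a fresh
search to the NCBI, so the number of retries should stay small.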

Best,
Lev

On Mon, Jun 22, 2015 at 4:03 AM, Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Hi Lev,
>
> My usual advice when dealing with any large-scale BLAST
> search is to download the NCBI database and use standalone
> BLAST+ locally, rather than the NCBI web-service which
> can be busy - especially during USA working hours.
>
> Do you have access to a local Linux cluster or similar? It is
> very likely there are people in your department/university
> already doing this - often the SysAdmin will keep a single
> shared copy of the databases up to date for everyone to
> use.
>
> (You would likely need to do some post-filtering to remove
> any Ciliata hits since the Entrez query option is only available
> when running BLAST at the NCBI.)
>
> Peter
>
> On Sun, Jun 21, 2015 at 7:19 PM, Lev Tsypin <ltsypin at uchicago.edu> wrote:
> > Hello everyone,
> >
> > I have been writing a tool that makes use of Biopython for automatic BLAST
> > searches--your libraries have made my life so much easier! I really
> > appreciate your work. I've recently begun to run into some trouble, though,
> > and I am not quite sure how to explain it, or respond to it, so I wanted to
> > ask for advice:
> >
> > The issue is that, of late, when I call the NCBIWWW.qblast function, it
> > takes forever--literally never finishing. Before, there were sometimes cases
> > that it would get stuck for a long time (up to an hour or so), but it would
> > then manage to fight through whatever obstacle and go on. In such cases, I
> > also found that if I were to artificially restart the request, the function
> > would rouse itself and go much better. Here's an example of a function call:
> >
> > from Bio.Blast import NCBIWWW
> >
> > blastp_result = NCBIWWW.qblast('blastp', 'nr',
> >     'MSLSREENIYMGKISEQTERFEDMLEYMKKVVQTGQELSVEERNLLSVAYKNTVGSRRSAWRSISAIQQKEESKGSKHLDLLTNYKKKIETELNLYCEDILRLLNDYLIKNATNAEAQVFFLKMKGDYYRYIAEYAQGDDHKKAADGALDSYNKASEIANSELSTTHPIRLGLALNFSVFHYEVLNDPSKACTLAKTAFDEAIGDIERIQEDQYKDATTIMQLIRDNLTLWTSEFQDDAEEQQE',
> >     entrez_query='NOT Ciliata').read()
> >
> > [In the protein sequence above I have multiple lines so that it fits in the
> > email, but when I normally run the function I don't have any newline
> > characters or anything, of course]
> >
> > My questions are the following: Why does the function sometimes get stuck
> > for so long, and what should I do now that it never seems to work anymore?
> > Do you have any suggestions for introducing a 'time out' so that if, for
> > example, the request takes longer than 10 minutes, it would automatically
> > retry? I know there is an optional parameter in the urllib2 library for a
> > time out, but, looking at the source code for NCBIWWW.qblast(), it wasn't
> > obvious to me whether and how it would work to use it.
> >
> > Thank you very much for any advice.
> >
> > Best regards,
> > Lev