[Biopython] [Entrez/eFetch] "reasonable" package

Peter Cock p.j.a.cock at googlemail.com
Wed Dec 2 22:04:22 UTC 2015


Hi,

Currently Biopython does not attempt to do anything about
limiting retmax on your behalf.  The suggested retmax limit of 500
is probably specific to that database and/or file format (or so I
would imagine - some records like uilists are tiny in comparison).
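
Since Biopython leaves the batching to you, splitting the ID list
yourself is one option. A minimal sketch, assuming a plain Python list
of ID strings; the helper name `batches` and the default size of 500
are illustrative and not part of Biopython:

```python
def batches(ids, size=500):
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical usage (not executed here; needs Biopython and network access):
#   from Bio import Entrez
#   Entrez.email = "you@example.org"   # placeholder address
#   for batch in batches(id_list):
#       handle = Entrez.efetch(db="pubmed", id=",".join(batch), retmode="xml")
#       ...save or parse the handle, then close it...
```

With roughly 13,000 IDs and a batch size of 500 this gives 26 requests,
which matches the count in the original question.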

Are you using the results as XML? It is probably possible to
merge the XML files, but it might be more hassle than it's worth.

I would suggest a double loop ought to work fine - loop over
the collection of XML files, and then for each file loop over the
records returned from the parser.
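
The double loop could look roughly like this. It is only a sketch: the
function name `iter_all_records` is made up for illustration, and the
parser is passed in as an argument on the assumption that something
like Bio.Entrez.parse suits the XML you saved (whether it does depends
on the database and the Biopython version).

```python
def iter_all_records(xml_paths, parse):
    """Yield records from several XML files as one continuous stream.

    `parse` is any callable that takes an open binary file handle and
    returns an iterable of records (assumed here: Bio.Entrez.parse).
    """
    for path in xml_paths:                # outer loop: one file per batch
        with open(path, "rb") as handle:
            for record in parse(handle):  # inner loop: records in that file
                yield record


# Hypothetical usage, assuming the batches were saved as batch_*.xml:
#   import glob
#   from Bio import Entrez
#   paths = sorted(glob.glob("batch_*.xml"))
#   records = list(iter_all_records(paths, Entrez.parse))
```

Collecting the generator into a list gives the single collection of
records asked for, without merging any XML files.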

Regards,

Peter

On Wed, Dec 2, 2015 at 9:39 PM, <c.buhtz at posteo.jp> wrote:

> I asked the Entrez support how I should treat the server's resources
> with "respect". :)
>
> The first answer gave no concrete numbers, but in the second one they
> told me that asking for 500 (retmax for eSearch) is a "reasonable" value
> because eBot (a tool they offer on their website) uses it, too.
>
> Now I have nearly 13,000 PIDs whose article info I want to fetch via
> eFetch. It is a lot. ;)
>
> But I am not sure how to do that with Biopython. If I split that
> into packages of 500, I would get 26 different record objects back.
> I don't like that. I would prefer one big record object I can analyse.
>
> Do you see a way to merge these record objects? Or maybe there is
> another way to do that?
> Or does Biopython.Entrez already handle that problem internally (like the
> only-3-queries-per-second rule or the HTTP POST decision)?
>
> Any suggestions?
> --
> GnuPGP-Key ID 0751A8EC
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython
>