[Biopython] Question on Entrez.efetch and Full XML Articles

Peter Cock p.j.a.cock at googlemail.com
Thu Aug 20 16:19:36 UTC 2015


Hi Garrett,

Apologies for the delay and no-one answering. Have you solved this
yourself? Did you look at the batch download example in the tutorial?

This might be a problem with the start/end variables (not shown in
your example): retmax should be the number of records per batch,
not an end index. Alternatively, it might be to do with how multiple
records are presented in the XML.
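
To check both possibilities, something like the following rough sketch
might help. It is not taken from the tutorial: batch_windows,
count_articles and fetch_batches are hypothetical helper names, and it
assumes that PMC returns each full-text record as an <article> element
inside a single wrapper element. The Entrez import sits inside
fetch_batches so the two pure helpers can be tried without Biopython
or network access.

```python
import time
import xml.etree.ElementTree as ET


def batch_windows(total, batch_size):
    """Yield (retstart, retmax) pairs that together cover `total` records.

    retmax is the number of records to ask for, not an end index.
    """
    for retstart in range(0, total, batch_size):
        yield retstart, min(batch_size, total - retstart)


def count_articles(xml_data):
    """Count <article> elements in one efetch response.

    Assumption: each PMC record is an un-namespaced <article> element.
    """
    root = ET.fromstring(xml_data)
    return sum(1 for _ in root.iter("article"))


def fetch_batches(search_results, total, batch_size=10):
    """Hypothetical driver: fetch every batch and report its article count.

    Assumes Entrez.email is already set and that `search_results` came
    from Entrez.esearch(..., usehistory="y").
    """
    from Bio import Entrez  # imported here so the helpers above run without it

    for retstart, retmax in batch_windows(total, batch_size):
        handle = Entrez.efetch(db="pmc", retmode="xml",
                               retstart=retstart, retmax=retmax,
                               webenv=search_results["WebEnv"],
                               query_key=search_results["QueryKey"])
        data = handle.read()
        handle.close()
        print(retstart, retmax, count_articles(data))
        time.sleep(1.0 / 3.0)  # stay under three requests per second
```

If count_articles reports more than one article per batch, the download
itself is fine and the issue is in how the saved XML is being read; if
it reports one, the retstart/retmax values are the first thing to check.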

Peter

On Thu, Jun 4, 2015 at 10:40 PM, Garrett Z. Carver <gcarver at bowdoin.edu> wrote:
> Hello everyone,
>
> I'm trying to use Entrez.efetch to download full articles in xml format from
> PubMed Central. I am trying to download many articles in batches by keeping
> retmax constant and incrementing retstart. However, for each call to efetch,
> only one article is downloaded. The format seems to be correct and I am
> getting a full article, but am unable to retrieve more than one per function
> call.
>
> Here is a snippet from my code:
>
>         try:
>             fetch_handle = Entrez.efetch(db="pmc", rettype="", retmode="xml",
>                                          retstart=start, retmax=end,
>                                          webenv=searchResults["WebEnv"],
>                                          query_key=searchResults["QueryKey"])
>         except urllib2.HTTPError as e:
>             print(e)
>             time.sleep(1.0/3.0)
>             continue
>         data = fetch_handle.read()
>         fetch_handle.close()
>
> The data is then saved to a file on my desktop. This code worked well when
> modified to work with abstracts alone, downloading multiple abstracts with
> each efetch call.
>
> Any insight into this issue would be greatly appreciated. Please let me know
> if you need more info to evaluate the problem.
>
> Thanks,
> Garrett
>
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython

