[Biopython] Question on Entrez.efetch and Full XML Articles

Garrett Z. Carver gcarver at bowdoin.edu
Thu Jun 4 21:40:16 UTC 2015


Hello everyone,

I'm trying to use Entrez.efetch to download full articles in XML format from PubMed Central. I want to download many articles in batches by keeping retmax constant and incrementing retstart. However, each call to efetch downloads only one article. The format seems to be correct and I do get a full article, but I am unable to retrieve more than one per function call.

Here is a snippet from my code:

        try:
            fetch_handle = Entrez.efetch(
                db="pmc",
                rettype="",
                retmode="xml",
                retstart=start,
                retmax=end,
                webenv=searchResults["WebEnv"],
                query_key=searchResults["QueryKey"],
            )
        except urllib2.HTTPError as e:
            print(e)
            time.sleep(1.0/3.0)
            continue
        data = fetch_handle.read()
        fetch_handle.close()

The data is then saved to a file on my desktop. The same code, modified to fetch abstracts only, worked well and returned multiple abstracts with each efetch call.
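
To make the intended batching scheme concrete (constant retmax, incrementing retstart), here is a minimal sketch. It only illustrates the windowing logic and is not a fix for the problem. The helper names batch_windows and fetch_all are hypothetical and not part of Biopython, and I am assuming the total record count is taken from the earlier search result.

```python
def batch_windows(total, batch_size):
    """Yield (retstart, retmax) pairs that cover `total` records.

    retmax stays constant; only retstart advances (hypothetical helper).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for retstart in range(0, total, batch_size):
        yield retstart, batch_size


def fetch_all(fetch_batch, total, batch_size):
    """Call fetch_batch(retstart, retmax) once per window and collect results.

    `fetch_batch` is any callable supplied by the caller; in real use it
    would wrap Entrez.efetch (see the commented sketch below).
    """
    return [fetch_batch(retstart, retmax)
            for retstart, retmax in batch_windows(total, batch_size)]


# Assumed usage with Biopython (needs network access, so left commented out):
#
# from Bio import Entrez
#
# def fetch_batch(retstart, retmax):
#     handle = Entrez.efetch(db="pmc", retmode="xml",
#                            retstart=retstart, retmax=retmax,
#                            webenv=searchResults["WebEnv"],
#                            query_key=searchResults["QueryKey"])
#     try:
#         return handle.read()
#     finally:
#         handle.close()
#
# batches = fetch_all(fetch_batch, int(searchResults["Count"]), 10)
```

With this layout each window is requested exactly once, so if a call still returns a single article, the cause lies in the request itself rather than in the loop.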

Any insight into this issue would be greatly appreciated. Please let me know if you need more info to evaluate the problem.

Thanks,
Garrett