[Biopython] save efetch results in different files

Silvio Tschapke silvio.tschapke at googlemail.com
Wed Apr 28 09:24:25 UTC 2010


Hi all,

I'd like to download hundreds of PubMed entries in one go, but save each
entry in its own file for further processing with e.g. NLTK.
Is this possible? Or what is the common way to do this? Or do I have to call
efetch for every single PMID? I don't know how.
Could you also explain to me what handle.read() does? I understand
Entrez.read(handle) because it is documented, but not handle.read(). What
type of object is a handle?
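
On the handle question: the object returned by Entrez.efetch behaves like an
open file, so handle.read() returns the raw, unparsed data remaining in the
stream, whereas Entrez.read(handle) parses XML into Python objects. The sketch
below only illustrates the file-like behaviour; io.StringIO is a stand-in
chosen for this example, not what Entrez actually returns.

```python
import io

# Stand-in for a network handle (an assumption for illustration only).
handle = io.StringIO("line one\nline two\n")

raw = handle.read()    # everything left in the stream, as one string
again = handle.read()  # the stream is now exhausted, so this is empty
handle.close()         # release the underlying resource when done
```

Because a handle can only be consumed once, the same handle cannot be passed
to both handle.read() and Entrez.read(handle).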


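from Bio import Entrez

search_results = Entrez.read(Entrez.esearch(db="pubmed",
                                            term="Biopython",
                                            usehistory="y"))
# Total number of hits, needed as the upper bound for the batch loop
count = int(search_results["Count"])

batch_size = 10


for start in range(0, count, batch_size):
    end = min(count, start + batch_size)
    print("Going to download record %i to %i" % (start + 1, end))
    # Fetch one batch of records from the search history on the NCBI server
    fetch_handle = Entrez.efetch(db="pubmed", retmode="xml",
                                 retstart=start, retmax=batch_size,
                                 webenv=search_results["WebEnv"],
                                 query_key=search_results["QueryKey"])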


for pmid in search_results["IdList"]:
    out_handle = open(pmid+".txt", "w")
    # HERE I HAVE TO ACCESS THE ENTRY FROM THE fetch_handle FOR THE
    # CORRESPONDING pmid

    #data = Entrez.read(fetch_handle)
    #data = fetch_handle.read()
    fetch_handle.close()
    out_handle.write(data)
    out_handle.close()
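
One possible way to get one file per PMID without calling efetch once per
record is to split each downloaded XML batch locally. The sketch below is not
taken from the Biopython documentation; it uses only the standard library and
assumes the batch is a PubmedArticleSet whose records are PubmedArticle
elements with a MedlineCitation/PMID child. The names split_pubmed_xml,
save_records and SAMPLE are hypothetical and exist only for this example.

```python
import os
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for an efetch XML batch (illustration only).
SAMPLE = """<PubmedArticleSet>
<PubmedArticle><MedlineCitation><PMID Version="1">111</PMID></MedlineCitation></PubmedArticle>
<PubmedArticle><MedlineCitation><PMID Version="1">222</PMID></MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""


def split_pubmed_xml(xml_data):
    """Return {pmid: xml_string} for each PubmedArticle in one XML batch."""
    root = ET.fromstring(xml_data)
    records = {}
    for article in root.iter("PubmedArticle"):
        # The PMID directly under MedlineCitation identifies the record.
        pmid = article.findtext("MedlineCitation/PMID")
        if pmid is None:
            continue  # skip anything without an identifier
        records[pmid.strip()] = ET.tostring(article, encoding="unicode")
    return records


def save_records(records, out_dir="."):
    """Write each record to <out_dir>/<pmid>.xml and return the file paths."""
    paths = []
    for pmid, xml_string in records.items():
        path = os.path.join(out_dir, pmid + ".xml")
        with open(path, "w", encoding="utf-8") as out_handle:
            out_handle.write(xml_string)
        paths.append(path)
    return paths
```

Inside the batch loop this would be used as
save_records(split_pubmed_xml(fetch_handle.read())), followed by
fetch_handle.close(), so every batch is written out before the next one is
fetched.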



Cheers,
Silvio


