[Biopython] save efetch results in different files
Silvio Tschapke
silvio.tschapke at googlemail.com
Wed Apr 28 09:24:25 UTC 2010
Hi all,
I'd like to download hundreds of PubMed entries in one go, but save each
entry in its own file for further processing with e.g. NLTK.
Is this possible? What is the common way to do this? Or do I have to call
efetch once for every single PMID? I don't know how to do that.
Could you also explain to me what handle.read() does? I understand
Entrez.read(handle) because it is documented, but handle.read() is not. What
type of object is a handle?
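A minimal sketch of the handle question, under the assumption that the object returned by Entrez.efetch behaves like an ordinary file-like object: handle.read() would then return the raw, unparsed response text, whereas Entrez.read(handle) parses XML into Python lists and dictionaries. The io.StringIO object and the tiny XML string are stand-ins invented for illustration, so nothing is downloaded here.

```python
import io

# Stand-in for the handle Entrez.efetch() would return (an assumption for
# illustration); a real handle streams the response from NCBI.
handle = io.StringIO("<PubmedArticleSet><PubmedArticle/></PubmedArticleSet>")

raw = handle.read()       # the whole response as one unparsed string
leftover = handle.read()  # the handle is exhausted now, so this is empty
handle.close()

print(type(raw).__name__, len(raw))
print(repr(leftover))
```

Because a file-like handle can only be consumed once, it is either passed to a parser or read as raw text, not both.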
    from Bio import Entrez

    search_results = Entrez.read(Entrez.esearch(db="pubmed",
                                                term="Biopython",
                                                usehistory="y"))
    count = int(search_results["Count"])  # total number of hits

    batch_size = 10
    for start in range(0, count, batch_size):
        end = min(count, start + batch_size)
        print("Going to download record %i to %i" % (start + 1, end))
        fetch_handle = Entrez.efetch(db="pubmed", rettype="xml",
                                     retstart=start, retmax=batch_size,
                                     webenv=search_results["WebEnv"],
                                     query_key=search_results["QueryKey"])
        for pmid in search_results["IdList"]:
            out_handle = open(pmid + ".txt", "w")
            # HERE I HAVE TO ACCESS THE ENTRY FROM THE fetch_handle FOR THE
            # CORRESPONDING pmid
            #data = Entrez.read(fetch_handle)
            #data = fetch_handle.read()
            out_handle.write(data)
            out_handle.close()
        # close the batch handle once, after all entries have been written
        fetch_handle.close()
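One possible way to fill the gap marked in the loop above, offered as a sketch rather than the established Biopython approach: read each batch as raw XML, then split it into one file per PMID with the standard library. It assumes the batch is a PubmedArticleSet whose PubmedArticle elements each carry a MedlineCitation/PMID child. The function name split_pubmed_xml, the .xml file extension, and the sample records are all made up for illustration; in the real script the text would come from fetch_handle.read().

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def split_pubmed_xml(xml_text, out_dir):
    """Write each <PubmedArticle> to out_dir/<pmid>.xml and return the PMIDs.

    Hypothetical helper: the name and the file layout are illustrative choices.
    """
    root = ET.fromstring(xml_text)
    pmids = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        if pmid is None:
            continue  # skip entries without a PMID instead of guessing a name
        path = os.path.join(out_dir, pmid + ".xml")
        with open(path, "w", encoding="utf-8") as out_handle:
            out_handle.write(ET.tostring(article, encoding="unicode"))
        pmids.append(pmid)
    return pmids


# Made-up sample standing in for the text of one downloaded batch.
sample = """<PubmedArticleSet>
<PubmedArticle><MedlineCitation><PMID Version="1">111</PMID>
<Article><ArticleTitle>First made-up title</ArticleTitle></Article>
</MedlineCitation></PubmedArticle>
<PubmedArticle><MedlineCitation><PMID Version="1">222</PMID>
<Article><ArticleTitle>Second made-up title</ArticleTitle></Article>
</MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

out_dir = tempfile.mkdtemp()
written = split_pubmed_xml(sample, out_dir)
print(written)
print(sorted(os.listdir(out_dir)))
```

With this shape, efetch is still called once per batch rather than once per PMID, and the PMID is taken from each record itself instead of from search_results["IdList"].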
Cheers,
Silvio