[Biopython] Moving from Bio.PubMed to Bio.Entrez

Fri Jul 17 09:29:31 EDT 2009

Hi Brad,
  thanks for the clarification. I overlooked in the tutorial that
Entrez.read() requires me to ask for the XML retmode and that it parses the
XML result by itself into a dictionary structure. Still, I think it should
check what values I have passed to the Entrez.efetch() function. I know
it might be quite some work to keep that in sync with the NCBI website, but
let's see what others say. Either way, my code now works with Bio.Entrez
instead of the deprecated Bio.PubMed. I just had to quickly reinvent all
the exception handling because some PubMed entries lack authors, an
abbreviated journal name, a year, etc. ;-)
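
A minimal sketch of how such incomplete entries could be handled. It assumes a parsed record nested like the PubMed XML elements (MedlineCitation, Article, AuthorList, Journal, and so on) and behaving like a plain dict. The function name summarize_citation and the placeholder strings are hypothetical, not part of Biopython.

```python
def summarize_citation(record):
    """Return (authors, journal, year), with placeholders for missing fields.

    `record` is assumed to be one parsed PubMed article whose nesting
    follows the PubMed XML element names. The function name and the
    placeholder strings are hypothetical.
    """
    article = record.get("MedlineCitation", {}).get("Article", {})

    # AuthorList may be absent, and an individual author entry may have
    # no LastName (for example a collective name).
    authors = [
        author["LastName"]
        for author in article.get("AuthorList", [])
        if "LastName" in author
    ]

    journal = article.get("Journal", {})
    # Fall back from the abbreviated title to the full title.
    journal_name = (
        journal.get("ISOAbbreviation") or journal.get("Title") or "[no journal]"
    )

    pub_date = journal.get("JournalIssue", {}).get("PubDate", {})
    # Some records carry only a free-text MedlineDate instead of a Year.
    year = pub_date.get("Year") or pub_date.get("MedlineDate") or "[no year]"

    return authors or ["[no authors]"], journal_name, year
```

Using chained .get() calls with defaults keeps the missing-field cases in one place instead of wrapping every lookup in its own try/except KeyError.
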
Best regards,
Martin

Brad Chapman wrote:
> Hi Martin;
> Thanks for the e-mail. Let's tackle your up-to-date 1.51beta work.
> 
>> When I upgrade to 1.51b I get slightly better results:
>>
>>>>> from Bio import Entrez, Medline, GenBank
>>>>> Entrez.email = "mmokrejs at iresite.org"
>>>>> _handle = Entrez.efetch(db="pubmed", id=10851087, retmode="text")
>>>>> _records = Entrez.read(_handle)
> [ error ]
> 
>>>>> _handle = Entrez.efetch(db="pubmed", id=10851087, retmode="XML")
>>>>> _records = Entrez.read(_handle)
>>>>> _records
> [ worked ]
> 
>>   Any clues what does that mean? TIA,
> 
> In the first (and also third) example, you are retrieving the text
> based result. The Entrez parser handles XML output, so it is
> complaining because it's getting the raw text record instead of XML. 
> 
> Your second example is correct and worked; you specified the correct
> XML retmode. You should be able to go with this.
> 
> More generally, since Entrez returns many different file types, you
> want to be sure to match what you are getting with the parser
> you are using.
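
The matching rule above can be sketched as a small lookup done before parsing. This is an illustration only: choose_parser is a hypothetical helper, not a Biopython function, and it covers just the two PubMed cases relevant here (XML for Entrez.read, plain-text MEDLINE for Medline.parse).

```python
def choose_parser(retmode, rettype=None):
    """Name the Biopython parser that fits the requested efetch output.

    Hypothetical helper: it only knows the two PubMed cases discussed
    in this thread and raises for anything else instead of guessing.
    """
    mode = (retmode or "").lower()
    if mode == "xml":
        # Entrez.read() parses XML results.
        return "Entrez.read"
    if mode == "text" and (rettype or "").lower() == "medline":
        # Bio.Medline parses plain-text MEDLINE records.
        return "Medline.parse"
    raise ValueError(
        f"no parser known for retmode={retmode!r}, rettype={rettype!r}"
    )
```

With this check, the failing call from the thread (retmode="text" with no rettype) would be rejected before any parsing is attempted, while retmode="XML" maps to Entrez.read(). For the text route, the request would need rettype="medline" together with retmode="text" so that Medline.parse() receives MEDLINE-formatted records.
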