[Biopython] Entrez.efetch bug?

Peter biopython at maubp.freeserve.co.uk
Thu Apr 15 11:31:28 EDT 2010


On Thu, Apr 15, 2010 at 4:15 PM, Morten Kjeldgaard <mok at bioxray.dk> wrote:
> Hi,
>
> I am getting an error with Entrez.efetch() with Biopython version 1.51. This
> is my handle:
>
> handle = Entrez.efetch(db='protein', id='114391',rettype='gp')
>

In the above, you've asked Entrez to give you a plain text GenPept file
(a protein GenBank file).

> When I subsequently do this:
>
>  record = Entrez.read(handle)
>
> I get a syntax error from Expat:
>
> ExpatError: syntax error: line 1, column 0
>

The Bio.Entrez.read() and Bio.Entrez.parse() functions expect XML. You
gave them plain text, so Expat fails on the very first character
(line 1, column 0).
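
To see why the parser gives up straight away, here is a rough sketch using
only the standard library's Expat bindings rather than Biopython itself.
The helper name is_xml and both sample strings are made up for illustration;
they are not real Entrez output.

```python
import xml.parsers.expat


def is_xml(text):
    """Return True if Expat can parse `text` as a complete XML document.

    Hypothetical helper for illustration; not part of Biopython.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Made-up snippets: one shaped like a GenPept flat file, one like XML.
genpept_like = "LOCUS       EXAMPLE   10 aa   linear\n"
xml_like = "<GBSet><GBSeq></GBSeq></GBSet>"

print(is_xml(genpept_like))
print(is_xml(xml_like))
```

A flat file starts with a keyword such as LOCUS rather than a tag, so an
XML parser cannot get past the start of the input.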

> However, if I do the following, it works:
>
> record = handle.read()

Well, yes, you get a big string stored as the variable record.

> but then I need to parse the resulting record using the Genbank parser,
> which is a nuisance since I normally should get this for free from the
> Entrez module.
>
> Comments, anyone?

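Try this, which hands the GenPept text to SeqIO instead of the XML parser:

from Bio import Entrez
from Bio import SeqIO
# rettype='gp' returns a plain text GenPept file, not XML
handle = Entrez.efetch(db='protein', id='114391', rettype='gp')
# SeqIO's 'genbank' format reads GenPept too and returns one SeqRecord
record = SeqIO.read(handle, 'genbank')
handle.close()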

Peter


