[Biopython-dev] [Bug 2938] New: Bio.Entrez.read() returns empty string for HTML (not an error)

bugzilla-daemon at portal.open-bio.org
Tue Oct 27 11:50:03 EDT 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2938

           Summary: Bio.Entrez.read() returns empty string for HTML (not an
                    error)
           Product: Biopython
           Version: 1.52
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk


If given HTML instead of XML, Bio.Entrez.read() silently returns an empty
string. I would have expected it to raise an error with a helpful message.

For example:

>>> from Bio import Entrez
>>> handle = Entrez.efetch(db="pubmed", id="17206916")
>>> handle.readline()
'<html><head><title>PmFetch response</title></head><body>\n'

Now try parsing this HTML as if it were XML:

>>> handle = Entrez.efetch(db="pubmed", id="17206916")
>>> "" == Entrez.read(handle)
True

That is, Entrez.read() returns an empty string instead of raising an error.
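
Until the parser reports this itself, one possible workaround is to check the
start of the response before parsing it. The following is only a sketch, not
part of Biopython: looks_like_html and checked_handle are hypothetical helper
names, and the check assumes an HTML response begins with an <html> tag or an
HTML doctype, as in the example above.

```python
import io


def looks_like_html(head):
    """Return True if the start of a response looks like HTML rather than XML.

    Hypothetical helper: it only checks for a leading <html> tag or an
    HTML doctype, which is an assumption about what the server sends.
    """
    if isinstance(head, bytes):
        head = head.decode("utf-8", errors="replace")
    start = head.lstrip().lower()
    return start.startswith("<html") or start.startswith("<!doctype html")


def checked_handle(handle):
    """Read the whole response and return a fresh in-memory handle.

    Raises ValueError if the data looks like HTML, so the caller gets an
    explicit error instead of an empty parse result.
    """
    data = handle.read()
    if looks_like_html(data[:200]):
        raise ValueError(
            "Expected XML but received HTML; check the request "
            "parameters (for example retmode)."
        )
    # Re-wrap the data so it can still be passed on to a parser.
    if isinstance(data, bytes):
        return io.BytesIO(data)
    return io.StringIO(data)
```

With this in place, Entrez.read(checked_handle(handle)) would fail with a
ValueError for the HTML response shown above, rather than returning "". Note
that the sketch reads the whole response into memory, which may not suit very
large downloads.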

Problem spotted based on a mailing list query, see this thread:
http://lists.open-bio.org/pipermail/biopython/2009-October/005774.html

