[Biopython-dev] [Bug 2678] Bio.Entrez module does not always retrieve or find DTD files

bugzilla-daemon at portal.open-bio.org
Sat Dec 13 20:19:15 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2678





------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk  2008-12-13 15:19 EST -------
(In reply to comment #6)
> If the DTD is available locally in Bio/Entrez/DTDs, then Bio.Entrez will read
> it from there. If not, it tries to download it. This may fail if the servers
> are busy. If the needed DTDs are saved in Bio/Entrez/DTDs (and installed when
> Biopython is installed), you won't run into this problem.

I was just looking at this on my Windows XP Python 2.3 machine. When
Bio.Entrez tried to download a missing DTD file, it was using the bare
filename as the URL instead of the full address, so the download could not
succeed. I've committed a fix to CVS which should resolve this:

biopython/Bio/Entrez/Parser.py revision 1.3

I'll double check this on Linux/Mac next week.
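
To illustrate the lookup order described above (local copy first, then a
download from the full URL), here is a minimal sketch. It is not the code in
Bio/Entrez/Parser.py: the function name open_dtd, its parameters, and the
example.org URL in the usage note are all hypothetical, and it is written for
current Python rather than Python 2.3.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def open_dtd(system_id, dtd_dir, fetch=urlopen):
    """Return a binary handle for the DTD named by system_id.

    system_id is assumed to be the full URL taken from the XML DOCTYPE
    declaration; dtd_dir is a local directory of saved DTD files.
    Both names are hypothetical and only serve this illustration.
    """
    # Only the last path component is used for the local lookup.
    filename = os.path.basename(urlparse(system_id).path)
    local_path = os.path.join(dtd_dir, filename)
    if os.path.exists(local_path):
        return open(local_path, "rb")
    # Fall back to the network using the full URL. Passing only the
    # filename here is the kind of mistake described above.
    return fetch(system_id)
```

Calling open_dtd("http://example.org/dtds/sample.dtd", "Bio/Entrez/DTDs")
would read Bio/Entrez/DTDs/sample.dtd if it exists and otherwise request the
full URL. The fetch parameter is there only so the download step can be
swapped out when testing.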

This may be related to Leighton's problem, although 'xhtml1-strict.dtd' and
'xhtml-lat1.ent' are not NCBI DTD files; they are part of the W3C XHTML 1.0
specification.

Note that if I delete all the Bio/Entrez/DTDs/* files, then test_Entrez.py
fails.  I get warning messages about downloading missing DTD files, and the
following failures:

======================================================================
ERROR: Test parsing pubmed links returned by ELink (fifth test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 2523, in t_pubmed5
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 131, in startElement
    if object!="":
UnboundLocalError: local variable 'object' referenced before assignment

======================================================================
ERROR: Test parsing XML returned by EFetch, PubMed database (first test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 3058, in t_pubmed1
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
ExpatError: syntax error: line 1, column 0

======================================================================
ERROR: Test parsing XML returned by EFetch, PubMed database (second test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 3261, in t_pubmed2
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
ExpatError: syntax error: line 1, column 0

======================================================================
FAIL: Test parsing pubmed links returned by ELink (sixth test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 2697, in t_pubmed6
    assert len(record[0]["IdCheckList"])==2
AssertionError

----------------------------------------------------------------------

(The rest of the Entrez tests pass even with the missing DTDs - they are now
successfully downloaded on demand)
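
For illustration only: the UnboundLocalError in the first traceback is the
error Python raises when a local variable is assigned only inside branches
and none of them runs. The sketch below reproduces that pattern under the
assumption that the branches depend on element names collected from a DTD.
It is not the actual startElement code, and the function and parameter names
are hypothetical.

```python
def start_element(name, lists, dictionaries):
    """Pick a container for an XML element from sets built from a DTD."""
    if name in lists:
        obj = []
    elif name in dictionaries:
        obj = {}
    # No else branch: if the DTD was never parsed, both sets are empty,
    # obj is never bound, and the comparison below raises
    # UnboundLocalError.
    if obj != "":
        return obj


def start_element_safe(name, lists, dictionaries):
    """Same choice, with an explicit default for unknown elements."""
    if name in lists:
        obj = []
    elif name in dictionaries:
        obj = {}
    else:
        obj = ""  # assumed default: treat unknown elements as plain text
    return obj
```

With empty sets, start_element("Unknown", set(), set()) raises
UnboundLocalError, while start_element_safe returns the default. Whether an
empty-string default is the right behaviour for Bio.Entrez is a separate
question; the sketch only shows how the error can arise.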

