[Biopython-dev] [Bug 2678] Bio.Entrez module does not always retrieve or find DTD files
bugzilla-daemon at portal.open-bio.org
Sat Dec 13 20:19:15 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2678
------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk 2008-12-13 15:19 EST -------
(In reply to comment #6)
> If the DTD is available locally in Bio/Entrez/DTDs, then Bio.Entrez will read
> it from there. If not, it tries to download it. This may fail if the servers
> are busy. If the needed DTDs are saved in Bio/Entrez/DTDs (and installed when
> Biopython is installed), you won't run into this problem.
I was just looking at this on my Windows XP Python 2.3 machine, and when it
tried to download missing DTD files it was just using a filename as the URL.
I've committed a fix to CVS which should resolve this:
biopython/Bio/Entrez/Parser.py revision 1.3
I'll double check this on Linux/Mac next week.
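
For illustration only, the lookup order described above (use the local copy if present, otherwise download) and the filename-as-URL problem can be sketched as below. This is not the code in Bio/Entrez/Parser.py: the function name resolve_dtd, the LOCAL_DTD_DIR constant and the example.org base URL are all hypothetical, and the sketch is written for current Python 3 rather than the Python 2.3 used in this report. The point is only that a bare filename has to be joined against a base URL before it can be fetched.

```python
import os
import urllib.parse

# Hypothetical location; stands in for wherever the DTDs are installed.
LOCAL_DTD_DIR = os.path.join("Bio", "Entrez", "DTDs")


def resolve_dtd(system_id, base_url, local_dir=LOCAL_DTD_DIR):
    """Return ("local", path) if the DTD is on disk, else ("remote", url).

    system_id may be a full URL or a bare filename such as
    "xhtml1-strict.dtd"; base_url is what a bare name is resolved against.
    """
    # Take just the filename part so full URLs and bare names are treated alike.
    filename = os.path.basename(urllib.parse.urlparse(system_id).path)
    local_path = os.path.join(local_dir, filename)
    if os.path.exists(local_path):
        return ("local", local_path)
    # urljoin leaves an absolute URL untouched and completes a bare filename,
    # so the filename alone is never used as the download URL.
    return ("remote", urllib.parse.urljoin(base_url, system_id))
```

With a placeholder base of "http://example.org/dtds/", a missing "xhtml1-strict.dtd" resolves to a complete URL under that base, while a system identifier that is already an absolute URL is passed through unchanged.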
This may be related to Leighton's problem, although 'xhtml1-strict.dtd' and
'xhtml-lat1.ent' are not NCBI DTD files but rather part of the W3C XHTML 1.0
specification.
Note that if I delete all the Bio/Entrez/DTDs/* files, then test_Entrez.py
fails. I get warning messages about downloading missing DTD files, and the
following failures:
======================================================================
ERROR: Test parsing pubmed links returned by ELink (fifth test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 2523, in t_pubmed5
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 131, in startElement
    if object!="":
UnboundLocalError: local variable 'object' referenced before assignment
======================================================================
ERROR: Test parsing XML returned by EFetch, PubMed database (first test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 3058, in t_pubmed1
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
ExpatError: syntax error: line 1, column 0
======================================================================
ERROR: Test parsing XML returned by EFetch, PubMed database (second test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 3261, in t_pubmed2
    record = Entrez.read(input)
  File "c:\python23\Lib\site-packages\Bio\Entrez\__init__.py", line 286, in read
    record = handler.run(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 95, in run
    self.parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
  File "c:\python23\Lib\site-packages\Bio\Entrez\Parser.py", line 294, in external_entity_ref_handler
    parser.ParseFile(handle)
ExpatError: syntax error: line 1, column 0
======================================================================
FAIL: Test parsing pubmed links returned by ELink (sixth test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_Entrez.py", line 2697, in t_pubmed6
    assert len(record[0]["IdCheckList"])==2
AssertionError
----------------------------------------------------------------------
(The rest of the Entrez tests pass even with the missing DTDs - they are now
successfully downloaded on demand)
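
As background on the first error above: an UnboundLocalError of this kind generally means a local name is assigned on only some code paths and then read on a path where it was never assigned. The sketch below shows the general pattern and one common remedy (binding a default first). It is a generic Python illustration with hypothetical function names, not a reconstruction of what Parser.py does at line 131.

```python
def describe(tag):
    # Hypothetical handler: 'value' is only bound when the tag is recognised.
    if tag == "known":
        value = "handled"
    # For any other tag, 'value' was never bound, so this comparison raises
    # UnboundLocalError instead of evaluating to False.
    if value != "":
        return value
    return None


def describe_fixed(tag):
    # Binding a default before the branches makes every path safe to read.
    value = ""
    if tag == "known":
        value = "handled"
    if value != "":
        return value
    return None
```

Calling describe with an unrecognised tag raises UnboundLocalError, whereas describe_fixed returns None for the same input.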
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.