[Biopython-dev] [biopython] Missing DTD files (#260)

Michiel de Hoon mjldehoon at yahoo.com
Mon Dec 9 06:33:01 UTC 2013

Currently we are using os.path.expanduser('~')/.biopython/Bio/Entrez/DTDs to look for locally stored DTDs.
This should also work on Windows.
Then I would suggest the following if a DTD file is missing:
1) Print a non-scary warning message that we will attempt to download the DTD;
2) Download the DTD;
3) Try to store it in the local DTD directory. If this fails (e.g. due to file permissions or whatnot), print another warning message;
4) Use the downloaded DTD to parse the XML.
Any final objections?
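
The four steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Bio.Entrez code: the names default_dtd_dir, fetch_url and load_dtd are hypothetical, the URL is supplied by the caller, and the sketch uses Python 3's urllib.request. Parsing (step 4) is left to the caller, which receives the DTD contents as bytes.

```python
import os
import warnings
from urllib.request import urlopen


def default_dtd_dir():
    # Local DTD directory as described in this message (under the user's home).
    return os.path.join(os.path.expanduser("~"), ".biopython", "Bio", "Entrez", "DTDs")


def fetch_url(url):
    # Hypothetical helper: return the raw bytes served at the given URL.
    with urlopen(url) as handle:
        return handle.read()


def load_dtd(filename, url, dtd_dir=None, fetch=fetch_url):
    """Return the DTD contents as bytes, downloading and caching it if missing."""
    if dtd_dir is None:
        dtd_dir = default_dtd_dir()
    path = os.path.join(dtd_dir, filename)
    if os.path.exists(path):
        # Already stored locally: no warning, no download.
        with open(path, "rb") as handle:
            return handle.read()
    # 1) Non-scary warning that a download will be attempted.
    warnings.warn("DTD file %s not found locally; attempting to download it from %s"
                  % (filename, url))
    # 2) Download the DTD.
    data = fetch(url)
    # 3) Try to store it locally; on failure, warn again and carry on.
    try:
        os.makedirs(dtd_dir, exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError as err:
        warnings.warn("Could not save DTD file %s in %s: %s" % (filename, dtd_dir, err))
    # 4) Hand the downloaded DTD back so the caller can use it to parse the XML.
    return data
```

Passing the download function in as the fetch argument keeps the sketch usable offline (for example in the automated tests), since a stand-in that returns canned bytes can be supplied instead.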


On Tue, 12/3/13, Peter Cock <p.j.a.cock at googlemail.com> wrote:

 Subject: Re: [Biopython-dev] [biopython] Missing DTD files (#260)
 To: "Michiel de Hoon" <mjldehoon at yahoo.com>
 Cc: "Biopython-Dev Mailing List" <biopython-dev at lists.open-bio.org>
 Date: Tuesday, December 3, 2013, 5:38 AM
 On Sun, Dec 1, 2013 at 3:28 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
 > How would people feel about Biopython always downloading DTD files
 > on the fly instead of distributing them with
 > After downloading and parsing a DTD file, we can keep it in memory
 > so we won't need to parse the same DTD file over and over again.
 > So the impact on speed will be minimal.
 > If we do so, we'll never run into the problem of missing DTD files. The
 > downside of course is that we will need internet access to parse any
 > XML file through Bio.Entrez. But maybe in today's world that is acceptable.
 Requiring network access would be annoying for offline work
 (e.g. how we usually run the automated tests), but most
 NCBI Entrez XML files will (I expect) be downloaded and
 immediately parsed. So for usability this seems OK.
 Automatic caching to disk (without a scary warning) seems like a
 better idea than always downloading the DTD files on demand
 (which seems wasteful of bandwidth and more likely to give
 intermittent errors), although as you have noted before there
 is the open question of where to put these files (including
 on Windows):
