[Biopython-dev] [biopython] Missing DTD files (#260)
Michiel de Hoon
mjldehoon at yahoo.com
Mon Dec 9 06:33:01 UTC 2013
Currently we are using os.path.expanduser('~')/.biopython/Bio/Entrez/DTDs to look for locally stored DTDs.
This should work on Windows also.
Then I would suggest the following if a DTD file is missing:
1) Print a non-scary warning message that we will attempt to download the DTD;
2) Download the DTD;
3) Try to store it in the local DTD directory. If this fails (e.g. due to file permissions or whatnot), print another warning message;
4) Use the downloaded DTD to parse the XML.
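To make the four steps concrete, here is a rough sketch of how the fallback might look. This is only an illustration, not the actual Bio.Entrez code: the names get_dtd, _download and LOCAL_DTD_DIR are hypothetical, the DTD URL is assumed to be known to the caller, and the download function is passed in as a parameter so the logic can be exercised without network access.

```python
import os
import warnings
from urllib.request import urlopen

# Assumed cache location, following the path mentioned above.
LOCAL_DTD_DIR = os.path.join(
    os.path.expanduser("~"), ".biopython", "Bio", "Entrez", "DTDs"
)


def _download(url):
    """Fetch the raw bytes at url (hypothetical helper)."""
    with urlopen(url) as handle:
        return handle.read()


def get_dtd(filename, url, local_dir=LOCAL_DTD_DIR, download=_download):
    """Return the DTD contents as bytes, preferring the local copy."""
    path = os.path.join(local_dir, filename)
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read()
    # Step 1: non-scary warning that a download will be attempted.
    warnings.warn(
        "DTD file %s not found locally; attempting to download it from %s"
        % (filename, url)
    )
    # Step 2: download the DTD.
    data = download(url)
    # Step 3: try to store it locally; warn again if that fails.
    try:
        os.makedirs(local_dir, exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError as err:
        warnings.warn(
            "Could not save DTD file %s in %s: %s" % (filename, local_dir, err)
        )
    # Step 4: hand the downloaded DTD back for parsing the XML.
    return data
```

A failure in step 3 only produces a second warning, so parsing can still proceed with the in-memory copy.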
Any final objections?
Best,
-Michiel.
--------------------------------------------
On Tue, 12/3/13, Peter Cock <p.j.a.cock at googlemail.com> wrote:
Subject: Re: [Biopython-dev] [biopython] Missing DTD files (#260)
To: "Michiel de Hoon" <mjldehoon at yahoo.com>
Cc: "Biopython-Dev Mailing List" <biopython-dev at lists.open-bio.org>
Date: Tuesday, December 3, 2013, 5:38 AM
On Sun, Dec 1, 2013 at 3:28 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> How would people feel about Biopython always downloading DTD files
> on the fly instead of distributing them with Biopython?
>
> After downloading and parsing a DTD file, we can keep it in memory
> so we won't need to parse the same DTD file over and over again.
> So the impact on speed will be minimal.
>
> If we do so, we'll never run into the problem of missing DTD files. The
> downside of course is that we will need internet access to parse any
> XML file through Bio.Entrez. But maybe in today's world that is acceptable.
Requiring network access would be annoying for offline work
(e.g. how we usually run the automated tests), but most of the
NCBI Entrez XML files will (I expect) be downloaded and
immediately parsed. So for usability this seems OK.
Automatic caching to disk (without a scary warning) seems like a
better idea than always downloading the DTD files on demand
(which seems wasteful of bandwidth and more likely to give
intermittent errors), although as you have noted before there
is the open question of where to put these files (including where
on Windows):
http://lists.open-bio.org/pipermail/biopython-dev/2010-October/008310.html
Regards,
Peter