[Biopython-dev] [biopython] Missing DTD files (#260)
Peter Cock
p.j.a.cock at googlemail.com
Tue Dec  3 05:38:43 EST 2013

On Sun, Dec 1, 2013 at 3:28 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> How would people feel about Biopython always downloading DTD files
> on the fly instead of distributing them with Biopython?
>
> After downloading and parsing a DTD file, we can keep it in memory
> so we won't need to parse the same DTD file over and over again.
> So the impact on speed will be minimal.
>
> If we do so, we'll never run into the problem of missing DTD files. The
> downside of course is that we will need internet access to parse any
> XML file through Bio.Entrez. But maybe in today's world that is acceptable.

Requiring network access would be annoying for offline work
(e.g. how we usually run the automated tests), but most of the
NCBI Entrez XML files will (I expect) be downloaded and
immediately parsed. So for usability this seems OK.

Automatic caching to disk (without a scary warning) seems like a
better idea than always downloading the DTD files on demand
(which seems wasteful of bandwidth and more likely to give
intermittent errors), although as you have noted before there
is the open question of where to put these files (including where
on Windows):
http://lists.open-bio.org/pipermail/biopython-dev/2010-October/008310.html
Regards,
Peter
    
    