[Biopython] Problem with Bio.Entrez...
Michiel de Hoon
mjldehoon at yahoo.com
Sat Aug 28 07:47:25 UTC 2010
I think that for Biopython release 1.55, we should silently ignore this error using the fix proposed by Nathan. After the release is out, I suggest we add an optional argument "validate" to the read() and parse() functions, defaulting to True. If validate is True, read()/parse() raise an error when they find elements in the XML that are not declared in the DTD. If validate is False, such elements are silently ignored. This will require some other minor changes to the parser, so I'd like to do this after 1.55 is out.
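To make the proposed switch concrete, here is a minimal sketch of the behaviour using only the standard library. It is not the Bio.Entrez parser: KNOWN_ELEMENTS, ValidationError and this stand-alone read() are hypothetical stand-ins, and the set of names simply plays the role of the elements declared in a DTD.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the element names declared in a DTD.
KNOWN_ELEMENTS = {"DocSum", "Id", "Item"}


class ValidationError(ValueError):
    """Raised when the XML contains an element the DTD does not declare."""


def _walk(elem, validate, out):
    """Append (tag, text) for elem and its descendants to out."""
    if elem.tag not in KNOWN_ELEMENTS:
        if validate:
            raise ValidationError(
                "element <%s> is not declared in the DTD" % elem.tag
            )
        return  # silently skip the undeclared element and its children
    out.append((elem.tag, (elem.text or "").strip()))
    for child in elem:
        _walk(child, validate, out)


def read(handle, validate=True):
    """Parse XML from handle; validate=True makes unknown elements an error."""
    root = ET.parse(handle).getroot()
    out = []
    _walk(root, validate, out)
    return out


# <Bogus> is not in KNOWN_ELEMENTS, so it is dropped when validate=False.
SAMPLE = "<DocSum><Id>123</Id><Bogus>x</Bogus></DocSum>"
print(read(io.StringIO(SAMPLE), validate=False))
# prints [('DocSum', ''), ('Id', '123')]
```

With the default validate=True, the same input raises ValidationError instead, which matches the "error by default, opt out explicitly" behaviour described above.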
--Michiel.
--- On Fri, 8/27/10, Nathan Edwards <nje5 at georgetown.edu> wrote:
> From: Nathan Edwards <nje5 at georgetown.edu>
> Subject: Re: [Biopython] Problem with Bio.Entrez...
> To: "Peter" <biopython at maubp.freeserve.co.uk>
> Cc: "Michiel de Hoon" <mjldehoon at yahoo.com>, biopython at lists.open-bio.org
> Date: Friday, August 27, 2010, 12:10 PM
>
> > I'd suggest issuing a warning for the bad element, rather than silently
> > ignoring it.
>
> Except that I find the warning infrastructure pretty hard
> to use when I need to turn something off.
>
> Another alternative is to provide a switch to choose between
> throwing an exception (probably the default) or silently
> coping.
>
> >> And, given the frequency with which NCBI seems to break these things,
> >> I _do_ prefer the "ignore it" strategy, if it works. :-)
> >
> > Nathan - have you notified the NCBI about this? I assume you would get
> > an error putting the XML through a validator - if you haven't already done
> > so that would be worthwhile. Or would you rather one of us contact them?
>
> Yes, NCBI has been notified and I just received a response
> that the developers have been notified.
>
> > I will bring this to our developers' attention. However, I will not
> > be able to provide you with any ETA of the correction/fix or other
> > comparable actions at this time.
>
> Visual inspection is sufficient to verify the element is
> not mentioned in the DTD, though I could fire up a
> validating parser I guess. Just did and it confirms the
> error.
>
> - n
>
> -- Dr. Nathan Edwards
> nje5 at georgetown.edu
> Department of Biochemistry and Molecular & Cellular Biology
> Georgetown University Medical Center
>
> Room 1215, Harris Building
> 3300 Whitehaven St, NW
> Washington DC 20007
> Phone: 202-687-7042
> Fax: 202-687-0057
>
> Room 347, Basic Science
> 3900 Reservoir Road, NW
> Washington DC 20007
> Phone: 202-687-1618
> Fax: 202-687-7186
>