[Biopython] Skipping over blank/erroneous Entrez.esummary() results

Michiel de Hoon mjldehoon at yahoo.com
Wed Oct 7 08:19:01 EDT 2009


> In addition to Michiel's workaround, I checked in a small change
> which could at least circumvent the error you are reporting:
> 
> http://github.com/biopython/biopython/commit/4dca8a24f62a1c28556d4e58f34db66f4b099279

Sorry, but that change introduces two bugs. First, we need to be able to distinguish between a genuine value of -1 and a missing value. More importantly, we want to be able to add attributes to the value. Since -1 is a plain integer rather than an object that can carry attributes, it won't allow that.
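
To make the two problems concrete, here is a minimal sketch, not the actual parser code. The IntegerElement below is only an illustrative stand-in (an int subclass) for the class named in the traceback. MissingElement, can_set_attributes and parse_integer are hypothetical names invented for this example and are not part of Biopython.

```python
class IntegerElement(int):
    """Illustrative stand-in: an int subclass, so instances can carry attributes."""


class MissingElement:
    """Hypothetical marker for an empty field; not part of Biopython."""

    def __repr__(self):
        return "MissingElement()"


def can_set_attributes(value):
    """Return True if an 'attributes' dict can be attached to value."""
    try:
        value.attributes = {}
    except AttributeError:
        # Built-in ints have no instance __dict__, so assignment fails.
        return False
    return True


def parse_integer(text):
    """Parse an integer field, keeping empty text distinct from any real number."""
    if text.strip() == "":
        return MissingElement()
    return IntegerElement(text)
```

With these definitions, can_set_attributes(-1) is False while can_set_attributes(IntegerElement(-1)) is True, and parse_integer("") never compares equal to a real -1 returned by parse_integer("-1"). Whether a separate marker class is the right fix for the parser is an open design question; the sketch only shows why a bare -1 falls short.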

Can you revert this change?

--Michiel

--- On Wed, 10/7/09, Brad Chapman <chapmanb at 50mail.com> wrote:

> From: Brad Chapman <chapmanb at 50mail.com>
> Subject: Re: [Biopython] Skipping over blank/erroneous Entrez.esummary() results
> To: "Austin Davis-Richardson" <harekrishna at gmail.com>
> Cc: biopython at lists.open-bio.org
> Date: Wednesday, October 7, 2009, 7:17 AM
> Hi Austin;
> 
> > I'm using BioPython to generate a table of accession numbers and their
> > corresponding TaxIDs.  The fastest way I can do this is 20 at a time
> > (20 per 3 seconds rather than 1 per 3 seconds).
> > 
> > However, this results in a problem.
> > 
> > whenever my script receives a result from NCBI that is blank such as
> > there being no value for TaxID, BioPython crashes with the error:
> > 
> >   File "taxcollector3.py", line 39, in getTaxID
> >     record = Entrez.read(handle)
> >   File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/__init__.py", line 259, in read
> >     record = handler.run(handle)
> >   File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/Parser.py", line 90, in run
> >     self.parser.ParseFile(handle)
> >   File "/Users/audy/Downloads/biopython-1.52/build/lib.macosx-10.6-universal-2.6/Bio/Entrez/Parser.py", line 191, in endElement
> >     value = IntegerElement(value)
> > ValueError: invalid literal for int() with base 10: ''
> 
> In addition to Michiel's workaround, I checked in a small change
> which could at least circumvent the error you are reporting:
> 
> http://github.com/biopython/biopython/commit/4dca8a24f62a1c28556d4e58f34db66f4b099279
> 
> It affects only one file, so if you don't want to pull the latest
> from GitHub, you can download just that file and replace it in your
> Biopython library:
> 
> http://github.com/biopython/biopython/blob/master/Bio/Entrez/Parser.py
> 
> Ideally, we should have a test case to cover this. Could you let us
> know specific GIs that are causing the problem? The group of 20 is
> fine if you haven't narrowed it further than that. This'll also help
> us check if there are any other problems with these records.
> 
> Thanks for reporting this,
> Brad
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
> 


      


