[Biopython] Error while parsing gbk file

Peter Cock p.j.a.cock at googlemail.com
Fri Jul 20 10:29:33 UTC 2012


On Fri, Jul 20, 2012 at 4:56 AM, ning luwen <bioinformaticsing at gmail.com> wrote:
> Hi Bow,
>
>       Thank you for your reply. A patch by Lenna stops the parse
> from being interrupted.
>
>       PS: these gbk files were recently downloaded from
> ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/H_sapiens/ (with the extension
> gbs.gz), and the file containing the "invalid GenBank annotation" is
> ftp://ftp.ncbi.nih.gov/refseq/H_sapiens/H_sapiens/CHR_02/hs_ref_GRCh37.p5_chr2.gbs.gz

Note that the original bug report referred to a slightly different
part/revision of this chromosome (p2 rather than p5), but it is the same
issue that was reported earlier:
https://redmine.open-bio.org/issues/3175
ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_02/hs_ref_GRCh37.p2_chr2.gbk.gz

I have now committed Lenna's fix, which means this file now parses
with a warning about the problem features (which get None as their
location):

https://github.com/biopython/biopython/commit/bc733da09051ca53ad4515ac2d971ff0839a71ba
https://github.com/biopython/biopython/commit/4bf78f72682f0500e93c410f8108891dade88ff8
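To see which features are affected after the fix, something along these
lines may help. It is only a rough sketch: it assumes the file has already
been decompressed, the function names are made up for illustration, and
the exact warning text is not reproduced here.

```python
import warnings


def features_without_location(record):
    """Return the features of one record whose location is None."""
    return [f for f in record.features if f.location is None]


def report_problem_features(path):
    """Parse a GenBank file and report features that have no location.

    'path' is assumed to point at an uncompressed GenBank file.
    """
    # Imported here so the helper above can be used without Biopython.
    from Bio import SeqIO

    # Collect parser warnings instead of letting them print as they occur.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with open(path) as handle:
            for record in SeqIO.parse(handle, "genbank"):
                bad = features_without_location(record)
                print(record.id, len(bad), "feature(s) without a location")
                for feature in bad:
                    print("   ", feature.type)

    for w in caught:
        print("warning:", w.message)
```

Calling report_problem_features() with the path to the decompressed file
should list each record together with any features that were given None
as their location, followed by the warnings raised during parsing.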

Ning, if you would like to test this fix, the simplest way is to get the
latest source code from GitHub and reinstall Biopython. You can
either use the git tool at the command line, or download a tarball
from GitHub: https://github.com/biopython/biopython/tarball/master

(Please ask if you need more guidance with this)
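After reinstalling, a quick way to check which Biopython your Python is
picking up is sketched below. It assumes Python 3.8 or later (for
importlib.metadata) and that the package is registered under the name
"biopython"; the function name is just for illustration, and what version
string the development code reports is not something this sketch predicts.

```python
from importlib import metadata


def installed_biopython_version():
    """Return the installed Biopython version string, or None if absent."""
    try:
        # "biopython" is assumed to be the registered distribution name.
        return metadata.version("biopython")
    except metadata.PackageNotFoundError:
        return None


print("Biopython version:", installed_biopython_version())
```

If this prints None, the reinstall did not end up in the Python
environment you are running.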

Regards,

Peter


