[Bioperl-l] Bug in genbank parsing: CONTIG gaps

Chris Fields cjfields at uiuc.edu
Thu May 4 18:40:32 UTC 2006


Are you using the CONTIG record or the full GenBank file? I see problems
with both (using bioperl-live), and they seem unrelated to one another. The
full file seems to just run a bit slowly because the full GenBank record is
huge (~55 MB), but the CONTIG file does exactly what you described (it runs
out of memory).
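
On the point in the quoted message that no finite input should send the
parser into an infinite loop: one general way to guarantee that is to check
that every pass through the parsing loop consumes input. Below is a minimal,
self-contained sketch in plain Perl. It is not BioPerl code and not the
actual FTLocationFactory logic or fix; the function name
split_contig_elements and the accessions in the example are made up for
illustration, and it only handles flat join(...) lists of spans,
complement(...) spans and gap(...) elements.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split the body of a CONTIG
# join(...) expression into its top-level elements.
# Each loop iteration must advance pos(); anything unrecognised raises an
# error instead of being retried, so a finite input always terminates.
sub split_contig_elements {
    my ($contig) = @_;
    my ($body) = $contig =~ /^\s*join\((.*)\)\s*$/s
        or die "not a join(...) expression: $contig\n";

    my $span = qr/[\w.]+:\d+\.\.\d+/;    # e.g. ACC.1:1..3000
    my @elements;
    pos($body) = 0;
    while (pos($body) < length $body) {
        my $start = pos($body);
        if ($body =~ /\G\s*(gap\((?:unk)?\d*\))\s*(?:,|\z)/gc) {
            push @elements, { type => 'gap', text => $1 };
        }
        elsif ($body =~ /\G\s*(complement\($span\)|$span)\s*(?:,|\z)/gc) {
            push @elements, { type => 'span', text => $1 };
        }
        else {
            die "unparseable CONTIG element at offset $start\n";
        }
        # Extra guard: refuse to continue if nothing was consumed.
        die "parser made no progress at offset $start\n"
            if pos($body) <= $start;
    }
    return @elements;
}

# Example input with made-up accessions.
my @parts = split_contig_elements(
    'join(AAAA01000001.1:1..3000,gap(100),'
  . 'complement(AAAA01000002.1:1..5000),gap(unk100))'
);
printf "%-4s %s\n", $_->{type}, $_->{text} for @parts;
```

The /gc modifiers keep pos() where it was when a match fails, so the else
branch sees the same offset and can report it rather than looping on it.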

Chris

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Michael Rogoff
> Sent: Tuesday, May 02, 2006 10:32 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bug in genbank parsing: CONTIG gaps
> 
> 
> I've encountered a pretty serious bug in Bio::SeqIO when parsing certain
> genbank files that contain CONTIG entries with gaps.  One such record is
> NW_925173.
> 
> When I try to parse this file using Bio::SeqIO::genbank, it will enter an
> infinite loop and spin until it runs out of memory.
> 
> I'm pretty certain it relates to this bug:
> http://bugzilla.bioperl.org/show_bug.cgi?id=1319 which seems to indicate
> that genbank records with CONTIG gaps are not valid and can't be parsed.
> But this bug actually claims to be fixed, which is strange, since looking
> at the code for FTLocationFactory (where the loop is) it's still right
> there.  I assume that this may be fixed in other contexts but is still not
> fixed in Bio::SeqIO::genbank?  Or am I doing something wrong?
> 
> I think that this should probably be filed as an open bug.  I would think
> that even if bioperl isn't interested in parsing this type of file via
> SeqIO, certainly you'd want to ensure that no finite input file would send
> the parser into an infinite loop.  Have others encountered this problem?
> Is there any plan to address it?
> 
> Thanks very much for any information or help!
> 
> -Mike
> 
> P.S.  I've played around with my version of FTLocationFactory and it seems
> to actually work and parse the gaps.  I'm not sure if I've created other
> bugs or if it works in all cases, but at least the parser doesn't die.  I
> also don't know that my hacky code is appropriate for putting back in to
> BioPerl, but I'm happy to provide it if someone wants to check it out
> and/or consider it for checkin.
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l



