[Bioperl-l] Bug in genbank parsing: CONTIG gaps

Hilmar Lapp hlapp at gmx.net
Thu May 4 16:30:05 UTC 2006


An infinite loop on a file you can download (i.e., as opposed to a file
you tinkered with) is never ok. Could you file this as a bug report,
and ideally attach your patch?

Thanks,

	-hilmar

On May 2, 2006, at 11:31 PM, Michael Rogoff wrote:

>
> I've encountered a pretty serious bug in Bio::SeqIO when parsing certain
> genbank files that contain CONTIG entries with gaps.  One such record is
> NW_925173.
>
> When I try to parse this file using Bio::SeqIO::genbank, it will enter an
> infinite loop and spin until it runs out of memory.
>
> I'm pretty certain it relates to this bug:
> http://bugzilla.bioperl.org/show_bug.cgi?id=1319 which seems to indicate
> that genbank records with CONTIG gaps are not valid and can't be parsed.
> But this bug actually claims to be fixed, which is strange, since looking
> at the code for FTLocationFactory (where the loop is) it's still right
> there.  I assume that this may be fixed in other contexts but is still not
> fixed in Bio::SeqIO::genbank?  Or am I doing something wrong?
>
> I think that this should probably be filed as an open bug.  I would think
> that even if bioperl isn't interested in parsing this type of file via
> SeqIO, certainly you'd want to ensure that no finite input file would send
> the parser into an infinite loop.  Have others encountered this problem?
> Is there any plan to address it?
>
> Thanks very much for any information or help!
>
> -Mike
>
> P.S.  I've played around with my version of FTLocationFactory and it seems
> to actually work and parse the gaps.  I'm not sure if I've created other
> bugs or if it works in all cases, but at least the parser doesn't die.  I
> also don't know that my hacky code is appropriate for putting back in to
> BioPerl, but I'm happy to provide it if someone wants to check it out
> and/or consider it for checkin.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================







