[Bioperl-l] Failed tests for FeatureIO

Torsten Seemann torsten.seemann at infotech.monash.edu.au
Mon Sep 25 22:15:13 UTC 2006


> PS: is there any point to using mode() here?  As Sendu points out, no 
> other Bioperl modules use it.  Just curious...

Many file formats that people use Bioperl to parse have this layout:

	HEADER
	LINE(S) FOR FEATURE 1
	LINE(S) FOR FEATURE 2
	... ETC
	END

Most of the modules (e.g. Bio::Tools::*) don't parse HEADER until the 
first call of next_feature(), kind of like a pull parser. Unfortunately, 
these same modules have methods that return data from the HEADER (e.g. 
sequence_name), and those return undef if you haven't read the first 
feature yet...
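
To make the problem concrete, here is a minimal sketch. It is written in 
Python purely for illustration and is not Bioperl code: the class 
LazyParser, its one-line header, and the toy file layout are all 
hypothetical stand-ins for a Bio::Tools-style pull parser.

```python
import io


class LazyParser:
    """Hypothetical pull parser: the header is read on the first next_feature()."""

    def __init__(self, handle):
        self._fh = handle
        self._sequence_name = None  # unknown until the header has been read

    def sequence_name(self):
        return self._sequence_name

    def next_feature(self):
        if self._sequence_name is None:
            # First call: consume the (assumed single-line) header first.
            self._sequence_name = self._fh.readline().strip()
        line = self._fh.readline().strip()
        if not line or line == "END":
            return None
        return line


parser = LazyParser(io.StringIO("seq1\nfeatA\nfeatB\nEND\n"))
before = parser.sequence_name()  # header not parsed yet, so nothing to return
first = parser.next_feature()
after = parser.sequence_name()   # now populated as a side effect
```

The accessor only becomes useful as a side effect of asking for a 
feature, which is the surprise described above.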

My approach was to detect in _initialize() whether the file handle was 
open for reading and, if so, to parse the header right there, before the 
first invocation of next_feature(). That way the HEADER-related methods 
return the correct values no matter when they are called. It also makes 
the next_feature() implementation cleaner. The mode() detection is 
needed because the file/handle could actually be open for 
write_feature() instead.
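
Roughly, the idea looks like the sketch below. Again this is Python for 
illustration only, not the actual module: EagerParser is a hypothetical 
name, and the handle's readable() check merely stands in for the mode() 
test.

```python
import io
import os


class EagerParser:
    """Hypothetical parser that reads the header up front, for readable handles only."""

    def __init__(self, handle):
        self._fh = handle
        self._sequence_name = None
        # Stand-in for the mode() check: never try to read a write-only handle.
        if handle.readable():
            self._sequence_name = handle.readline().strip()

    def sequence_name(self):
        return self._sequence_name

    def next_feature(self):
        # No header bookkeeping needed here any more.
        line = self._fh.readline().strip()
        if not line or line == "END":
            return None
        return line


reader = EagerParser(io.StringIO("seq1\nfeatA\nEND\n"))
name_before_any_feature = reader.sequence_name()

with open(os.devnull, "w") as out:
    writer = EagerParser(out)  # write-only handle: header parsing is skipped
    writer_name = writer.sequence_name()
```

Without the readability check, constructing the object on an output 
handle would attempt to read a header that does not exist.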

If this is silly, I'll gladly accept advice for alternatives :-)

A related question: the header of a .ptt file has a line stating how 
many features follow it. How does this fit into the Bio::FeatureIO 
model? I guess I would have to buffer up all the write_feature() calls, 
detect when the FeatureIO object is destroyed or goes out of scope, and 
only then write the header and the buffered features out?
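
What I have in mind is something like the following sketch (Python for 
illustration, not Bio::FeatureIO code). BufferedFeatureWriter is a 
hypothetical name and the "N features" header line is made up, not the 
real .ptt header. Note that it uses an explicit close() and a context 
manager rather than relying on object destruction, which is a 
substitution on my part.

```python
import io


class BufferedFeatureWriter:
    """Hypothetical writer that holds features in memory until close()."""

    def __init__(self, handle):
        self._fh = handle
        self._features = []
        self._closed = False

    def write_feature(self, feature):
        # Nothing is written yet: the header needs the final count.
        self._features.append(feature)

    def close(self):
        if self._closed:
            return  # make repeated close() calls harmless
        self._closed = True
        # Made-up header format: a single line carrying the feature count.
        self._fh.write(f"{len(self._features)} features\n")
        for feature in self._features:
            self._fh.write(feature + "\n")

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


out = io.StringIO()
with BufferedFeatureWriter(out) as writer:
    writer.write_feature("featA")
    writer.write_feature("featB")
result = out.getvalue()
```

The obvious cost is that every feature stays in memory until the end, 
which may matter for large outputs.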

Any advice appreciated,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia



