[Bioperl-l] Bio::SeqIO::tinyseq

Dave Howorth dhoworth at mrc-lmb.cam.ac.uk
Wed Jan 28 06:24:57 EST 2004


Heikki Lehvaslaiho wrote:
> The best way to do this is to ignore the root level of the xml, use perl to 
> parse entries out of it, and pass entry xml only to the parser. This keeps 
> the memory usage down and you can parse as large file as you want.

Hmmm, it may be pragmatic but I'm sure there are other meanings of 
'best'.  By throwing away the root level you're losing any chance to 
validate the document or use other XML tools and you run the risk of 
making assumptions ...

>     local $/ = "</seqDiff>\n";

... There's no reason to expect there will ALWAYS be a newline following 
the tag in a valid XML file (suppose it's created by some XSLT tool that 
cares nothing about readability).  You're making an assumption above and 
beyond the specification about how the XML is represented.
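
If you do go the record-splitting route, here is a minimal sketch of one way to drop the newline assumption: set the separator to the closing tag alone and trim whatever precedes the opening tag. The read_entries helper and the sample data are made up for illustration, and this still assumes the closing tag is written exactly as "</seqDiff>" and never appears inside a comment or CDATA section, so it is no substitute for a real XML parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return each
# <seqDiff>...</seqDiff> chunk read from a filehandle, without
# assuming any particular whitespace after the closing tag.
sub read_entries {
    my ($fh) = @_;
    local $/ = "</seqDiff>";    # separator is the tag alone, no "\n"
    my @entries;
    while (my $chunk = <$fh>) {
        # Keep only the entry itself; this drops leading material such as
        # the root start tag, and skips the final chunk (root end tag).
        next unless $chunk =~ m{(<seqDiff\b.*</seqDiff>)\z}s;
        push @entries, $1;
    }
    return @entries;
}

# Made-up sample input with no newlines between entries.
my $xml = '<root><seqDiff id="1">a</seqDiff><seqDiff id="2">b</seqDiff></root>';
open my $fh, '<', \$xml or die "cannot open in-memory file: $!";
my @entries = read_entries($fh);
close $fh;

print scalar(@entries), " entries found\n";
```

Each element of @entries could then be handed to the entry-level parser as before, and memory use stays bounded by the size of one entry.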

Cheers, Dave


