[BioPython] Creating a parser for Quantarray data?
Andrew Dalke
dalke at dalkescientific.com
Tue Aug 26 14:25:16 EDT 2003
Peter Wilkinson:
> I have been looking through CVS for one, and I don't see any. Is it
> worth creating a parser within the parser framework within biopython
> (Martel), or shall I build something separately?
>
> These can be large files, and I would want to implement something that
> is efficient.
>
> How would Martel handle a 20M Record like a Quantarray file? When I
> was parsing genomic Genbank files (Bacteria), the Genbank parser's
> performance started to suffer ...
Yeah, Martel is poor that way. I've got the RecordReaders as a
workaround for when a single record is small enough. Otherwise,
there's about a 5x memory overhead.
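
To illustrate the record-at-a-time idea behind that workaround: split the
input into records first and hand each one to the parser, so only one
record is held in memory at a time. This is a minimal sketch, not Martel's
RecordReader API. The helper name iter_records is hypothetical, and it
assumes each record ends with a line containing only "//", as in GenBank
flat files.

```python
import io

def iter_records(handle, terminator="//"):
    """Yield one record at a time as a single string.

    Hypothetical helper for illustration; not part of Martel.
    Assumes each record ends with a line equal to `terminator`.
    """
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip("\r\n") == terminator:
            # End of record: hand it off and start collecting the next one.
            yield "".join(lines)
            lines = []
    if lines:
        # Trailing text with no terminator is yielded as a final record.
        yield "".join(lines)

# Tiny in-memory example standing in for a real file handle.
sample = io.StringIO("LOCUS A\nORIGIN\n//\nLOCUS B\nORIGIN\n//\n")
records = list(iter_records(sample))
```

This only helps when each record is small. A file that is one huge record
still arrives as one huge string.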
I've got some highly experimental code which fixes part of the problem
(it ended up being a pure-Python regexp engine) but it isn't usable and
has problems of its own.
So for this task, it's likely best you write your own parser.
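
A hand-written parser for a single large record can stay efficient by
reading one line at a time and yielding rows as it goes, instead of
loading the whole file. The sketch below shows that approach under stated
assumptions: the "Begin Data" / "End Data" markers, the tab-delimited
layout with a header row, and the column names in the sample are all
illustrative guesses, so check them against a real Quantarray file. The
function name iter_data_rows is hypothetical.

```python
import io

def iter_data_rows(handle, begin="Begin Data", end="End Data"):
    """Yield each data row as a dict, reading one line at a time.

    The section markers and the tab-delimited layout are assumptions
    made for illustration, not a documented file specification.
    """
    in_section = False
    header = None
    for line in handle:
        text = line.rstrip("\r\n")
        if not in_section:
            # Skip everything until the data section starts.
            if text.strip() == begin:
                in_section = True
            continue
        if text.strip() == end:
            return
        if not text.strip():
            continue
        fields = text.split("\t")
        if header is None:
            # First non-blank line of the section names the columns.
            header = fields
            continue
        yield dict(zip(header, fields))

# Small in-memory sample with made-up column names and values.
sample = io.StringIO(
    "Some header line\n"
    "Begin Data\n"
    "Number\tName\tch1 Intensity\n"
    "1\tgeneA\t1234.5\n"
    "2\tgeneB\t67.0\n"
    "End Data\n"
    "trailing\n"
)
rows = list(iter_data_rows(sample))
total = sum(float(r["ch1 Intensity"]) for r in rows)
```

Because the function is a generator, memory use depends on the longest
line rather than on the file size, as long as the caller consumes rows
one by one instead of collecting them into a list as the sample does.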
*sigh*
Anyone want to fund me for the month or two it will take to finish
up Martel? :)
Andrew
dalke at dalkescientific.com