[Biopython-dev] Getting ready for a release, II
Andrew Dalke
dalke at dalkescientific.com
Mon Feb 14 09:09:06 EST 2005
Hi all,
Peter:
> I filed bug 1747 as "major" and feel it renders the GenBank parser
> effectively useless for large genomes.
I saw that bug report when it came in a couple of weeks ago, but I was
busy at a client site.
One of the fundamental problems with this implementation of Martel
is that it parses each record entirely in memory and uses about 4x as
much memory as the record itself. The slowness for large records comes
from hitting swap. It can't be fixed without some non-trivial changes
to Martel; basically a rewrite. If anyone wants to tackle rewriting a
regex engine, I have some comments about what needs to be done. As for
me, I haven't touched the code in years because I haven't needed that
capability and other tasks (including paying work) keep me busy.
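
To make the memory point concrete, here is a rough sketch in plain
Python. It is not Martel code and does not use any Martel or Biopython
API; the function names and the simplified handling of the ORIGIN
block are assumptions made only for illustration. The first function
holds a whole record in memory before passing it on, so its peak
memory grows with record size. The second keeps only one line at a
time, which is the kind of behaviour a rewrite would need for large
genomes.

```python
import io


def iter_records_buffered(handle):
    """Yield each record as one string; the whole record sits in memory."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.startswith("//"):  # GenBank record terminator
            yield "".join(lines)
            lines = []


def count_sequence_streaming(handle):
    """Yield the sequence length of each record, holding one line at a time."""
    in_sequence = False
    length = 0
    for line in handle:
        if line.startswith("ORIGIN"):
            in_sequence = True
            length = 0
        elif line.startswith("//"):
            if in_sequence:
                yield length
            in_sequence = False
        elif in_sequence:
            # Sequence lines look like "        1 gatcctccat ...";
            # drop the leading position number and count the rest.
            length += sum(len(chunk) for chunk in line.split()[1:])


# Hypothetical, heavily trimmed record used only to exercise the sketch.
sample = "LOCUS X\nORIGIN\n        1 gatc ctcc\n       11 at\n//\n"
records = list(iter_records_buffered(io.StringIO(sample)))
lengths = list(count_sequence_streaming(io.StringIO(sample)))
```

The buffered version is simpler to feed into a regex-style matcher,
which is presumably why a whole-record design is attractive, but a
matcher that needs the complete text cannot avoid the memory cost.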
Andrew
dalke at dalkescientific.com