[Biopython-dev] swissprot parsing performance comparisons
Andrew Dalke
dalke at acm.org
Wed Jan 10 13:16:46 EST 2001
Jeff:
>What about PySAT?
>http://www.embl-heidelberg.de/~chenna/PySAT/
Thanks for the reminder. I have the distribution, so I'll
test it out as well.
>They have support for SwissProt, and their toolkit has been
>published. IIRC, theirs is an example of a less stringent python
>implementation of a parser.
I recall looking at their code and I agree. It is more like
the Swissknife way of doing things.
>This is an interesting statistic, and surprises me. I wonder what's
>slowing the perl parser, then, since it doesn't use callbacks?
The implementations do a lot of small regex parsing. Martel does
it all at once, and at the C level. That might be the difference.
It is hard to tell since I would need to better understand the
details of the perl implementations.
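
To make the distinction concrete, here is a rough sketch in plain Python
of the two styles: many small regexes applied one line at a time, versus
one pattern that describes the whole record and is matched in a single
call. This is not Martel's API and not the code from any of the Perl
parsers. The record, the three line types, and the function names are
made up for illustration, and a real SWISS-PROT entry has many more line
types than this.

```python
import re

# Hypothetical, heavily simplified record for illustration only.
RECORD = (
    "ID   TEST_HUMAN     STANDARD;      PRT;   120 AA.\n"
    "AC   P12345;\n"
    "DE   HYPOTHETICAL TEST PROTEIN.\n"
    "//\n"
)

# Style 1: many small regexes, tried one after another on each line.
ID_RE = re.compile(r"ID   (\S+)")
AC_RE = re.compile(r"AC   (\w+);")
DE_RE = re.compile(r"DE   (.+)")


def parse_line_by_line(text):
    """Loop over lines in Python and try each small pattern in turn."""
    fields = {}
    for line in text.splitlines():
        for key, pattern in (("id", ID_RE), ("ac", AC_RE), ("de", DE_RE)):
            match = pattern.match(line)
            if match:
                fields[key] = match.group(1)
                break
    return fields


# Style 2: one pattern for the whole record, matched in a single call,
# so the scanning happens inside the regex engine rather than in a
# Python-level loop.
RECORD_RE = re.compile(
    r"ID   (?P<id>\S+)[^\n]*\n"
    r"AC   (?P<ac>\w+);[^\n]*\n"
    r"DE   (?P<de>[^\n]+)\n"
    r"//\n"
)


def parse_whole_record(text):
    """Match the entire record at once and return the named groups."""
    match = RECORD_RE.match(text)
    if match is None:
        raise ValueError("record does not match the expected format")
    return match.groupdict()
```

Both functions return the same fields for this toy record. Whether the
single-pattern style is actually faster on real data is exactly the kind
of thing that would need to be timed rather than assumed.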
>That's embarrassing, since it's supposedly been checked against it! How
>many entries in release 38? Perhaps I need to update mine.
I don't know. I ran it and it failed at a record. I figured out
what was wrong with that record, changed the 1 to 0, and then
everything parsed fine.
>It does seem to match the philosophies of the languages...
True enough, although as you mentioned PySAT is less stringent
and more like the Perl implementations. Something to ponder :)
Andrew
dalke at acm.org