[BioPython] performance problem in ParserSupport.EventGenerator._get_set_flags

Fri Jun 13 10:40:31 EDT 2003

> This is with the patch, right? I just checked out a clean CVS and
> test_GenBank passes fine for me without any changes. We definitely
> need to have it working at the start :-).

That's strange. I checked out a fresh version from CVS into a clean
directory and the test fails. Does it use something outside the build
directory? Maybe I have the wrong version of mxTools? (help(mx) shows
version 2.0.)

> I quickly grepped through and didn't see anything. We should
> probably rename self.flags to self._flags to reflect that you
> shouldn't be using it from external classes if we get rid of
> _get_set_flags.

Yep

> That sounds like it should work. If you send a patch I'm happy to
> try and get it in.

I think we should first find out why the test is failing.

> 10 percent is good. I'm all about it.

But there is still potential in it :-)
At the moment (after my optimization), about 90% of my performance test
goes into Parser._do_callback. Of this, 60% is spent in endElement, 5%
in startElement and 7% in characters. The remaining time is spent in
_do_callback itself and in the recursion. So to get faster we could:
1. make _do_callback itself faster (I don't see how)
2. make endElement faster
3. reduce the recursion somehow(?); function calls are expensive in Python
4. invent some clever algorithm
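
For anyone who wants to collect a similar per-method breakdown, here is a
minimal sketch using the standard cProfile and pstats modules. It is not
the Biopython code: CountingHandler and profile_parse are hypothetical
names, and the synthetic XML document only stands in for real parser
events. It just shows how time in startElement, endElement and characters
can be separated out.

```python
import cProfile
import io
import pstats
import xml.sax


class CountingHandler(xml.sax.ContentHandler):
    """Toy event consumer (hypothetical stand-in, not Biopython code)."""

    def __init__(self):
        super().__init__()
        self.starts = 0
        self.ends = 0
        self.chars = 0

    def startElement(self, name, attrs):
        self.starts += 1

    def endElement(self, name):
        self.ends += 1

    def characters(self, content):
        # Text may arrive in several chunks, so count characters, not calls.
        self.chars += len(content)


def profile_parse(n_items=1000):
    """Parse a synthetic document under cProfile; return (handler, report)."""
    doc = ("<root>" + "<item>x</item>" * n_items + "</root>").encode("ascii")
    handler = CountingHandler()
    profiler = cProfile.Profile()
    profiler.enable()
    xml.sax.parseString(doc, handler)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Restrict the report to the three handler callbacks.
    stats.sort_stats("cumulative").print_stats("Element|characters")
    return handler, out.getvalue()


handler, report = profile_parse()
print(report)
```

The cumulative column of the report is the one to compare against the
percentages above, since it includes time spent in anything the callback
calls in turn.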

My problem with the last two is that I still can't figure out what all
this code is supposed to do (stupid me :-( )

Andreas