[Biopython-dev] Performance of Bio.File.UndoHandle

Jeffrey Chang jchang at jeffchang.com
Mon Oct 13 00:38:49 EDT 2003


On Friday, October 10, 2003, at 10:11  AM, Michael Hoffman wrote:

> I have long wondered about how much the use of Bio.File.UndoHandle
> slows things down (it has additional checks for every read
> operation). Here are some results:

[cut, reading a file is slower when using an UndoHandle]

> There is about a 150% increase in the amount of time it takes to do
> input using readline() with UndoHandle.

> This kind of increase on basic I/O means much when one is doing big
> jobs, in my opinion. I wasn't volunteering to rewrite anything to not
> use UndoHandle but people might consider it when writing future
> stuff. And I might try rewriting some stuff anyway. Any thoughts?

The UndoHandle adds overhead to every readline call: an extra if check
plus an additional Python-level function call before the underlying
handle's readline runs.

     def readline(self, *args, **keywds):
         if self._saved:
             line = self._saved.pop(0)
         else:
             line = self._handle.readline(*args,**keywds)
         return line

Also, passing *args and **keywds may incur another performance penalty, 
but I don't know how much.
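
One way to find out would be to time the two variants side by side. The
following is only a rough sketch, assuming a current Python with io and
timeit available. UndoHandleLike, NoStarArgs, read_all and time_reader are
hypothetical names made up for this comparison and are not part of
Biopython; the first class just mirrors the readline shown above.

```python
import io
import timeit


class UndoHandleLike:
    """Stand-in that mirrors the readline above (not the Biopython class)."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []

    def readline(self, *args, **keywds):
        if self._saved:
            line = self._saved.pop(0)
        else:
            line = self._handle.readline(*args, **keywds)
        return line


class NoStarArgs(UndoHandleLike):
    """Same logic, but without forwarding *args and **keywds."""

    def readline(self):
        if self._saved:
            line = self._saved.pop(0)
        else:
            line = self._handle.readline()
        return line


def read_all(wrapper_cls, text):
    """Read every line of text through wrapper_cls and count the lines."""
    handle = wrapper_cls(io.StringIO(text))
    count = 0
    while handle.readline():
        count += 1
    return count


TEXT = "ACGT\n" * 20000  # synthetic input, assumed for illustration only


def time_reader(wrapper_cls, repeat=3):
    """Best-of-N wall-clock time, in seconds, for one full pass over TEXT."""
    return min(timeit.repeat(lambda: read_all(wrapper_cls, TEXT),
                             number=1, repeat=repeat))


if __name__ == "__main__":
    for cls in (UndoHandleLike, NoStarArgs):
        print(cls.__name__, time_reader(cls))
```

The absolute numbers will depend on the interpreter version and the
machine, so only the ratio between the two timings is worth looking at.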

The best way to speed this up might be to recode the class in C as an
extension type.  This would help because the if statement would be
evaluated in C, and you could also cache the bound
self._handle.readline method so that the attribute lookup is not
repeated on every call.
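
The caching half of that idea can also be tried in pure Python before
anyone writes C. Below is a minimal sketch, assuming callers never pass
arguments to readline. CachedUndoHandle and its saveline method are
hypothetical names for illustration, not the actual Bio.File code.

```python
import io


class CachedUndoHandle:
    """Hypothetical variant for illustration, not Biopython's implementation."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []
        # Look up the bound method once instead of on every readline call.
        self._readline = handle.readline

    def readline(self):
        if self._saved:
            return self._saved.pop(0)
        return self._readline()

    def saveline(self, line):
        # Push a line back so that the next readline returns it again.
        if line:
            self._saved.insert(0, line)


handle = CachedUndoHandle(io.StringIO(">seq1\nACGT\n"))
first = handle.readline()
handle.saveline(first)          # undo the read
second = handle.readline()      # same line comes back
```

This only removes one attribute lookup and the argument forwarding; the
if check and the extra Python-level call are still there, so a C type
would presumably still be faster.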

Jeff



