[Biopython-dev] Performance of Bio.File.UndoHandle
Jeffrey Chang
jchang at jeffchang.com
Mon Oct 13 00:38:49 EDT 2003
On Friday, October 10, 2003, at 10:11 AM, Michael Hoffman wrote:
> I have long wondered about how much the use of Bio.File.UndoHandle
> slows things down (it has additional checks for every read
> operation). Here are some results:
[cut, reading a file is slower when using an UndoHandle]
> There is about a 150% increase in the amount of time it takes to do
> input using readline() with UndoHandle.
> This kind of increase on basic I/O means much when one is doing big
> jobs, in my opinion. I wasn't volunteering to rewrite anything to not
> use UndoHandle but people might consider it when writing future
> stuff. And I might try rewriting some stuff anyway. Any thoughts?
The UndoHandle creates overhead on readline due to its extra if checks
and function calls.
    def readline(self, *args, **keywds):
        if self._saved:
            line = self._saved.pop(0)
        else:
            line = self._handle.readline(*args, **keywds)
        return line
Also, passing *args and **keywds may incur another performance penalty,
but I don't know how much.
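One rough way to put a number on that penalty is to time a wrapper that forwards *args and **keywds against one that calls readline() directly. This is only a sketch: the function names and the in-memory test data are made up for illustration, it uses Python 3's io.StringIO rather than a real file, and the timings will vary by machine and interpreter version.

```python
import io
import timeit

# Hypothetical in-memory input; a real benchmark would use a large file.
DATA = "line\n" * 1000


def read_direct(handle):
    # Calls readline() with no argument forwarding.
    return handle.readline()


def read_forwarding(handle, *args, **keywds):
    # Forwards positional and keyword arguments, as UndoHandle.readline does.
    return handle.readline(*args, **keywds)


def time_reader(reader, repeat=200):
    # Time reading every line of DATA through `reader`, `repeat` times.
    def run():
        handle = io.StringIO(DATA)
        while reader(handle):
            pass

    return timeit.timeit(run, number=repeat)


if __name__ == "__main__":
    print("direct:    ", time_reader(read_direct))
    print("forwarding:", time_reader(read_forwarding))
```

The difference between the two numbers gives an estimate of the forwarding cost alone, separate from the cost of the if check.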
The best way to speed this up might be to recode the class in C as a
type. This would help because the if statement would be evaluated in
C, and also you can cache the self._handle.readline for a faster
function lookup.
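The lookup caching can also be tried in pure Python before committing to a C rewrite. The class below is a hypothetical stand-in, not the actual Bio.File.UndoHandle: it assumes the wrapped handle is never swapped out after construction, drops the *args/**keywds forwarding, and is written for Python 3.

```python
import io


class CachedUndoHandle:
    """Hypothetical pure-Python variant, not Bio.File.UndoHandle itself."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []
        # Cache the bound method once, so each call skips the
        # self._handle and .readline attribute lookups.
        self._readline = handle.readline

    def saveline(self, line):
        # Push a line back so the next readline() returns it again.
        if line:
            self._saved.insert(0, line)

    def readline(self):
        if self._saved:
            return self._saved.pop(0)
        return self._readline()


handle = CachedUndoHandle(io.StringIO("first\nsecond\n"))
line = handle.readline()
handle.saveline(line)  # undo the read; the next readline() returns it again
```

This keeps the if check in Python, so it addresses only the lookup part of the overhead. Whether the saving is noticeable would need to be measured.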
Jeff