[BioPython] Psyco JIT compiler for Python
Andrew Dalke
dalke@dalkescientific.com
Sun, 12 Jan 2003 21:38:54 -0700
There was a post to c.l.py by Tim Churches on using Psyco for boosting
the performance of HMMs. Psyco is a just-in-time compiler for Python.
The paper on that HMM work is at http://www.biomedcentral.com/1472-6947/2/9/ .
It says:
> HMM standardisation of one million address records on the PC platform
> took 14,061 seconds (234 minutes), or 5832 seconds (97 minutes) with the
> Psyco just-in-time Python compiler enabled [30].
This was interesting enough that I decided to give it a go myself against
the 'pairwise2' module. The script is at the end.
Here are the numbers I got:

  Pure Python:        7.3 seconds
  Pure with Psyco:    3.7 seconds
  with cpairwise2:   14.5 seconds
  Psyco cpairwise2:  12.5 seconds

Yes, the C implementation is slower than the Python one for this
case, by a factor of 2! I haven't figured out if it's because
of the parameters I chose or other reasons.
In any case, it does appear that Psyco gives about a factor of 2
speedup for the pure Python code, which agrees with what was found in
the HMM paper. The cpairwise2 run gains much less. Very nice.
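
For anyone who wants to check the ratios, here is a minimal sketch that
recomputes them from the timings quoted in this post. The dictionary keys
and the speedup() helper are names made up for illustration; only the
numbers come from the measurements and the paper.

```python
# Timings in seconds, copied from the figures quoted in this post.
timings = {
    "pure Python": 7.3,
    "pure Python + Psyco": 3.7,
    "cpairwise2": 14.5,
    "cpairwise2 + Psyco": 12.5,
    "HMM paper": 14061.0,
    "HMM paper + Psyco": 5832.0,
}


def speedup(before, after):
    """Return how many times shorter the `after` run is than `before`."""
    return before / after


python_gain = speedup(timings["pure Python"], timings["pure Python + Psyco"])
c_gain = speedup(timings["cpairwise2"], timings["cpairwise2 + Psyco"])
hmm_gain = speedup(timings["HMM paper"], timings["HMM paper + Psyco"])
# How much slower the C-accelerated run was than the pure Python run.
c_penalty = timings["cpairwise2"] / timings["pure Python"]

print("Psyco on pure Python: %.2fx" % python_gain)
print("Psyco on cpairwise2:  %.2fx" % c_gain)
print("Psyco in HMM paper:   %.2fx" % hmm_gain)
print("cpairwise2 vs pure:   %.2fx slower" % c_penalty)
```

So the pure Python and HMM cases both land near 2x, while the cpairwise2
run only improves by roughly a sixth.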
Andrew
dalke@dalkescientific.com
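
import random, time
from cStringIO import StringIO
from Bio import pairwise2
import psyco


def make_strings(n, freq):
    # Build two random DNA strings of length n.  On average, 1 position
    # in freq is drawn independently for each string; all the others
    # get the same letter in both.
    sio1 = StringIO()
    sio2 = StringIO()
    for i in range(n):
        if random.randrange(freq):
            c = random.choice("ATCG")
            sio1.write(c)
            sio2.write(c)
        else:
            sio1.write(random.choice("ATCG"))
            sio2.write(random.choice("ATCG"))
    return sio1.getvalue(), sio2.getvalue()


# length 300; 80% of positions forced identical, and the rest match
# by chance 1 time in 4, so about 85% identity overall
s1, s2 = make_strings(300, 5)

# Show which _make_score_matrix_fast implementation is in use
print pairwise2._make_score_matrix_fast

# Time one alignment without Psyco
t1 = time.clock()
pairwise2.align.globalxx(s1, s2)
t2 = time.clock()
print t2-t1

# Compile the pairwise2 functions with Psyco
psyco.bind(pairwise2._align)
psyco.bind(pairwise2._make_score_matrix_fast)
psyco.bind(pairwise2._make_score_matrix_generic)
psyco.bind(pairwise2._find_start)
psyco.bind(pairwise2._recover_alignments)

# Time the same alignment again with Psyco enabled
t1 = time.clock()
pairwise2.align.globalxx(s1, s2)
t2 = time.clock()
print t2-t1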
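
A note on the "80% similarity" comment in the script: with freq=5, four
positions in five are forced to match, and the remaining one matches by
chance a quarter of the time, so the expected identity is closer to
0.8 + 0.2 * 0.25 = 0.85. The sketch below checks that empirically. It is a
Python 3 port of make_strings() and needs neither Psyco nor Biopython. The
rng parameter, the identity() helper, the seed and the string length of
100000 are all assumptions added for illustration.

```python
import random
from io import StringIO


def make_strings(n, freq, rng):
    """Python 3 port of the generator from the script above.

    `rng` is a random.Random instance, added here (an assumption, not in
    the original) so the run is reproducible.
    """
    sio1, sio2 = StringIO(), StringIO()
    for _ in range(n):
        if rng.randrange(freq):
            # Non-zero draw: same letter goes into both strings.
            c = rng.choice("ATCG")
            sio1.write(c)
            sio2.write(c)
        else:
            # Zero draw (1 in freq): letters are chosen independently.
            sio1.write(rng.choice("ATCG"))
            sio2.write(rng.choice("ATCG"))
    return sio1.getvalue(), sio2.getvalue()


def identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # arbitrary fixed seed
s1, s2 = make_strings(100000, 5, rng)
observed = identity(s1, s2)
expected = 0.8 + 0.2 * 0.25

print("expected identity: %.3f" % expected)
print("observed identity: %.3f" % observed)
```

The observed value varies with the seed, but for strings this long it
should sit close to the expected 0.85.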