[BioPython] Psyco JIT compiler for Python

Andrew Dalke dalke@dalkescientific.com
Sun, 12 Jan 2003 21:38:54 -0700


There was a post to comp.lang.python by Tim Churches on using Psyco, a
just-in-time compiler for Python, to boost the performance of HMMs.
The paper describing that work is http://www.biomedcentral.com/1472-6947/2/9/ .

It says:
> HMM standardisation of one million address records on the PC platform
> took 14,061 seconds (234 minutes), or 5832 seconds (97 minutes) with the
> Psyco just-in-time Python compiler enabled [30].

This was interesting enough that I decided to give it a go myself against
the 'pairwise2' module.  The script is at the end.

Here are the numbers I got:
  Pure Python:       7.3 seconds
  Pure with Psyco:   3.7 seconds
  with cpairwise2:  14.5 seconds
  Psyco cpairwise2: 12.5 seconds

Yes, the C implementation is slower than the Python one for this
case, by a factor of 2!  I haven't figured out if it's because
of the parameters I chose or other reasons.

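In any case, it does appear that Psyco gives about a factor of 2
speedup for the pure Python code (7.3 vs. 3.7 seconds), which roughly
agrees with the factor of 2.4 found in the HMM paper.  The gain with
cpairwise2 is much smaller (14.5 vs. 12.5 seconds).  Very nice.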
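
For reference, the speedup factors behind that comparison can be worked
out directly from the timings quoted above.  This is only a small
arithmetic sketch; the dictionary labels and the speedup() helper are
names made up for illustration.

```python
# Timings in seconds, copied from the figures quoted above:
# (without Psyco, with Psyco).
timings = {
    "HMM paper (1M address records)": (14061.0, 5832.0),
    "pairwise2, pure Python": (7.3, 3.7),
    "pairwise2, with cpairwise2": (14.5, 12.5),
}

def speedup(before, after):
    """Return how many times faster the 'after' run is than 'before'."""
    return before / after

for name, (before, after) in sorted(timings.items()):
    print("%-32s %.2fx" % (name, speedup(before, after)))
```

The ratios come out at roughly 2.4 for the HMM paper, 2.0 for the pure
Python pairwise2 run, and 1.2 for the cpairwise2 run.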

					Andrew
					dalke@dalkescientific.com


import random, time
from cStringIO import StringIO

from Bio import pairwise2
import psyco

def make_strings(n, freq):
    sio1 = StringIO()
    sio2 = StringIO()
    for i in range(n):
        if random.randrange(freq):
            c = random.choice("ATCG")
            sio1.write(c)
            sio2.write(c)
        else:
            sio1.write(random.choice("ATCG"))
            sio2.write(random.choice("ATCG"))

    return sio1.getvalue(), sio2.getvalue()

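# length 300; 80% of positions are copied to both strings, and chance
# matches in the remaining 20% bring the expected identity to about 85%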
s1, s2 = make_strings(300, 5)

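# Show which implementation is in use; presumably a built-in function
# here means the C version from cpairwise2 was picked up.
print pairwise2._make_score_matrix_fast

# Baseline timing, before Psyco is bound to anything.
t1 = time.clock()
pairwise2.align.globalxx(s1, s2)
t2 = time.clock()
print t2-t1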

# Have Psyco compile the functions that do the alignment work.
psyco.bind(pairwise2._align)
psyco.bind(pairwise2._make_score_matrix_fast)
psyco.bind(pairwise2._make_score_matrix_generic)
psyco.bind(pairwise2._find_start)
psyco.bind(pairwise2._recover_alignments)

# Same alignment again, now with Psyco bound.
t1 = time.clock()
pairwise2.align.globalxx(s1, s2)
t2 = time.clock()
print t2-t1
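

A note on the test data: make_strings(300, 5) copies the same base to
both strings at 4 out of 5 positions and draws the two bases
independently at the rest, where they still match a quarter of the time.
The expected identity is therefore about 0.8 + 0.2 * 0.25 = 0.85.  The
sketch below checks that empirically.  It is not the script above: it
builds the strings with lists instead of cStringIO, takes a seeded
random.Random as an extra argument, and adds an identity() helper, all
of which are assumptions made for illustration.

```python
import random

def make_strings(n, freq, rng):
    # Same logic as the script: copy one base to both strings unless
    # randrange(freq) returns 0, in which case draw two independent bases.
    a, b = [], []
    for i in range(n):
        if rng.randrange(freq):
            c = rng.choice("ATCG")
            a.append(c)
            b.append(c)
        else:
            a.append(rng.choice("ATCG"))
            b.append(rng.choice("ATCG"))
    return "".join(a), "".join(b)

def identity(s1, s2):
    """Fraction of positions at which the two strings agree."""
    matches = sum(1 for x, y in zip(s1, s2) if x == y)
    return matches / float(len(s1))

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
s1, s2 = make_strings(100000, 5, rng)
expected = 0.8 + 0.2 * 0.25
print("observed %.3f, expected about %.3f" % (identity(s1, s2), expected))
```

A much longer string than the 300 used for the timings is generated here
only so that the observed fraction settles close to the expected value.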