[Bioperl-l] Implemented Miller-Myers alignment algorithm

Yee Man ymc at paxil.stanford.edu
Mon Apr 21 13:00:22 EDT 2003


Hi, folks,

	After another month of hacking around, I followed the Miller-Myers
paper (1994) and implemented the Hirschberg divide and conquer algorithm. 
This is a global alignment algorithm (ie Needleman-Wunsch) but it can be
extended to become Smith-Waterman if I calculate the scores to find the
ends.

	It turns out my program even outperforms Phil Green's
implementation. It is 10% faster and uses only one-tenth of memory.
You can download my code from http://www.stanford.edu/~yeeman/global.tgz.
Here is the performance comparison:

example I used:
t1.fa and t2.fa at http://www.stanford.edu/~yeeman/
Both are DNA sequences with about 9800bp

compiler for both ssearch34 and my program:
gcc -g

command line:
./ssearch34 -n -3 -q -H -f 3 -g 1 -d 1 -r "3/-1" -b 1 t1.fa t2.fa
./global t1.fa t2.fa 

ssearch34: 
13.5 sec to find score
1 min 9 sec to calculate local alignment
If it takes two score calculation to find the end points of
lcoal alignment, the global alignment time will be 42 sec.
Uses 15MB of RAM

My Miller-Myers implementation:
38 sec to calculate alignment 
Uses 1.3MB of RAM

	It seems to me my program is 10% faster than Phil Green's
algorithm. But based on my understanding, I didn't even put any heuristics
into it like Phil Green. So far, my program seems to work alright with my
10 or so examples. Can someone also try it out and see how it performs?

	Ewan: If you like my program, I can modify it to handle protein as
well.

Thanks
Yee Man



More information about the Bioperl-l mailing list