[Bioperl-l] DNA Smith-Waterman?

Yee Man ymc@paxil.stanford.edu
Sun, 8 Dec 2002 10:53:02 -0800 (PST)


Thanks for your reply, Ewan.

> I have in my Wise2 package (the commandline program is dnal, look at
> www.ebi.ac.uk/Wise2) but have not bothered to link it into the XS system
> into Perl, although alot of the stubs are there. 
> 

BTW, what kind of optimization did you do beyond the vanilla
Smith-Waterman?

> There a series of different improvements to smith waterman you can do some
> of which are published, and some are just "knowledge" swapped between
> different practioners; some of the best people at this are Phil Green
> (wrote SWAT, which has some improvemnents there) and Guy Slater (in my
> group at the EBI); but there is a large number of people with different
> algorithmical tricks for speed - to be honest, I am not sure what Gotoh's
> Improvement is.
> 

Thanks for pointing out different places for optimizing the algorithm.
I will check them out later. 

The original vanilla Smith-Waterman as described in the 1981 paper
has a time complexity O(m*m*n). Gotoh's 1982 paper showed how to
reduce the gap insertion calculation to a recurrence relation such
that you can reduce the time complexity to O(m*n). You can find his
paper and other classic papers here:

http://poweredge.stanford.edu/BioinformaticsArchive/ClassicArticlesArchive/gotoh1982.pdf

> 
> To be more honest, once you have written a piece of code with sensible
> memory layout then you will only be changing the runtime speed by a factor
> of 0.9 or 0.8 at best; to tackle more problems in less time you need to
> move to heuristics such as BLAST, exonerate, BLAT or SSAHA which all use
> different seed-extension-alignment strategies. There is again alot of
> distributed knowledge in this area - you are more than welcome to get
> stuck in and try some ideas out, but I'd definitely suggest you check out
> some of these programs first off.
> 

We have been using BLAST and BLAT extensively already. I still couldn't
figure out why I can only get fragmented alignment with SSAHA though. 
But sometimes when we have idle CPUs and we would like to do the right
thing (ie SW), we want to run SW in some cases.

> 
> If you would like to do this, great! Come up with a pretty sensible
> directory structure that could be worked into bioperl-ext (ie, a standard
> Perl XS extension) and model the Bioperl bridge code after pSW; then tar
> it up and I would be happy to take a look at it for inclusion.
> 

Cool! I will try it out in my spare time then. :)

Best regards,
Yee Man