[Bioperl-l] An update for my DNA Smith-Waterman code

Yee Man ymc at paxil.stanford.edu
Mon Jan 27 10:01:00 EST 2003


Hi Aaron,

	I was reading dropgsw.c. At the beginning, it says

/* the shortcuts used in this program prevent it from calculating scores
   that are less than the gap penalty for the first residue in a gap. As
   a result this code cannot be used with very large gap penalties, or
   with very short sequences, and probably should not be used with prss3.
*/

	It seems to me this means Phil Green's implementation is technically
not strictly correct. And as I reported before, ssearch also complains
when it aligns those two sequences with its default scoring, so maybe
there is a place for my implementations?

	Also, I found that ssearch34 uses 17MB to align two 10,000bp
sequences. This is only half of what my True Gotoh implementation uses
(35MB). I tried two other sequences of 15kbp. ssearch34's memory usage
only goes up to 19MB, whereas my program goes up to 82MB, so I guess
ssearch34 really is a linear-space algorithm, but it probably uses quite a
lot of memory for something else.
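
	The numbers fit that reading: scaling 35MB by (15/10)^2 gives about
79MB, close to the 82MB I see, which is what a full quadratic table
predicts, while 17MB to 19MB is nearly flat. As a rough illustration of
why the score alone needs only linear space, here is a minimal Python
sketch of the Gotoh affine-gap recurrences that keeps one row at a time.
This is not ssearch34's code or mine; the function name is made up, the
defaults are taken from the command line quoted below, and it assumes a
gap of length k costs first + (k-1)*ext.

```python
NEG_INF = float("-inf")

def sw_affine_score(a, b, match=3, mismatch=-1, first=3, ext=1):
    """Best local alignment score with affine gaps, using O(len(b)) memory.

    Hypothetical helper for illustration only. Assumes a gap of length k
    costs first + (k - 1) * ext.
    """
    n = len(b)
    h_prev = [0] * (n + 1)        # H values of the previous row
    f = [NEG_INF] * (n + 1)       # best score ending in a vertical gap, per column
    best = 0
    for ch in a:
        h_cur = [0] * (n + 1)     # H values of the current row
        e = NEG_INF               # best score ending in a horizontal gap
        for j in range(1, n + 1):
            # Extend an existing gap or open a new one from the neighbouring cell.
            e = max(e - ext, h_cur[j - 1] - first)
            f[j] = max(f[j] - ext, h_prev[j] - first)
            diag = h_prev[j - 1] + (match if ch == b[j - 1] else mismatch)
            h = max(0, diag, e, f[j])   # local alignment: never drop below zero
            h_cur[j] = h
            if h > best:
                best = h
        h_prev = h_cur            # the older row is no longer needed
    return best

if __name__ == "__main__":
    print(sw_affine_score("AAAAACCCCC", "AAAAAGGCCCCC"))
```

	This only returns the score. Recovering the alignment itself in
linear space takes an extra divide-and-conquer step (Hirschberg / Myers and
Miller style), which the sketch leaves out.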

Regards,
Yee Man 

On Sun, 26 Jan 2003, Aaron J Mackey wrote:

> 
> On Sun, 26 Jan 2003, Yee Man wrote:
> 
> > Finally got it running, but I have to use my own scoring
> >
> > ssearch34 -n -q -H -d 1 -b 1 -m 0 -f 3 -g 1 -r +3/-1 t1.fa t2.fa
> >
> > This takes 50 sec on my machine, but it is still faster than my fastest
> > implementation (1 min 3 sec). However, if I use the default scoring, it
> > gives me the E() error and refuses to align. :(
> 
> Is that 50 seconds for the entire program runtime, or just for the
> alignment calculation? (Remember that it calculates a score first, and then
> starts over when calculating the alignment - this is a database search
> program, after all.)  Run it with -d 0 and subtract that number from 50.
> 
> -Aaron
> 
> -- 
>  Aaron J Mackey
>  Pearson Laboratory
>  University of Virginia
>  (434) 924-2821
>  amackey at virginia.edu
> 
> 


