Bioperl: Re: Bio::Tools::Blast

Gordon D. Pusch pusch@mcs.anl.gov
Thu, 27 Aug 1998 13:17:43 -0500


> -re- non-redundant database-builder, Gordon D. Pusch wrote,
> > Can anyone suggest a more elegant algorithm than the 
> > ``stupid-but-simple'' method outlined above ???
>
> As a last resort, I would look into suffix trees, which are very
> nice for such tasks, and have been used in connection w/ the yeast
> database at the MIPS in Munich.

Ummm... Actually, that was my =first= resort (sort of)... :-/

I've built a Berkeley-DB of what we call ``tail tags,'' which is a hash
of lists of IDs keyed by the last 20 aa of each sequence; we use these
for a number of different ``quick lookup'' purposes.
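
For readers who want to picture the structure, here is a minimal in-memory sketch of such a tail-tag index. It is not the Berkeley-DB code mentioned above: the function name `build_tail_tag_index`, the sample IDs, and the sample sequences are all hypothetical, and a plain Python dict stands in for the on-disk hash. Only the 20-residue tag length comes from the post.

```python
from collections import defaultdict

TAG_LENGTH = 20  # tail length in residues, as stated in the post


def build_tail_tag_index(sequences, tag_length=TAG_LENGTH):
    """Map each tail tag to the list of sequence IDs that end with it.

    `sequences` is a dict of {sequence_id: residue_string} (hypothetical
    input format; the original data lives in a Berkeley-DB).
    """
    index = defaultdict(list)
    for seq_id, residues in sequences.items():
        # Assumption: a sequence shorter than tag_length is keyed by its
        # whole length, since slicing simply returns the full string.
        tag = residues[-tag_length:]
        index[tag].append(seq_id)
    return dict(index)


# Made-up example data: seqA and seqB share the same last 20 residues.
example_sequences = {
    "seqA": "MKT" + "ACDEFGHIKLMNPQRSTVWY",
    "seqB": "ACDEFGHIKLMNPQRSTVWY",
    "seqC": "MKTAYIAKQRQISFVKSHFSRQ",
}

tail_tag_index = build_tail_tag_index(example_sequences)
```

Looking up a tag then returns every ID whose sequence ends with it, which is what makes the structure handy for quick lookups.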

However, one needs to do a substantial amount of processing to boil the
lists of ``same tail-tag IDs'' down to a non-redundant set of sequences,
and there are some peculiarities in the output of my code that cause me
to suspect bugs in my reduction algorithm; hence, my desire to find
something simpler and more elegant...
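
One simple way to do such a reduction is sketched below. It rests on an assumption that the post does not state: that a sequence counts as "redundant" when it is identical to, or a proper suffix of, another sequence sharing its tail tag. The function names `reduce_group` and `non_redundant_ids` and the sample data are hypothetical, and this is not the reduction algorithm referred to above.

```python
def reduce_group(ids, sequences):
    """Return the IDs in one tail-tag group that are not redundant.

    Assumed definition: an ID is redundant if its sequence is identical
    to, or a suffix of, the sequence of an ID that has already been kept.
    """
    # Visit longest sequences first so that any fragment is compared
    # against every sequence that could contain it; ties are broken by
    # ID so the result is deterministic.
    ordered = sorted(ids, key=lambda i: (-len(sequences[i]), i))
    kept = []
    for seq_id in ordered:
        seq = sequences[seq_id]
        if not any(sequences[k].endswith(seq) for k in kept):
            kept.append(seq_id)
    return kept


def non_redundant_ids(index, sequences):
    """Apply reduce_group to every tail-tag group and merge the results."""
    result = []
    for ids in index.values():
        result.extend(reduce_group(ids, sequences))
    return sorted(result)


# Made-up example: one tail-tag group with a duplicate and a fragment.
TAIL = "ACDEFGHIKLMNPQRSTVWY"
sequences = {
    "full": "MKT" + TAIL,
    "dup": "MKT" + TAIL,    # identical to "full"
    "frag": TAIL,           # proper suffix of "full"
    "other": "GGG" + TAIL,  # same tail, different sequence
}
index = {TAIL: ["full", "frag", "dup", "other"]}

survivors = non_redundant_ids(index, sequences)
```

Each group costs a quadratic number of suffix comparisons in the worst case, which is tolerable only because a tail-tag group is usually small; a suffix tree, as suggested in the quoted message, would avoid that cost.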


--  Gordon D. Pusch   <pusch@mcs.anl.gov>

Disclaimer:  I'm a consultant collaborating with Argonne researchers;
I don't speak for ANL or the DOE --- and they *certainly* don't speak
for =ME= !!!

Claimer:  I report =ALL= SPAMvertisers to their ISP --- =NO= exceptions !!!

=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl.html
====================================================================