Bioperl: module for sequence content analysis?

Gatherer, D. (Derek) D.Gatherer@organon.nhe.akzonobel.nl
Mon, 17 Jan 2000 09:14:21 +0100


>> Is there a Bioperl module that has a method returning the codon frequency
>> table, or amino acid frequency table, or some other kind of hashed content
>> data for Bio::Seq objects?  I am most interested in overlapping triplet
>> frequencies in large FASTA-formatted databases.

>Not that I know of. If you write something, please contribute...

Yes, I hope I can.  Reading the Seq.pm POD over the weekend made me wonder
whether the best way of doing it might be to add a method to Seq.pm.
The translate function already does much of the spadework: it checks the
alphabet and then sets up the loop over the sequence.  It might just be a
matter of modifying the loop so that it advances one base at a time (++)
instead of jumping a codon at a time (+=3), and then tallying the triplets
in a hash instead of converting them to amino acids.
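
Roughly what I have in mind, as a standalone sketch rather than a patch to
Seq.pm.  The sub name overlapping_triplet_counts is hypothetical (it is not
a Bioperl method), it takes a plain string rather than a Bio::Seq object,
and it assumes a DNA alphabet of A, C, G and T only:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bioperl: count overlapping triplets
# in a plain sequence string by sliding the window one base at a time.
sub overlapping_triplet_counts {
    my ($seq) = @_;
    $seq = uc $seq;
    my %count;
    for (my $i = 0; $i <= length($seq) - 3; $i++) {
        my $triplet = substr($seq, $i, 3);
        # Assumption: skip any window containing a non-ACGT symbol.
        next unless $triplet =~ /^[ACGT]{3}$/;
        $count{$triplet}++;
    }
    return \%count;
}

# Print a tab-separated table of triplet counts for a toy sequence.
my $counts = overlapping_triplet_counts('ATGATGA');
for my $triplet (sort keys %$counts) {
    print "$triplet\t$counts->{$triplet}\n";
}
```

For a large FASTA database the same hash could simply be accumulated across
records instead of being created fresh for each sequence.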

Is there perhaps a more parsimonious way of doing it?  If there is, I
haven't spotted it.  One possibility: I was thinking of doing this as a
separate function similar to translate, but the two may share enough
code to make it an option _within_ translate.  That is perhaps quite
sneaky, and I haven't thought it through properly.
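
To make the code-sharing idea concrete, here is one way the common loop
might be factored out.  This is only a sketch under my own assumptions:
each_triplet is a hypothetical name, it is not how translate is actually
written, and the step size and callback are illustrative parameters.

```perl
use strict;
use warnings;

# Hypothetical shared loop: visit each triplet, advancing by $step bases
# (3 for reading-frame codons, 1 for overlapping triplets), and hand each
# triplet to a caller-supplied callback.
sub each_triplet {
    my ($seq, $step, $callback) = @_;
    die "step must be a positive integer\n" unless $step =~ /^[1-9]\d*$/;
    for (my $i = 0; $i <= length($seq) - 3; $i += $step) {
        $callback->(substr($seq, $i, 3));
    }
    return;
}

# The same loop serves both uses; only the step and the callback differ.
my (%codons, %overlapping);
each_triplet('ATGATGA', 3, sub { $codons{ $_[0] }++ });
each_triplet('ATGATGA', 1, sub { $overlapping{ $_[0] }++ });
```

A translation routine could pass a callback that looks each codon up in a
codon table, while the counting routine passes one that increments a hash.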

Also, should this thread transfer to perl-guts?

Derek Gatherer
Target Discovery
Organon Laboratories Ltd
Newhouse
Scotland ML1 5SH