[Bioperl-l] Count or weight matrix in bioperl?

Sam Al-Droubi saldroubi at yahoo.com
Fri Feb 17 17:49:40 UTC 2006


Torsten and all,
 
 I don't think this will work for me, because it only generates statistics for a single sequence.  What I need is a count matrix computed over a set of DNA sequences.  In other words, if I pass 3 sequences to the function, it should return the count of each nucleotide at each position.
 
 For example, if I pass an array of sequences, say ATC, CCC, TTT,
 then I should get back a matrix holding the count of each of A, C, T and G at positions 1, 2 and 3, like this:
 
 
           1    2    3
      A    1    0    0
      C    1    1    2
      T    1    2    1
      G    0    0    0
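
 To make the request concrete, the computation behind this table can be sketched in a few lines. This is plain Python rather than BioPerl, and it is only an illustration of the desired result: the function name count_matrix and its alphabet parameter are made up for this example and are not part of any BioPerl module. It assumes all sequences have the same length and ignores any symbol outside A, C, T, G.

```python
def count_matrix(seqs, alphabet="ACTG"):
    """Return {base: [count at position 1, ..., count at position N]}.

    Hypothetical helper for illustration only; not a BioPerl API.
    """
    if not seqs:
        return {base: [] for base in alphabet}

    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    # One row per nucleotide, one column per position, all counts start at 0.
    matrix = {base: [0] * length for base in alphabet}

    for seq in seqs:
        for pos, base in enumerate(seq.upper()):
            if base in matrix:  # skip symbols outside the alphabet, e.g. N
                matrix[base][pos] += 1
    return matrix


matrix = count_matrix(["ATC", "CCC", "TTT"])
for base in "ACTG":
    print(base, *matrix[base])
```

 For the three example sequences this prints the same rows as the table above (A: 1 0 0, C: 1 1 2, T: 1 2 1, G: 0 0 0).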
 
 Any idea if this is already built somewhere in bioperl?
 
 Thank you.
 
 
 Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Say I have an array of nucleotide sequences of length N. I want to calculate the count matrix (weight matrix). That is, for each position 1..N, I want to know how many As, Cs, Ts and Gs there are. Is the code to build this matrix already written in bioperl, if I pass it those strings?
> Please excuse my lack of knowledge, as I am a newcomer to bioinformatics.

Use the Bio::Tools::SeqStats module. The PDoc documentation even has an 
example similar to what you want to do:

http://doc.bioperl.org/releases/bioperl-1.5.0-RC1/Bio/Tools/SeqStats.html

--Torsten Seemann




Sincerely, 
Sam Al-Droubi, M.S.
saldroubi at yahoo.com


