[Bioperl-l] Count or weight matrix in bioperl?

Cook, Malcolm MEC at stowers-institute.org
Fri Feb 17 18:15:53 UTC 2006


http://forkhead.cgb.ki.se/TFBS/ provides the ability to generate a position
frequency matrix from a list of (presumably aligned) sequences, as follows:

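#!/usr/bin/env perl
use strict;
use warnings;
use TFBS::PatternGen::SimplePFM;

# Read one aligned sequence per line from STDIN (or from files named on the command line).
my @sequences = <>;
chomp @sequences;

# Build a simple position frequency matrix and print its raw counts.
print TFBS::PatternGen::SimplePFM->new(-seq_list => \@sequences)->pattern->rawprint;
exit 0;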

The output when run on your example input shows that the order of the
nucleotides is not what you expect (the rows are alphabetical: A, C, G, T):

1 0 0
1 1 2
0 0 0
1 2 1

Good luck,

Note that TFBS installation requires significant dependencies, including
bioperl and PDL.
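
If those dependencies are a barrier, the same per-position counts can be
sketched in plain Perl with no modules beyond the core. This is only a rough
illustration, not TFBS or bioperl code: count_matrix is a hypothetical helper
name, and it assumes the sequences are already aligned, all the same length,
and that any character other than A, C, G, or T should simply be ignored.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# count_matrix is a hypothetical helper (not part of TFBS or bioperl).
# Input:  a list of equal-length, pre-aligned DNA strings.
# Output: a hash ref mapping each nucleotide to an array ref of
#         per-position counts.
sub count_matrix {
    my @seqs = @_;
    my $len  = @seqs ? length $seqs[0] : 0;

    # Start every nucleotide with a row of zeros, one per position.
    my %count = map { $_ => [ (0) x $len ] } qw(A C G T);

    for my $seq (@seqs) {
        die "Sequences differ in length: $seq\n" if length($seq) != $len;
        my @bases = split //, uc $seq;
        for my $i ( 0 .. $#bases ) {
            # Assumption: skip anything that is not A, C, G, or T (N, gaps, ...).
            next unless exists $count{ $bases[$i] };
            $count{ $bases[$i] }[$i]++;
        }
    }
    return \%count;
}

# The example input from this thread, printed one row per nucleotide.
my $matrix = count_matrix(qw(ATC CCC TTT));
print join( ' ', $_, @{ $matrix->{$_} } ), "\n" for qw(A C G T);
```

For the example input this prints the rows "A 1 0 0", "C 1 1 2", "G 0 0 0"
and "T 1 2 1". To get the A, C, T, G row order from the original question,
change the list in the final print statement.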

Malcolm Cook 

>-----Original Message-----
>From: bioperl-l-bounces at lists.open-bio.org 
>[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sam 
>Al-Droubi
>Sent: Friday, February 17, 2006 11:50 AM
>To: Torsten Seemann
>Cc: BioPerl list
>Subject: Re: [Bioperl-l] Count or weight matrix in bioperl?
>
>
>Torsten and all,
> 
> I don't think this will work for me, because it only generates 
>statistics for a single sequence.  What I need is a count 
>matrix for each position for a number of DNA sequences.  In 
>other words, if I pass 3 sequences to this function, then 
>it returns the count for each position for each nucleotide.
> 
> For example, if I pass an array of sequences, say ATC,CCC,TTT,
> then I should get a matrix back that has the counts at 
>positions 1,2,3 for each of A, C, T, and G, like this:
> 
> 
>                 1    2   3
>      A        1    0    0
>      C        1    1    2
>      T        1    2    1     
>      G        0    0    0
> 
> Any idea if this is already built somewhere in bioperl?
> 
> Thank you.
> 
> 
> Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
>> Say I have an array of nucleotide sequences of length N. I want
>> to calculate the count matrix (weight matrix). That is, for each
>> position 1..N, I want to know how many As, Cs, Ts and Gs there
>> are. Is the code to do this already written in bioperl to build
>> this matrix if I pass it those strings?
>> Please excuse my lack of knowledge, as I am a newcomer to
>> bioinformatics.
>
>Use the Bio::Tools::SeqStats module. The PDoc documentation even has
>an example similar to what you want to do:
>
>http://doc.bioperl.org/releases/bioperl-1.5.0-RC1/Bio/Tools/SeqStats.html
>
>--Torsten Seemann
>
>
>
>
>Sincerely, 
>Sam Al-Droubi, M.S.
>saldroubi at yahoo.com



