[BioPython] The count method of a Seq (or MutableSeq) object

Bruce Southey bsouthey at gmail.com
Fri Mar 6 15:34:42 UTC 2009


Peter wrote:
> On Fri, Mar 6, 2009 at 3:06 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>   
>> I have already given one use case where overlapping counts is totally
>> inappropriate! Unique codon counting is extremely important in many areas
>> including gene prediction (possible splicing sites) and molecular evolution
>> (like codon usage).
>>     
>
> For codon counting NEITHER the current non-overlapping count nor the
> suggested overlapping count would be suitable.  So this doesn't really
> affect the overlapping versus non-overlapping debate.
>
> Peter
>   
With due respect, this does not make any sense.

If it is a cDNA, then I can count, say, the different lysine codons to find
any usage bias using seq.count('AAA') /
(seq.count('AAA') + seq.count('AAG')). (Actually, I am more interested in
the occurrence of specific multiple codons than single codons.)
If you want the forward frames, then just use seq[0:].count('AAA'),
seq[1:].count('AAA') and seq[2:].count('AAA') for frames 1, 2, and 3,
respectively.
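
To make the expressions above concrete, here is a minimal sketch. It uses a plain Python string in place of a Seq object, on the assumption that Seq.count follows the same non-overlapping behaviour as str.count. The sequence is made up for illustration and is not real data.

```python
# Hypothetical cDNA-like sequence (made up for illustration):
# ATG AAA AAG AAA TAA
seq = "ATGAAAAAGAAATAA"

# Non-overlapping counts of the two lysine codons, as str.count does.
# Assumption: Seq.count behaves the same way as str.count here.
aaa = seq.count("AAA")
aag = seq.count("AAG")

# Fraction of lysine codons that are AAA (usage bias).
aaa_fraction = aaa / (aaa + aag)

# Counts starting from each of the three forward offsets
# (frames 1, 2, and 3 in the wording above).
frame_counts = [seq[offset:].count("AAA") for offset in range(3)]

print(aaa, aag, aaa_fraction, frame_counts)
```

Note that count scans from every position in the slice, so a match is not tied to a codon boundary; the slices only change where the scan starts.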

As you pointed out, single characters are not relevant, so what is relevant?

Bruce


