[BioPython] How to check codon usage for specific amino acid positions in a given set of CDS sequences
Animesh Agrawal
animesh.agrawal at anu.edu.au
Thu Jan 15 08:21:00 EST 2009
Hi Marco,
My apologies. I probably didn't make myself very clear in my last mail. I have a protein that is about 475 amino acids long and is highly conserved (over 95%) among different organisms. I have downloaded its CDS (coding sequence).
I would like to calculate the codon usage frequency at important amino acid positions, as you put it very nicely in your reply:
"for a particular aminoacid position (e.g. the first, or the third,or the last) the codon usage for those aminoacids that are coded by more than a possible codon (e.g. Ala) the frequency with which every codon is used?"
For example, in a set of four sequences:

       1    2    3
       Ala  Gly  Ile
Seq1   GCT  GGT  ATT
Seq2   GCC  GGC  ATC
Seq3   GCA  GGA  ATA
Seq4   GCG  GGG  ATT
For the first amino acid position, Ala (which is coded by 4 codons), each codon is used once in the 4 sequences, which gives a frequency of 0.25 for each codon. For the third amino acid position, Ile (which is coded by 3 codons), ATT has a frequency of 0.5, while the other two codons each have a frequency of 0.25.
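The calculation I have in mind could be sketched roughly as below. This is only a minimal illustration using the Python standard library, not an existing Biopython function: the helper name codon_frequencies is made up, and it assumes the CDS strings are already in frame, gap-free, and of equal length, so that codon N of every sequence corresponds to amino acid position N. Position 2 is written with the glycine codons (GGT, GGC, GGA, GGG), since GCN codons encode Ala.

```python
from collections import Counter

# Toy data set from the example: three codons per sequence (Ala, Gly, Ile).
# Assumption: sequences are in frame and aligned without gaps.
sequences = {
    "Seq1": "GCTGGTATT",
    "Seq2": "GCCGGCATC",
    "Seq3": "GCAGGAATA",
    "Seq4": "GCGGGGATT",
}


def codon_frequencies(seqs, position):
    """Return {codon: frequency} at a 1-based amino acid position.

    Hypothetical helper for illustration; `seqs` is any iterable of
    nucleotide strings.
    """
    start = (position - 1) * 3
    # Slice out the codon covering this position in every sequence.
    codons = [str(s)[start:start + 3].upper() for s in seqs]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    for pos in (1, 3):
        print(pos, codon_frequencies(sequences.values(), pos))
```

For real data the strings could come from sequence records read with Bio.SeqIO, converted with str(record.seq), provided the sequences have first been aligned codon-wise so that the positions match up.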
Cheers,
Animesh
----- Original Message -----
From: Giovanni Marco Dall'Olio <dalloliogm at fastwebnet.it>
Date: Thursday, January 15, 2009 10:45 pm
Subject: Re: [BioPython] How to check codon usage for specific amino acid positions in a given set of CDS sequences
To: Animesh Agrawal <animesh.agrawal at anu.edu.au>
Cc: biopython at lists.open-bio.org
> On Thu, Jan 15, 2009 at 10:21 AM, Animesh Agrawal
> <animesh.agrawal at anu.edu.au> wrote:
> > Hi,
> >
> > I have been trying to write a python script to do the codon wise alignment
> > of given nucleotide sequences.
>
> Note that there are many tools that already do a 'codon wise'
> alignment, if it is what I think you mean by it.
> I think t-coffee does this. It is always better to use a tool that
> already exists rather than develop a new one, if you can, because
> otherwise your results will be difficult to compare with other
> experiments.
>
>
> > I have downloaded CDS sequences (by a script
> > found on biopython mailing list) from genbank for a particular protein and
> > now would like to check codon usage for few specific amino acid positions.
>
> Can you provide a better example of what you want to obtain?
> Do you want to know:
> - for a particular aminoacid position (e.g. the first, or the third,
> or the last) the codon usage in a set of sequences?
> - for those aminoacids that are coded by more than a possible codon
> (e.g. Ala) the frequency with which every codon is used?
> - the frequency at which every possible codon is used, in general.
>
> If I can give you some advice, I would spend some time developing a
> test case first. For example, create a fake sequence and calculate the
> output that you expect from your experiment.
> It is a lot easier to describe your experiment to other people if you
> can provide the test cases you are using; it will be easier to
> understand what you want to do.
>
>
> > Could you please provide me few pointers on how to do that. I also want to
> > take this opportunity to thank you guys for excellent work on biopython
> > documentation. I am new to python, but I am able to use cookbook/tutorial
> > example for my work with relative ease.
> >
> > Cheers,
> >
> > Animesh Agrawal
> >
> > PhD Scholar
> >
> > Proteomics & Therapy Design Group
> >
> > Division of Molecular Biosciences
> >
> > The John Curtin School of Medical Research
> >
> > The Australian National University
> >
> > P.O. Box 334
> >
> > Canberra ACT 2601
> >
> > AUSTRALIA
> >
> > T: +61 2 6125 8303
> >
> >
> >
> > _______________________________________________
> > BioPython mailing list - BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> >
>
>
>
> --
>
> My blog on bioinformatics (now in English): http://bioinfoblog.it