[BioPython] How to check codon usage for specific amino acid positions in a given set of CDS sequences

Giovanni Marco Dall'Olio dalloliogm at fastwebnet.it
Thu Jan 15 11:45:18 UTC 2009


On Thu, Jan 15, 2009 at 10:21 AM, Animesh Agrawal
<animesh.agrawal at anu.edu.au> wrote:
> Hi,
>
> I have been trying to write a python script to do the codon wise alignment
> of given nucleotide sequences.

Note that many tools already do a 'codon-wise' alignment, if that is
what I think you mean by it.
I think t-coffee does this. It is usually better to use an existing
tool than to develop a new one, if you can, because otherwise your
results will be difficult to compare with other experiments.


> I have downloaded CDS sequences (by a script
> found on biopython mailing list) from genbank for a particular protein and
> now would like to check codon usage for few specific amino acid positions.

Can you provide a clearer example of what you want to obtain?
Do you want to know:
- for a particular amino acid position (e.g. the first, the third,
or the last), the codon usage in a set of sequences?
- for those amino acids that are coded by more than one codon
(e.g. Ala), the frequency with which each codon is used?
- the frequency with which every possible codon is used, in general?
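
To make the first and third options concrete, here is a rough sketch in
plain Python. It is not a Biopython API: the function names are made up
for illustration, and it assumes that every CDS is an ungapped string
starting in frame, and that the same position number refers to the same
residue in every sequence (which is only true if the sequences are
aligned or have the same length).

```python
from collections import Counter


def codons(cds):
    """Split a CDS string into codons, dropping any trailing partial codon."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def codon_usage_at_position(sequences, position):
    """Count the codons found at a 1-based amino acid position.

    Hypothetical helper: sequences shorter than `position` are skipped.
    """
    counts = Counter()
    for cds in sequences:
        triplets = codons(cds)
        if position <= len(triplets):
            counts[triplets[position - 1]] += 1
    return counts


def overall_codon_usage(sequences):
    """Count every codon in every sequence, regardless of position."""
    counts = Counter()
    for cds in sequences:
        counts.update(codons(cds))
    return counts


# Made-up sequences, only to show the calls.
example = ["ATGGCTAAA", "ATGGCCAAA", "ATGGCTAAG"]
print(codon_usage_at_position(example, 2))
print(overall_codon_usage(example))
```

If the sequences come from a file, they would first have to be read
into plain strings; the sketch leaves that step out on purpose.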

If I can give you some advice, I would spend some time developing a
test case first. For example, create a fake sequence and work out by
hand the output that you expect from your experiment.
It is a lot easier to describe your experiment to other people if you
can provide the test cases you are using, and it will be easier for
them to understand what you want to do.
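
As an illustration of such a test case, here is a minimal sketch for the
second option above (frequency of each codon for Ala). The function name
and the fake sequences are invented for the example, and the expected
values were worked out by hand from those sequences; it assumes ungapped,
in-frame CDS strings.

```python
from collections import Counter

# The four alanine codons in the standard genetic code.
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}


def ala_codon_frequencies(sequences):
    """Return the relative frequency of each Ala codon seen in the sequences.

    Hypothetical helper: reads each sequence in frame from its first base.
    """
    counts = Counter()
    for cds in sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in ALA_CODONS:
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def test_ala_codon_frequencies():
    # Fake sequences: ATG GCT GCT TAA and ATG GCC GCT TAA.
    # Ala codons by hand: GCT three times, GCC once.
    fake = ["ATGGCTGCTTAA", "ATGGCCGCTTAA"]
    assert ala_codon_frequencies(fake) == {"GCT": 0.75, "GCC": 0.25}


test_ala_codon_frequencies()
```

Writing the fake sequences and the expected numbers first also gives
other people something exact to check their understanding against.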


> Could you please provide me few pointers on how to do that. I also want to
> take this opportunity to thank you guys for excellent work on biopython
> documentation. I am new to python, but I am able to use cookbook/tutorial
> example for my work with relative ease.
>
> Cheers,
>
> Animesh Agrawal
>
> PhD Scholar
>
> Proteomics & Therapy Design Group
>
> Division of Molecular Biosciences
>
> The John Curtin School of Medical Research
>
> The Australian National University
>
> P.O. Box 334
>
> Canberra ACT 2601
>
> AUSTRALIA
>
> T: +61 2 6125 8303
>
>
>
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>



-- 

My blog on bioinformatics (now in English): http://bioinfoblog.it


