[Biopython] identify triplet sequences

George Devaniranjan devaniranjan at gmail.com
Wed Jun 29 16:15:17 UTC 2011


Hi,

I'm not sure whether this is a Python or a Biopython question, but suggestions
are most welcome.

I have some FASTA sequences, like:
AAAAWWWHHHHH
TTTYYYYYHGGGG
NNNNNGGGGFFFF

From each sequence I extract overlapping triplets with a sliding window:
residues 1/2/3 form the first triplet, 2/3/4 the next, 3/4/5 the one after
that, etc.
So for the first sequence given above:
AAA
AAA
AAW
AWW
...
and so on.
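
To make the sliding window concrete, here is a minimal sketch in plain Python (no Biopython needed). The function name `triplets` and its `size` parameter are hypothetical names chosen for illustration only.

```python
def triplets(seq, size=3):
    """Return every overlapping window of `size` residues, in order.

    `triplets` is an illustrative helper, not a Biopython function.
    """
    # A sequence of length n has n - size + 1 windows (none if n < size).
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


windows = triplets("AAAAWWWHHHHH")
print(windows[:4])  # ['AAA', 'AAA', 'AAW', 'AWW']
```

A 12-residue sequence gives 10 overlapping triplets this way.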

Now my question: for 20 amino acids there will be 8000 possible unique
triplets (20^3).
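
As a quick check of the 20^3 figure, the full set of possible triplets can be enumerated with `itertools.product`. This is only a sketch: the constant `AMINO_ACIDS` is an assumed alphabet (the 20 standard one-letter codes), and real FASTA files may also contain letters such as X or B.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered triplet over the alphabet, e.g. 'AAA', 'AAC', ..., 'YYY'.
all_triplets = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]

print(len(all_triplets))  # 8000
```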

How can I classify them using Python/Biopython and write them out to 8000
unique text files? Is there a way to classify them without writing 8000
if/elif statements?
I want to see which triplets have the highest occurrence.
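
One possible way to avoid the long if/elif chain is to use each triplet string as a dictionary key, so the classification happens by lookup. The sketch below uses `collections.Counter` from the standard library; the helper `triplets` and the variable names are hypothetical, and the sequences are typed in directly rather than read from a FASTA file.

```python
from collections import Counter


def triplets(seq, size=3):
    """Illustrative helper: all overlapping windows of `size` residues."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


# The three example sequences from this post.
sequences = ["AAAAWWWHHHHH", "TTTYYYYYHGGGG", "NNNNNGGGGFFFF"]

# The triplet itself is the key, so no per-triplet branching is needed.
counts = Counter()
for seq in sequences:
    counts.update(triplets(seq))

# Most frequent triplet and its count.
print(counts.most_common(1))  # [('GGG', 4)]
```

The same keyed-by-triplet idea could map each triplet to its own output file name (for example `triplet + ".txt"`), and only triplets that actually occur would need a file.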

Thank you.


