[Biopython] Sequence alignment with multiple proteins
Cymon Cox
cy at cymon.org
Thu May 14 17:29:47 UTC 2009
Hi Michael,
2009/5/14 Fahy, Michael <fahy at chapman.edu>
> This is not strictly a BioPython question but I'm using BioPython for
> the work.
>
> I have a set of 45 proteins and 10 species. I have a representative
> orthologous protein from each set for each of the 10 species. I'm
> trying to build a phylogenetic tree by aligning the data from the 10
> species. I've tried concatenating the 45 protein sequences for each of
> the 10 species and aligning the concatenated sequences but this has
> produced results that do not make sense. What do you recommend for such
> a problem?
The way I (and I suspect most others) approach this is to align each protein
individually (i.e. you'll have 45 separate protein alignments) and then
concatenate them into one super-matrix.
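To make the super-matrix idea concrete, here is a minimal pure-Python sketch. It assumes each per-protein alignment is already available as a dict mapping species name to aligned sequence; the function name `concatenate_alignments` and the toy sequences are hypothetical, and padding a missing species with `?` is one common convention rather than a requirement.

```python
def concatenate_alignments(alignments, missing="?"):
    """Join per-protein alignments column-wise into one super-matrix.

    alignments: list of dicts, each mapping species name -> aligned sequence.
    Species absent from one alignment are padded with the `missing` symbol.
    """
    # Collect species names in order of first appearance.
    species = []
    for aln in alignments:
        for name in aln:
            if name not in species:
                species.append(name)

    parts = {name: [] for name in species}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = lengths.pop()
        for name in species:
            parts[name].append(aln.get(name, missing * width))

    return {name: "".join(chunks) for name, chunks in parts.items()}


# Toy example with two "proteins"; frog is missing from the first one.
protein_a = {"human": "MKT-A", "mouse": "MKTLA"}
protein_b = {"human": "GG-C", "mouse": "GGAC", "frog": "GGAC"}
supermatrix = concatenate_alignments([protein_a, protein_b])
```

Each row of the result has the same total length, so it can be handed to a tree-building program as a single alignment.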
Currently, Bio.AlignIO does not support column-by-column concatenation of
alignments. By happy coincidence, David Winter posted today that he has
added a cookbook example of how to combine alignments using the Bio.Nexus
interface. You can find the example here:
http://biopython.org/wiki/Concatenate_nexus
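The combining step with Bio.Nexus might look roughly like the sketch below. It assumes a reasonably recent Biopython install and one Nexus file per protein; the helper name `combine_nexus_files` and any file names passed to it are hypothetical, while `Nexus.Nexus`, `Nexus.combine` and `write_nexus_data` come from the Bio.Nexus module.

```python
import os

from Bio.Nexus import Nexus


def combine_nexus_files(paths, out_path):
    """Combine several single-protein Nexus files into one super-matrix file."""
    # Nexus.combine expects a list of (name, Nexus object) pairs;
    # here the file's base name is used as the partition name.
    nexi = [(os.path.basename(path), Nexus.Nexus(path)) for path in paths]
    combined = Nexus.combine(nexi)
    combined.write_nexus_data(filename=out_path)
    return combined
```

Taxa that are absent from one of the input files are filled with the missing-data symbol for that partition, so the species lists do not need to match exactly.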
If your alignment viewer does not support export in Nexus format, you can use
Bio.AlignIO to convert the alignment to Nexus.
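A conversion along those lines could be sketched as follows. This assumes the alignment was saved as FASTA and that Biopython 1.78 or later is installed, where `AlignIO.convert` accepts a `molecule_type` argument; the wrapper name `fasta_alignment_to_nexus` is hypothetical.

```python
from Bio import AlignIO


def fasta_alignment_to_nexus(fasta_path, nexus_path):
    """Convert one aligned FASTA file to Nexus; returns the number of alignments written."""
    # The Nexus writer has to declare a datatype, and FASTA carries no such
    # information, so the molecule type is stated explicitly here.
    return AlignIO.convert(
        fasta_path, "fasta", nexus_path, "nexus", molecule_type="protein"
    )
```

Other input formats supported by Bio.AlignIO (for example "clustal" or "phylip") can be substituted for "fasta" in the call.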
Cheers, Cymon