[Biopython-dev] Cookbook entry, concatenating nexus files

Peter biopython at maubp.freeserve.co.uk
Thu May 14 07:11:30 EDT 2009


On Thu, May 14, 2009 at 12:02 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> Oh right - I hadn't looked at David's example carefully enough earlier
> to work out which concatenation he was doing (by row or by column).
> It does make sense on re-reading.

I'd rephrase this bit of the intro:

<start>
It's a good idea, if possible, to make species-level phylogenetic
inferences bases on multiple genes because a) demographic processes
can lead gene-trees to diverge from species trees and b) journal
editors now this. Most of the alignment files supported by Biopython
allow you to write multiple alignments to the same file which makes
this easy. However, the nexus file format (used by PAUP* and Mr Bayes)
does not. In nexus files multiple alignments need to be represented as
different 'character partitions' within a data matrix that contains
one long sequence for each taxon.
<end>

Bio.AlignIO will in general write out one or more alignments to a
file.  It does NOT do any concatenation by column, which is what is
required to give the "supermatrix" you want (and is why I got confused
on the first reading).  How about:

<start>
It's a good idea, if possible, to make species-level phylogenetic
inferences based on multiple genes because (a) demographic processes
can lead gene-trees to diverge from species trees and (b) journal
editors know this.  [add stuff from Cymon's comment here?]

This is usually handled by creating a single "supermatrix" from
separate alignments for each gene.  That is, you need a single
alignment containing one row for each taxon, where each row is the
concatenation of that taxon's pre-aligned sequences.  In NEXUS files
(used by PAUP* and MrBayes) multiple alignments can be explicitly
represented as different 'character partitions' within a data matrix
that contains one long sequence for each taxon.  The Bio.Nexus module
makes this relatively straightforward.
<end>
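
To make the "by column" point concrete, here is a rough sketch of what
building a supermatrix means.  It is plain Python and deliberately does
not use Bio.Nexus or Bio.AlignIO, so it says nothing about how those
modules do it.  The function name build_supermatrix, the gene names and
the taxon labels are all made up for illustration, and padding missing
taxa with "?" is an assumption about how you would want gaps in
coverage handled.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments by column.

    gene_alignments maps a gene name to a dict of {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where supermatrix maps each taxon to
    one long sequence and partitions maps each gene to its 1-based,
    inclusive (start, end) column range in that sequence.
    """
    # Every taxon seen in any gene gets a row in the supermatrix.
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    partitions = {}
    start = 1
    for gene, aln in gene_alignments.items():
        # All rows within one gene must already be aligned to the same length.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("Sequences for %s are not the same length" % gene)
        length = lengths.pop()
        for taxon in taxa:
            # Taxa absent from this gene are padded with the missing symbol.
            supermatrix[taxon] += aln.get(taxon, missing * length)
        partitions[gene] = (start, start + length - 1)
        start += length
    return supermatrix, partitions


# Hypothetical toy data: two genes, with sp_C sequenced for ITS only.
genes = {
    "COI": {"sp_A": "ACGT", "sp_B": "ACGA"},
    "ITS": {"sp_A": "GG-C", "sp_B": "GGTC", "sp_C": "GGTT"},
}
supermatrix, partitions = build_supermatrix(genes)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print(partitions)
```

The partitions dictionary corresponds to the 'character partitions'
mentioned above: it records which columns of the long sequence came
from which gene.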

Peter

