[Biopython-dev] Bio.AlignIO, Bio.Nexus, MrBayes, polymorphic sites, maximum line length

Nick Loman n.j.loman at bham.ac.uk
Thu Dec 2 10:50:51 UTC 2010


Hi there

Two questions for the developers.

1) I wanted to extract polymorphic sites from a multiple alignment and 
ended up with some code like this:

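    from Bio import AlignIO

    alignment = AlignIO.read(fn, "nexus")
    rows = len(alignment)
    new_alignment = None
    for n in range(alignment.get_alignment_length()):
        column = alignment[:, n]  # column n as a plain string
        # A site is polymorphic if any row differs from the first row
        if column != column[0] * rows:
            site = alignment[:, n:n+1]  # one-column sub-alignment
            if new_alignment is None:
                new_alignment = site
            else:
                new_alignment += site
    if new_alignment is not None:
        # Passing a filename lets AlignIO open and close the file itself
        AlignIO.write(new_alignment, fn + ".ply", "nexus")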

Is this the best way of doing it? Would a method call in AlignIO to do 
the same thing be useful to others?
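
For comparison, here is a minimal sketch of the same column test on plain strings, without Biopython. The function names polymorphic_columns and keep_columns are made up for illustration and are not part of AlignIO. With a real alignment, the rows could be taken as str(record.seq) for each record. As in the loop above, gaps and ambiguity codes count as differences.

```python
def polymorphic_columns(rows):
    """Return indices of columns where the rows are not all identical."""
    if not rows:
        return []
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all rows must have the same length")
    # zip(*rows) yields one tuple of characters per alignment column
    return [i for i, col in enumerate(zip(*rows)) if len(set(col)) > 1]


def keep_columns(rows, indices):
    """Build new row strings containing only the selected columns."""
    return ["".join(row[i] for i in indices) for row in rows]


# Hypothetical three-row alignment used only as an example
rows = ["ACGTA", "ACGAA", "ATGTA"]
sites = polymorphic_columns(rows)     # columns 1 and 3 vary
variable = keep_columns(rows, sites)
```

Collecting the column indices first and building each row once avoids growing an alignment one column at a time, which may matter for long alignments.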

2) When I write long alignments in Nexus format, MrBayes refuses to 
read the resulting files, saying that the maximum line length is 19900 
characters. I'm assuming that this is not a limit on alignment length 
and that MrBayes can handle longer alignments if they are split in 
some way. Would it be possible for Bio.Nexus to split alignments in 
the appropriate format?
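
One candidate for "split in some way" is the interleaved matrix layout, where each taxon's sequence is written in fixed-size blocks so that no line grows with the alignment length. Below is a rough sketch in plain Python. The function name interleaved_nexus and the block size of 1000 are my own choices, not an existing Bio.Nexus API, and it assumes taxon names contain no spaces or characters that need quoting. I believe Bio.Nexus's write_nexus_data has interleave and blocksize arguments that do something similar, but that should be checked against the installed version.

```python
def interleaved_nexus(records, block=1000, datatype="dna"):
    """Format (name, sequence) pairs as an interleaved NEXUS data block."""
    names = [name for name, _ in records]
    seqs = [seq for _, seq in records]
    nchar = len(seqs[0])
    if any(len(s) != nchar for s in seqs):
        raise ValueError("all sequences must have the same length")
    width = max(len(n) for n in names) + 1  # pad names to a common column
    lines = [
        "#NEXUS",
        "begin data;",
        "dimensions ntax=%d nchar=%d;" % (len(records), nchar),
        "format datatype=%s missing=? gap=- interleave=yes;" % datatype,
        "matrix",
    ]
    # Write the matrix one block of columns at a time, all taxa per block
    for start in range(0, nchar, block):
        for name, seq in records:
            lines.append(name.ljust(width) + seq[start:start + block])
        lines.append("")  # blank line between blocks
    lines.append(";")
    lines.append("end;")
    return "\n".join(lines) + "\n"


# Hypothetical two-taxon example with a tiny block size
text = interleaved_nexus(
    [("taxon1", "ACGTACGTAC"), ("taxon2", "ACGAACGTTC")], block=4
)
```

With block=4 the ten-column example is written as three blocks per taxon, and the longest matrix line is the padded name plus four characters.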

Cheers

Nick




