[Biopython] convert to interleaved nexus

natassa natassa_g_2000 at yahoo.com
Wed Mar 27 17:25:43 UTC 2013


Hi, 
My alignment was indeed very long: it is a concatenation of ~3000 genes. I am glad this conversation may prompt discussion about extending the AlignIO parser. As for the call to write_nexus_data, I will move it outside the if block. 
Thanks again, 
Natassa




________________________________
 From: Peter Cock <p.j.a.cock at googlemail.com>
To: cymon.cox at googlemail.com 
Cc: natassa <natassa_g_2000 at yahoo.com>; "biopython at biopython.org" <biopython at biopython.org> 
Sent: Wednesday, March 27, 2013 6:41 AM
Subject: Re: [Biopython] convert to interleaved nexus
 
On Wed, Mar 27, 2013 at 12:17 PM, Cymon Cox <cymon.cox at googlemail.com> wrote:
>> >
>> > How about we make AlignIO default to writing interleaved Nexus if the
>> > sequences are very long (say over 1000 bases)? That way for small
>> > alignments we continue to produce non-interleaved Nexus as now
>> > (which has proved reliable) and automatically interleave when it is
>> > more likely that parsers with fixed buffers like MrBayes would fail.
>> >
>> > Peter
>>
>> Ha! - it looks like I suggested this before, but the conversation died:
>> http://biopython.org/pipermail/biopython-dev/2010-December/008487.html
>>
>> CC'ing Cymon again ;)
>
> There's no downside as such to whether the matrix is interleaved or not:
> interleaved and sequential (flat) are both valid in the Nexus format.

There are probably some differences in terms of writing/parsing speed,
but that is a fairly minor concern next to portability/support.

> Unfortunately, some software parsers are not implemented to cope with both.
> Among phylogenetic software I would guess the most commonly expected format
> is interleaved. So...

In that case the length based default seems sensible to me - thus far the
only reported issue with defaulting to non-interleaved has been MrBayes'
limited buffer. Do you want to make this change to Bio/AlignIO/NexusIO.py
or should I?

Peter
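
For readers following the thread, the length-based default discussed above could be sketched roughly as follows. This is only an illustration in plain Python and not the code in Bio/AlignIO/NexusIO.py: the names INTERLEAVE_THRESHOLD, should_interleave and format_matrix are hypothetical, the 1000-column cutoff is the "say over 1000 bases" figure suggested in the thread, and the 70-column block size is an arbitrary assumption.

```python
# Hypothetical sketch of a length-based interleave default.
# None of these names are Biopython API; they are for illustration only.

INTERLEAVE_THRESHOLD = 1000  # columns; the "say over 1000 bases" figure from the thread


def should_interleave(alignment_length, threshold=INTERLEAVE_THRESHOLD):
    """Return True when the alignment is longer than the threshold."""
    return alignment_length > threshold


def format_matrix(records, block_size=70, interleave=None):
    """Format (name, sequence) pairs as the rows of a Nexus-style matrix.

    interleave=None chooses the layout from the alignment length;
    True or False forces interleaved or sequential output.
    """
    if not records:
        raise ValueError("need at least one sequence")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("sequences must all be the same length")
    if interleave is None:
        interleave = should_interleave(length)

    # Pad names to a common width so the sequence columns line up.
    width = max(len(name) for name, _ in records) + 1

    if not interleave:
        # Sequential (flat): one full-length row per taxon.
        return "\n".join(f"{name.ljust(width)}{seq}" for name, seq in records)

    # Interleaved: every taxon repeated in each fixed-width block,
    # with a blank line between blocks.
    blocks = []
    for start in range(0, length, block_size):
        rows = (
            f"{name.ljust(width)}{seq[start:start + block_size]}"
            for name, seq in records
        )
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks)
```

With this sketch, format_matrix([("a", "ACGTACGT"), ("bb", "TGCATGCA")]) stays sequential because the alignment is short, while a 1500-column alignment is split into blocks automatically. A real Nexus file would also need the interleave keyword in its format command so that readers know which layout to expect.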


