[Biopython-dev] Bio.AlignIO, Bio.Nexus, MrBayes, polymorphic sites, maximum line length
Cymon Cox
cy at cymon.org
Thu Dec 2 12:03:55 UTC 2010
On 2 December 2010 11:43, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Dec 2, 2010 at 10:50 AM, Nick Loman <n.j.loman at bham.ac.uk> wrote:
> > Hi there
> [...]
> > 2) When outputting long alignments in Nexus format, MrBayes refuses to read
> > the resulting files saying that the maximum line length is 19900 characters.
> > I'm assuming that is not the maximum input to MrBayes and that it can handle
> > longer alignments if they are split in some way. Would it be possible for
> > Bio.Nexus to split alignments in the appropriate format?
>
> The file format details are not fresh in my mind, but I think that long
> sequences can be split over multiple lines
This is valid interleaved Nexus format:
"""
#NEXUS
begin data;
Dimensions ntax=4 nchar=3;
Format interleave datatype=dna gap=-;
Matrix
taxon1 AA
taxon2 GG
taxon3 CC
taxon4 TT
taxon1 A
taxon2 G
taxon3 C
taxon4 T
;
end;
"""
Note the "interleave" keyword on the Format line. Also beware that some Nexus parsers
don't check that taxa in additional blocks are in the same order as the
first block - they just assume they are.
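To make the ordering caveat concrete, here is a minimal sketch in plain Python of a reader that does check the taxon order in every block. It is not Bio.Nexus code: the function name read_interleaved_matrix is hypothetical, and it assumes you pass in only the lines between "Matrix" and ";", that ntax is known from the Dimensions line, and that taxon labels contain no spaces.

```python
def read_interleaved_matrix(matrix_lines, ntax):
    """Join interleaved blocks into full sequences, checking taxon order.

    Hypothetical helper for illustration only, not part of Bio.Nexus.
    Assumes every non-blank line is "<label> <characters>".
    """
    if ntax < 1:
        raise ValueError("ntax must be at least 1")

    rows = []
    for line in matrix_lines:
        if not line.strip():
            continue  # blank lines between blocks carry no data
        label, chunk = line.split(None, 1)
        rows.append((label, "".join(chunk.split())))

    if not rows or len(rows) % ntax != 0:
        raise ValueError("number of rows is not a multiple of ntax")

    # The first block defines the reference order of the taxa.
    order = [label for label, _ in rows[:ntax]]
    if len(set(order)) != ntax:
        raise ValueError("duplicate taxon label in first block")

    seqs = {label: [] for label in order}
    for start in range(0, len(rows), ntax):
        block = rows[start:start + ntax]
        labels = [label for label, _ in block]
        if labels != order:
            raise ValueError(
                "taxon order in block starting at row %d differs from "
                "the first block" % start
            )
        for label, chunk in block:
            seqs[label].append(chunk)

    return {label: "".join(chunks) for label, chunks in seqs.items()}


if __name__ == "__main__":
    matrix = """\
taxon1 AA
taxon2 GG
taxon3 CC
taxon4 TT
taxon1 A
taxon2 G
taxon3 C
taxon4 T
""".splitlines()
    print(read_interleaved_matrix(matrix, ntax=4))
```

A parser that skips the labels != order test would silently attach characters to the wrong taxon if a later block were reordered.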
You can write interleaved Nexus formatted data with
Nexus.write_nexus_data(interleave_by_partition=True), provided you have a
character partition set.
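If you have no character partition and only need to keep lines short, the same layout can be produced by hand. The following is a rough sketch in plain Python, independent of Bio.Nexus; the function name format_interleaved_nexus and the width parameter are my own inventions for illustration. It assumes all sequences have equal length and that taxon names need no quoting.

```python
def format_interleaved_nexus(seqs, width=60, datatype="dna", gap="-"):
    """Return an interleaved Nexus data block as a string.

    Hypothetical helper for illustration only, not part of Bio.Nexus.
    seqs maps taxon name -> sequence; width is the number of characters
    written per taxon in each block.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    lengths = {len(seq) for seq in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    nchar = lengths.pop()

    # Pad names so the sequence columns line up, with one space between.
    pad = max(len(name) for name in seqs) + 1

    lines = [
        "#NEXUS",
        "begin data;",
        "Dimensions ntax=%d nchar=%d;" % (len(seqs), nchar),
        "Format interleave datatype=%s gap=%s;" % (datatype, gap),
        "Matrix",
    ]
    # One block per slice of columns, taxa in the same order every time.
    for start in range(0, nchar, width):
        for name, seq in seqs.items():
            lines.append(name.ljust(pad) + seq[start:start + width])
    lines += [";", "end;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    alignment = {
        "taxon1": "AAA",
        "taxon2": "GGG",
        "taxon3": "CCC",
        "taxon4": "TTT",
    }
    print(format_interleaved_nexus(alignment, width=2), end="")
```

With width=2 this reproduces the four-taxon example above. For a long alignment, any width comfortably below the reported 19900-character limit should keep each line short enough.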
Cheers, C.
> - so if the problem is
> just with how MrBayes parses the file, that might be fixable. Can
> you give me a test case for this (maybe generate a simple but
> large alignment in code) with the MrBayes call that fails?
>
> Peter
>
--
____________________________________________________________________
Cymon J. Cox
Auxiliary Investigator
Plant Systematics and Bioinformatics Research Group (PSB)
Centro de Ciencias do Mar (CCMAR) - CIMAR-Lab. Assoc.
Mailing address:
Rm. 2.77
Faculdade de Ciências e Tecnologia (FCT), Ed.7,
Universidade do Algarve
Campus de Gambelas
8005-139 Faro
Portugal
Phone: +0351 289800909 ext 7380
Fax: +0351 289800051
Email: cy at cymon.org, cymon at ualg.pt, cymon.cox at gmail.com
HomePage : http://www.ccmar.ualg.pt/home/index.php?id=202
-8.63/-6.77