[Biopython-dev] Newick support in Bio.TreeIO?

Peter biopython at maubp.freeserve.co.uk
Tue Jul 28 16:48:50 UTC 2009


Hi Eric,

If you want a good multi-tree example file format for TreeIO, I would
suggest plain Newick trees. I am familiar with plain text files which contain
one Newick tree per line (each with a terminating semicolon), although in
principle a single tree could be wrapped over many lines. The neighbour joining
(NJ) tree software QuickJoin from Thomas Mailund can certainly output
this kind of file. I would expect to be able to read and write such multi-tree
Newick files using Bio.TreeIO.

http://www.daimi.au.dk/~mailund/quick-join.html
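
To make the file layout concrete, here is a minimal sketch in plain Python
of splitting such a multi-tree file into individual Newick strings. It is
not a Bio.TreeIO API; the function name split_newick_trees is made up for
illustration. It assumes each tree ends with a semicolon, that a tree may be
wrapped over several lines, and that labels containing semicolons are
enclosed in single quotes. Bracketed Newick comments are not handled.

```python
def split_newick_trees(text):
    """Yield each Newick tree string (including its ';') found in text.

    Hypothetical helper for illustration only, not part of Biopython.
    """
    current = []
    in_quote = False
    for ch in text:
        if ch == "'":
            # A doubled quote ('') inside a label toggles twice, so the
            # quoting state is unchanged afterwards.
            in_quote = not in_quote
        if ch in "\r\n" and not in_quote:
            # Trees may be wrapped over several lines, so drop line breaks.
            continue
        current.append(ch)
        if ch == ";" and not in_quote:
            tree = "".join(current).strip()
            if tree:
                yield tree
            current = []
```

Calling list(split_newick_trees(open(filename).read())) would then give one
string per tree, whether or not the trees were wrapped.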

The obvious application of this (which I have used personally) is to
generate bootstrap trees on multiple machines in a cluster (or cores on
a single machine), e.g. 100 instances each producing 10 bootstrap trees,
giving 1000 trees in total (which are then used either to build a consensus,
or to allocate bootstrap support to the randomised master tree).

I wrote some code in Python to do this bootstrapping step using the
splits defined by each edge (i.e. the two sets of nodes you get if the
edge were severed), which I represented using bit arrays, for use as
keys in a dictionary mapping the splits to the master tree's edges.
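
As a rough illustration of the idea (a sketch, not the code I actually
used), the following represents each split as a Python integer used as a
bit mask, one bit per taxon. Several things here are assumptions made for
the example: trees are given as nested tuples of taxon names rather than
parsed Newick, the names edge_splits and bootstrap_support are
hypothetical, and the dictionary maps each master tree split to a count of
supporting bootstrap trees rather than to an edge object. Since a split and
its complement describe the same edge, each mask is normalised to the side
that does not contain taxon 0.

```python
def edge_splits(tree, taxon_index):
    """Return the non-trivial splits of tree as a set of integer bit masks.

    tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D"), "E").
    taxon_index maps each taxon name to its bit position.
    """
    full = (1 << len(taxon_index)) - 1
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return 1 << taxon_index[node]
        mask = 0
        for child in node:
            mask |= visit(child)
        # Normalise: always store the side without taxon 0 (bit 0).
        canonical = full ^ mask if mask & 1 else mask
        # Skip trivial splits (fewer than two taxa on either side); this
        # also skips the root, whose mask covers every taxon.
        if min(bin(canonical).count("1"), bin(full ^ canonical).count("1")) >= 2:
            splits.add(canonical)
        return mask

    visit(tree)
    return splits


def bootstrap_support(master, replicates, taxa):
    """Count how many replicate trees contain each split of the master tree."""
    taxon_index = {name: i for i, name in enumerate(sorted(taxa))}
    counts = dict.fromkeys(edge_splits(master, taxon_index), 0)
    for replicate in replicates:
        for split in edge_splits(replicate, taxon_index):
            if split in counts:
                counts[split] += 1
    return counts
```

Because the keys are plain integers, looking up a bootstrap tree's split in
the master tree's dictionary is a single hash lookup, which is the main
attraction of the bit array representation.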

Peter


