[Biopython-dev] Newick support in Bio.TreeIO?
biopython at maubp.freeserve.co.uk
Tue Jul 28 16:48:50 UTC 2009
If you wanted a good multi-tree example file format for TreeIO, I would
suggest plain Newick trees. I am familiar with plain text files which contain
one Newick tree per line (with a terminating semi-colon), although in
principle they could be wrapped over many lines. The neighbour joining
(NJ) tree software QuickJoin from Thomas Mailund can certainly output
this kind of file. I would expect to be able to read and write such multi-tree
Newick files using Bio.TreeIO.
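
As a rough illustration of what reading such a file involves, here is a
minimal sketch in plain Python. It does not use Bio.TreeIO, and the function
name iter_newick_trees is hypothetical. It assumes each tree ends with a
semi-colon, that trees may be wrapped over several lines, and that the only
place a semi-colon can appear other than as a terminator is inside a
single-quoted label. Newick comments in square brackets are not handled.

```python
def iter_newick_trees(text):
    """Yield one Newick string per tree, splitting on the terminating ';'.

    Hypothetical helper for illustration only: line breaks outside quoted
    labels are dropped, so a tree wrapped over many lines is rejoined.
    """
    buffer = []
    in_quote = False
    for char in text:
        if char == "'":
            # A doubled quote ('') inside a label toggles twice, so the
            # quoted state is unchanged afterwards.
            in_quote = not in_quote
        if char in "\r\n" and not in_quote:
            continue  # undo line wrapping
        buffer.append(char)
        if char == ";" and not in_quote:
            tree = "".join(buffer).strip()
            if tree:
                yield tree
            buffer = []
    # Anything left in the buffer has no terminating ';' and is ignored.


example = "((A,B),(C,D));\n((A,C),\n(B,D));\n"
trees = list(iter_newick_trees(example))
```

Here trees holds two strings, the second with its line wrapping removed.
Each string would still need to be handed to a real Newick parser.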
The obvious application of this, which I have used personally, is to
generate bootstrap trees on multiple machines in a cluster (or on multiple
cores of a single machine), e.g. 100 instances each producing 10 bootstrap
trees, giving 1000 trees in total. These are then used either to build a
consensus tree or to allocate bootstrap support to the randomised master tree.
I wrote some code in Python to do this bootstrapping step using the
splits defined by each edge (i.e. the two sets of nodes you get if the
edge were severed), which I represented using bit arrays, for use as
keys in a dictionary mapping the splits to the master tree's edges.
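
The split idea can be sketched as follows. This is not the code described
above, only an illustration under several assumptions: trees are nested
tuples of leaf names, a Python int is used as the bit array (one bit per
leaf), and the dictionary maps each master-tree split to a count rather than
to an edge object. The names edge_splits and bootstrap_support are
hypothetical. Each split is stored as the side that does not contain leaf 0,
so the same edge gives the same key however the tree is rooted.

```python
def edge_splits(tree, leaf_index):
    """Return the non-trivial splits of a nested-tuple tree as int bitmasks."""
    n_leaves = len(leaf_index)
    full = (1 << n_leaves) - 1
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return 1 << leaf_index[node]  # a leaf sets a single bit
        mask = 0
        for child in node:
            mask |= visit(child)
        # Normalise: keep the side of the split that lacks leaf 0.
        side = full ^ mask if mask & 1 else mask
        size = bin(side).count("1")
        # Skip trivial splits (fewer than two leaves on either side).
        if 2 <= size <= n_leaves - 2:
            splits.add(side)
        return mask

    visit(tree)
    return splits


def bootstrap_support(master, replicates, leaf_index):
    """Count how many replicate trees contain each split of the master tree."""
    counts = {split: 0 for split in edge_splits(master, leaf_index)}
    for tree in replicates:
        for split in edge_splits(tree, leaf_index):
            if split in counts:
                counts[split] += 1
    return counts


leaf_index = {"A": 0, "B": 1, "C": 2, "D": 3}
master = (("A", "B"), ("C", "D"))
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ("A", "B", ("C", "D")),  # same topology as the master, rooted differently
]
support = bootstrap_support(master, replicates, leaf_index)
```

With four leaves the master tree has a single internal split, AB|CD, and
two of the three replicates contain it. Dividing each count by the number of
replicates gives the bootstrap support for that edge.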