[Biopython-dev] Rerooting a tree with Bio.Phylo

Eric Talevich eric.talevich at gmail.com
Mon Mar 22 20:28:21 UTC 2010


On Mon, Mar 22, 2010 at 12:21 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:

> Hi Eric,
>
> I've got a real example of a simple tree manipulation that I would like to
> handle via your new module. I have a (small) unrooted tree from a gene
> family in Newick format, which by construction includes an out-group
> (the same gene but from a more distant organism). I would like to reroot
> the tree so that this out-group is at the basal level.
>
> Can Bio.Phylo help me here?
>

In Bio.Nexus, would you normally have handled this with the method
root_with_outgroup? I intend to port that method to Bio.Phylo once I
understand it, but the existing code has been kind of hard for me to figure
out.

Let's address it here, then. Is there a detailed plain-text description
somewhere of how this operation should work in general?

Given that the outgroup taxon is already somewhere inside the existing
unrooted tree, I would guess something like:

0. Load the tree:

from Bio import Phylo
tree = Phylo.read('example.nwk', 'newick')

1. Locate the outgroup in the tree, remembering the lineage for future
operations:

outgroup_path = tree.get_path({'name': 'OUTGROUP'})  # or however you can identify it

2. Tracing the outgroup lineage backwards, reattach the subclades to new
locations under a new root (or the old root, repurposed). Picturing the
unrooted tree as an arbitrarily rooted tree, invert everything above the
outgroup in the tree, but keep the descendants of those clades as they are:

# Untested, hardly even thought through, danger danger!
root = tree.root
old_clades = root.clades  # needed?
root.clades = []
new_parent = root
last = outgroup_path[-1]
for parent in outgroup_path[-2::-1]:
    siblings = [kid for kid in parent.clades if kid != last]
    new_parent.clades = ...  # TODO
    new_parent = last
    last = parent
tree.rooted = True


Bio.Phylo does no internal bookkeeping, so it's OK (and sometimes required)
to shuffle clades directly.
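
The edge-flipping idea in step 2 can be sketched without Bio.Phylo at all.
The block below is a minimal, hypothetical illustration: Clade, get_path,
reroot_at_outgroup_parent and to_newick are stand-ins written for this
example, not Bio.Phylo or Bio.Nexus API. It also assumes that "rooting with
an outgroup" means making the outgroup's parent the new root, so that the
outgroup ends up as a direct child of the root. Each edge on the path from
the old root down to that parent is flipped, and the branch length travels
with the edge.

```python
class Clade:
    """Hypothetical stand-in for a tree node: name, branch length, children."""

    def __init__(self, name=None, branch_length=None, clades=None):
        self.name = name
        self.branch_length = branch_length
        self.clades = clades or []

    def get_path(self, name):
        """Return the clades below self leading to the named clade, or None."""
        for kid in self.clades:
            if kid.name == name:
                return [kid]
            sub = kid.get_path(name)
            if sub is not None:
                return [kid] + sub
        return None


def reroot_at_outgroup_parent(root, outgroup_name):
    """Flip edges so the outgroup's parent becomes the root; return it."""
    path = root.get_path(outgroup_name)
    if path is None:
        raise ValueError("outgroup not found: %r" % outgroup_name)
    lineage = [root] + path[:-1]  # old root ... outgroup's parent
    # Save the lengths first: each flipped edge keeps its original length.
    lengths = [clade.branch_length for clade in lineage]
    for i in range(1, len(lineage)):
        parent, child = lineage[i - 1], lineage[i]
        parent.clades = [kid for kid in parent.clades if kid is not child]
        child.clades.append(parent)          # the old parent is now a child
        parent.branch_length = lengths[i]    # length of the flipped edge
    new_root = lineage[-1]
    new_root.branch_length = None            # a root has no parent edge
    return new_root


def to_newick(clade):
    """Serialize a clade to a Newick-style string (no trailing semicolon)."""
    label = clade.name or ""
    if clade.clades:
        label = "(" + ",".join(to_newick(k) for k in clade.clades) + ")" + label
    if clade.branch_length is not None:
        label += ":%g" % clade.branch_length
    return label


# ((A:1,B:2):3,C:4,(D:5,OUT:6):7) with an arbitrary trifurcating root
tree = Clade(clades=[
    Clade(branch_length=3, clades=[Clade("A", 1), Clade("B", 2)]),
    Clade("C", 4),
    Clade(branch_length=7, clades=[Clade("D", 5), Clade("OUT", 6)]),
])
new_root = reroot_at_outgroup_parent(tree, "OUT")
print(to_newick(new_root))  # (D:5,OUT:6,((A:1,B:2):3,C:4):7)
```

This sketch does not handle one edge case: if the old root had only two
children, it is left behind as an internal node with a single child, which
a real implementation would probably want to collapse.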

Is this what "root with outgroup" is supposed to do? What functionality in
Bio.Nexus.Trees.root_with_outgroup is missing here? And, do you happen to
have an example of a tree with edge cases that I could use for testing?


> P.S. Why is Bio.Phylo.trim_str a public method?
>

Oops, I'll fix it.

Thanks,
Eric


