[Biopython-dev] Rerooting a tree with Bio.Phylo
Eric Talevich
eric.talevich at gmail.com
Thu Mar 25 16:27:23 EDT 2010
On Wed, Mar 24, 2010 at 11:16 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Mon, Mar 22, 2010 at 9:48 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> >> In Bio.Nexus, would you normally have handled this with the method
> >> root_with_outgroup? I intend to port that method to Bio.Phylo once I
> >> understand it, but the existing code has been kind of hard for me to
> >> figure out.
> >
> > I've just got a quick answer for you now tonight: I've not used Bio.Nexus
> > to try and do this - I'll try to get back to you in more depth tomorrow.
>
> Here is an example using Bio.Nexus.Trees to reroot with an outgroup.
>
> [...]
>
> In my example, the outgroup originally has a branch length of 0.00145.
> A new root node was created (here #12) with two children, one with a
> branch length of zero (#5, the outgroup) and one with the full length
> (#3, branch length 0.00145). Essentially this new root node (#12) and
> the outgroup (#5) are now both right at the base of the tree.
>
> There is more than one way to do this, though. For example, FigTree
> seems to introduce a new root node halfway along the outgroup branch
> (replacing the edge with two edges of half its length). This way the
> new root node represents the last common ancestor of the outgroup and
> the ingroup (everything else), although putting it at the midpoint is
> perhaps a little arbitrary.
>
> Peter
>
I looked up this topic in *Inferring Phylogenies* and found no decisive
statement on how it should be done. I gathered:

1. The new root can be placed anywhere along the branch between the outgroup
   and its ancestor.
2. Another way to root a tree is to assume a molecular clock: place the
   root so that the distances to all the tips are roughly equal.

So FigTree and Bio.Nexus are both doing reasonable things. (PyCogent doesn't
seem to support this operation, as far as I can tell.)
If we think of this operation as extending the tree further back in time, so
that the (monophyletic) tree without the outgroup becomes a sub-clade of the
larger rooted tree we're introducing, then it makes sense to me that the
branch length of the outgroup should represent the total evolutionary
distance from the root of the monophyletic sub-clade to the outgroup. Based
on that, I'm tempted to do the opposite of Bio.Nexus: let the outgroup keep
its original branch length, and assign a length of 0 to the branch leading to
the remaining sub-clade. Then by default we get something resembling a
trifurcating root, and the user can shift the actual location of the root
further back without too much difficulty.
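To make the three conventions discussed so far concrete, here is a minimal sketch in plain Python. It does not use Bio.Phylo or Bio.Nexus. The function name root_on_outgroup_branch, its outgroup_length parameter, the edge-list representation and the toy tree are all assumptions made up for illustration; only the 0.00145 outgroup branch length comes from the thread.

```python
def root_on_outgroup_branch(edges, outgroup, outgroup_length=None, root="root"):
    """Root an unrooted tree on the branch leading to the tip `outgroup`.

    edges: list of (node_a, node_b, branch_length) tuples.
    outgroup_length: how much of the original branch stays on the
        outgroup side of the new root; defaults to all of it.
    Returns a new edge list that includes the new root node.
    """
    attached = [e for e in edges if outgroup in e[:2]]
    if len(attached) != 1:
        raise ValueError("outgroup must be a tip with exactly one branch")
    a, b, total = attached[0]
    neighbor = b if a == outgroup else a
    if outgroup_length is None:
        outgroup_length = total
    if not 0 <= outgroup_length <= total:
        raise ValueError("outgroup_length must lie between 0 and the branch length")
    # Drop the old outgroup branch and replace it with two branches
    # that meet at the new root and sum to the original length.
    kept = [e for e in edges if outgroup not in e[:2]]
    return kept + [
        (root, outgroup, outgroup_length),
        (root, neighbor, total - outgroup_length),
    ]


# Toy tree: node names and the ingroup branch lengths are made up.
edges = [("anc", "outgroup", 0.00145), ("anc", "A", 0.002), ("anc", "B", 0.003)]

# Proposal in this message: the outgroup keeps its full branch length.
keep_full = root_on_outgroup_branch(edges, "outgroup")
# Bio.Nexus behaviour as described in the quoted message: zero-length outgroup branch.
zero_len = root_on_outgroup_branch(edges, "outgroup", 0.0)
# FigTree-style split at the midpoint of the branch.
midpoint = root_on_outgroup_branch(edges, "outgroup", 0.00145 / 2)
```

All three results describe the same unrooted tree; they differ only in where the root sits on the outgroup branch, which is why a single parameter is enough to move between them.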
Alternatives:

- Take a hint from the molecular clock, and try to equalize the distance
  from the root to the outgroup and to the farthest tip of the main subclade.
  Problem: in your example the outgroup is not the longest branch, so this
  would be equivalent to the version I proposed above. The root->subclade
  branch would only be nonzero sometimes, and it might surprise you when that
  happens.
- Offer a separate method, root_by_clock, which does the expected thing and
  can be used to determine good branch lengths at the root after the outgroup
  operation, if desired.
- Combine: add a keyword argument to root_with_outgroup (like
  molecular_clock=False) which triggers the first alternative.
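The first alternative can be sketched as a small calculation. If the outgroup branch has length `total` and the farthest ingroup tip lies `ingroup_depth` from the point where the outgroup attaches, then leaving x on the outgroup side gives root->outgroup = x and root->farthest tip = (total - x) + ingroup_depth; setting these equal gives x = (total + ingroup_depth) / 2, clamped to the branch. The helper names and the nested-list tree format below are hypothetical, and this balances only the outgroup against the farthest ingroup tip rather than fitting a clock over all tips.

```python
def farthest_tip(clade):
    """Distance from a clade's base to its farthest tip.

    A clade is a list of (branch_length, child_clade) pairs; a tip is [].
    """
    return max((length + farthest_tip(child) for length, child in clade),
               default=0.0)


def clock_outgroup_length(total, ingroup_depth):
    """Length to leave on the outgroup side of the new root.

    Balances root->outgroup against root->farthest ingroup tip, but never
    moves the root off the outgroup branch (hence the clamp to `total`).
    """
    balanced = (total + ingroup_depth) / 2.0
    return min(balanced, total)


# Made-up ingroup: one tip at 1.0, and a subclade at 2.0 with tips at 0.5 and 1.5.
ingroup = [(1.0, []), (2.0, [(0.5, []), (1.5, [])])]
depth = farthest_tip(ingroup)

# Long outgroup branch: the root lands partway along it.
long_case = clock_outgroup_length(10.0, depth)
# Short outgroup branch: clamped, so the root->subclade branch is zero.
short_case = clock_outgroup_length(2.0, depth)
```

The clamped case is the situation described in the first alternative: when the outgroup is not the longest path, the result is the same as letting the outgroup keep its full branch length.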
I'll play with this some more and post an example implementation for you to
review.
Thanks for your help,
Eric