[Biopython] Bio.Phylo: midpoint root?

Eric Talevich eric.talevich at gmail.com
Thu May 31 16:30:07 UTC 2012


On Thu, May 31, 2012 at 9:31 AM, Brandon Invergo <b.invergo at gmail.com> wrote:

> On Thu, 2012-05-31 at 00:36 -0400, Eric Talevich wrote:
> > On Wed, May 30, 2012 at 10:55 AM, Eric Talevich <eric.talevich at gmail.com> wrote:
> > I implemented this in an intuitive but very inefficient way, calculating
> > the pairwise distances between all tips of the tree. You can try it from
> > git:
> >
> > https://github.com/biopython/biopython/commit/94c128bd428cc5d53b50edd1d2e4730ee212f530
> >
> > It would still be nice to see a better algorithm, if anyone has one on
> > hand.
> >
> > -E
>
> I sped it up a little bit by getting rid of those nested for loops:
>
> https://github.com/brandoninvergo/biopython/commit/102189cd49d448423ee160a0a0ad891b58f56c26
>
> In a naive benchmark comparing execution times for the unit test, this
> version takes about 40% less time (0.901s vs 0.524s on my computer).
> I'll do a pull request...
>
> As for the problem of accumulating floating point rounding errors,
> perhaps you can do the root operations on copies of the tree instead...
>
> -brandon
>
>
Looks better, thanks! I merged it.

I'll look into the rounding issue some more. It might be enough to make a
single copy of the tree, do all the intermediate rerooting and distance
calculations on that copy, and then use the original tree to calculate the
outgroup branch length and do a single rerooting.

Alternatively, I could add a separate tree method that generates pairwise
distances without rerooting the tree -- either producing a big dictionary,
or an iterable of ((node1, node2), distance) which could be easily fed to a
dictionary if needed.
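
For illustration, the second idea could look roughly like the sketch below.
This is not Bio.Phylo code: the Node class and the functions
iter_tip_distances and _walk are hypothetical names made up for this example,
and the tree is a toy one. Each pair of tips is handled once, at the internal
node where their lineages meet, so the tree is never rerooted or modified.

```python
from itertools import combinations


class Node:
    """Toy tree node (hypothetical; a stand-in for a real clade object)."""

    def __init__(self, name=None, branch_length=0.0, children=()):
        self.name = name
        self.branch_length = branch_length
        self.children = list(children)


def _walk(node):
    """Yield ((tip1, tip2), distance) pairs for the subtree under node.

    The generator's return value is a list of (tip, distance from node
    down to that tip), which the caller picks up via `yield from`.
    """
    if not node.children:
        return [(node, 0.0)]
    groups = []
    for child in node.children:
        below = yield from _walk(child)
        # Extend each tip's path by the branch leading to this child.
        groups.append([(tip, d + child.branch_length) for tip, d in below])
    # Tips in different child subtrees have this node as their
    # most recent common ancestor, so their path lengths simply add up.
    for left, right in combinations(groups, 2):
        for tip1, d1 in left:
            for tip2, d2 in right:
                yield (tip1, tip2), d1 + d2
    return [item for group in groups for item in group]


def iter_tip_distances(root):
    """Iterate over ((tip1, tip2), distance) without altering the tree."""
    yield from _walk(root)


# Toy tree: ((A:1,B:2):1,(C:3,D:4):1)
tree = Node(children=[
    Node(branch_length=1.0, children=[Node("A", 1.0), Node("B", 2.0)]),
    Node(branch_length=1.0, children=[Node("C", 3.0), Node("D", 4.0)]),
])

# The iterable feeds straight into a dictionary if one is needed.
distances = {(a.name, b.name): d for (a, b), d in iter_tip_distances(tree)}

# The most distant pair of tips is the starting point for midpoint rooting.
farthest = max(distances, key=distances.get)
```

With the toy tree above, the most distant pair is B and D, 8.0 apart, so the
midpoint would lie 4.0 from either tip along the path between them.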


