Feature #3427: Cache paths in Bio.Phylo trees for later use

Author: Ben Morris
I'm doing some analyses using Bio.Phylo in which I need to find many distances between pairs of taxa, and have found that this can be quite slow as currently implemented.

My current solution is to subclass Newick.Tree and cache the results of the get_path method. That way, after finding the distance between species A and B, finding the distance between A and C doesn't require recomputing the path from the root to A. Example:

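from Bio import Phylo as bp

class CachingTree(bp.Newick.Tree):
    def __init__(self, tree):
        # Assumed body (missing from the original post): reuse the wrapped tree's attributes.
        self.__dict__.update(tree.__dict__)
        # Per-instance cache; a class-level dict would be shared by every tree.
        self._paths = {}

    def get_path(self, target, **kwargs):
        # The key is the target only (it must be hashable); kwargs are not part of the key.
        if target not in self._paths:
            self._paths[target] = bp.Newick.Tree.get_path(self, target=target, **kwargs)
        return self._paths[target]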
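
To show why the cache pays off across many pairwise queries, here is a minimal standalone sketch that does not use Bio.Phylo at all. Node, PathCache and the searches counter are hypothetical names invented for this illustration, and the distance calculation (drop the shared part of the two root paths, then sum the remaining branch lengths) is an assumption about how such a cache would be used, not a description of Bio.Phylo's internals.

```python
class Node:
    """Hypothetical minimal tree node: a name, a branch length, and children."""

    def __init__(self, name, branch_length=0.0, children=()):
        self.name = name
        self.branch_length = branch_length
        self.children = list(children)


class PathCache:
    """Caches root-to-node paths so each target is searched for only once."""

    def __init__(self, root):
        self.root = root
        self._paths = {}   # target name -> list of nodes below the root, ending at the target
        self.searches = 0  # number of full tree searches actually performed

    def get_path(self, name):
        if name not in self._paths:
            self.searches += 1
            path = self._search(self.root, name)
            if path is None:
                raise ValueError("no node named %r in the tree" % name)
            self._paths[name] = path
        return self._paths[name]

    def _search(self, node, name):
        # Depth-first search; returns the nodes below `node` leading to the
        # target (empty list if `node` is the target), or None if not found.
        if node.name == name:
            return []
        for child in node.children:
            sub = self._search(child, name)
            if sub is not None:
                return [child] + sub
        return None

    def distance(self, a, b):
        path_a, path_b = self.get_path(a), self.get_path(b)
        # Count the leading nodes the two paths have in common.
        shared = 0
        for x, y in zip(path_a, path_b):
            if x is not y:
                break
            shared += 1
        # Sum the branch lengths below the last common ancestor.
        return sum(n.branch_length for n in path_a[shared:] + path_b[shared:])


tree = Node("root", children=[
    Node("AB", 1.0, [Node("A", 2.0), Node("B", 3.0)]),
    Node("C", 4.0),
])
cache = PathCache(tree)
d_ab = cache.distance("A", "B")  # searches the tree for A and for B
d_ac = cache.distance("A", "C")  # reuses the cached path to A, searches only for C
```

With n taxa, all pairwise distances need only n tree searches here instead of one or more per pair. The trade-off is that the cache goes stale if the tree is modified after paths have been stored, so it would need to be cleared on any edit.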


Should this functionality be incorporated into the BaseTree class itself?

