[Biopython-dev] Bio.Cluster.Tree -> Bio.Phylo

Andrew Sczesnak andrew.sczesnak at med.nyu.edu
Mon Apr 16 22:47:25 UTC 2012


I can describe two use cases from my own experience. First, the MAF 
parser I've been working on can pull the multiple alignment of some gene 
between a bunch of genomes. Thinking of recipes for the cookbook, I 
thought it would be neat to walk the user through constructing a 
distance matrix by hand (though you're right--more could be done to 
support this), clustering with Bio.Cluster and visualizing the result 
with Bio.Phylo. I like this example because it integrates several 
different parts of Biopython along with a lesson about inferring 
distances between sequences.
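
As a rough sketch of the "distance matrix by hand" step, assuming the 
aligned sequences have already been extracted as plain strings of equal 
length: the function names, the toy alignment and the choice of 
p-distance (fraction of mismatched ungapped columns) are illustrative 
assumptions, not part of the MAF parser or of Biopython.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of mismatches over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_matrix(aligned):
    """Build a full symmetric matrix from a {name: aligned_sequence} dict."""
    names = list(aligned)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = p_distance(aligned[names[i]], aligned[names[j]])
        matrix[i][j] = matrix[j][i] = d
    return names, matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "human": "ACGTACGTAC",
    "chimp": "ACGTACGTAT",
    "mouse": "ACGAAC-TTT",
}
names, matrix = distance_matrix(toy_alignment)
```

A matrix like this is the kind of input a clustering step would then 
consume; p-distance ignores multiple substitutions, so a corrected 
model may be preferable for divergent sequences.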

Second, for another project, I've been generating distance matrices 
based on the shared gene content of bacterial genomes and the 
presence-or-absence of orthologous groups in each. Presently, I ferry 
the matrices to a clustering program and then visualize the resulting 
trees in yet another tool. Looking into ways of streamlining this 
brought me back to Bio.Cluster, Bio.Phylo and the incompatibility of 
their tree objects.
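
For illustration, one way such a gene-content matrix might be computed 
is sketched below. The Jaccard distance, the genome names and the 
orthologous group labels are all assumptions made for the example; the 
message does not say which measure was actually used.

```python
def jaccard_distance(groups_a, groups_b):
    """1 - |intersection| / |union| for two sets of orthologous group IDs."""
    union = groups_a | groups_b
    if not union:
        return 0.0  # two empty genomes are treated as identical
    return 1.0 - len(groups_a & groups_b) / len(union)


# Hypothetical presence-or-absence data: genome -> set of groups present.
gene_content = {
    "genome_A": {"OG1", "OG2", "OG3", "OG4"},
    "genome_B": {"OG1", "OG2", "OG3"},
    "genome_C": {"OG4", "OG5"},
}

genomes = sorted(gene_content)
gene_matrix = [
    [jaccard_distance(gene_content[a], gene_content[b]) for b in genomes]
    for a in genomes
]
```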

I wonder, what would be the most elegant way of bridging the gap?
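
One possible bridge, sketched here without importing Biopython, is to 
serialize the clustering result as a Newick string, a format Bio.Phylo 
can parse. The sketch assumes the Bio.Cluster convention that each node 
has left, right and distance attributes, that non-negative indices refer 
to the original items, that index -(i+1) refers to node i, and that the 
last node is the root. The Node namedtuple and the function name are 
stand-ins, and turning join distances into branch lengths by halving 
them is an assumption that only makes sense when join distances grow 
toward the root.

```python
from collections import namedtuple

# Stand-in for a Bio.Cluster node (hypothetical; mirrors left/right/distance).
Node = namedtuple("Node", ["left", "right", "distance"])


def cluster_tree_to_newick(nodes, names):
    """Convert a list of joins into a Newick string with branch lengths.

    Names are written unquoted, so they must not contain Newick
    punctuation such as commas, colons or parentheses.
    """
    def height(index):
        # Leaves sit at height 0; an internal node at half its join distance.
        return 0.0 if index >= 0 else nodes[-index - 1].distance / 2.0

    def subtree(index, parent_height):
        branch = parent_height - height(index)
        if index >= 0:
            return "%s:%g" % (names[index], branch)
        node = nodes[-index - 1]
        h = height(index)
        return "(%s,%s):%g" % (
            subtree(node.left, h), subtree(node.right, h), branch)

    root = nodes[-1]  # assumed: the final join is the root
    h = root.distance / 2.0
    return "(%s,%s);" % (subtree(root.left, h), subtree(root.right, h))


# Hypothetical result of clustering three items.
names = ["human", "chimp", "mouse"]
nodes = [Node(0, 1, 0.1), Node(-1, 2, 0.3)]
newick = cluster_tree_to_newick(nodes, names)
```

The resulting string could then be handed to Bio.Phylo, for example 
with Phylo.read(StringIO(newick), "newick"), and drawn or manipulated 
from there.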


On 04/16/2012 06:15 PM, Eric Talevich wrote:
> On Mon, Apr 16, 2012 at 12:48 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu>  wrote:
>> Hi Eric,
>> I was playing with Bio.Cluster recently and noticed that trees generated by
>> that module are not compatible with Bio.Phylo. I think it would be useful if
>> output from Cluster could be manipulated with Phylo.
>> At first glance, it doesn't seem like it would be that tricky to add a
>> method of converting Bio.Cluster tree objects to Bio.Phylo tree objects, and
>> I would be happy to work on this. Before making an attempt, I wanted to get
>> your feedback on whether you think this would be useful and if you had
>> anything similar in the works already.
>> Best,
>> Andrew
> Hi Andrew,
> Interesting idea. It would be simple enough to add a "from_cluster"
> function or class method to either Phylo/BaseTree.py or
> Phylo/_utils.py. But as every scientist knows, just because we can
> doesn't necessarily mean we should. Do you have a specific use case in
> mind?
> If the main idea is to use Bio.Cluster to generate trees based on a
> measure of sequence distance, we could probably do more to support
> that. This code might also be worth posting on the wiki "Phylo cookbook"
> page (http://www.biopython.org/wiki/Phylo_cookbook) to get more eyes
> on it while we consider merging it into the main distribution.
> -Eric
