[Biopython-dev] Bio.Cluster.Tree -> Bio.Phylo
Andrew Sczesnak
andrew.sczesnak at med.nyu.edu
Mon Apr 16 22:47:25 UTC 2012
Eric,
I can describe two use cases from my own experience. First, the MAF
parser I've been working on can pull the multiple alignment of a given
gene across several genomes. Thinking of recipes for the cookbook, I
thought it would be neat to walk the user through constructing a
distance matrix by hand (though you're right--more could be done to
support this), clustering with Bio.Cluster and visualizing the result
with Bio.Phylo. I like this example because it integrates several
different parts of Biopython along with a lesson about inferring
distances between sequences.
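
To make the "distance matrix by hand" step concrete, here is a minimal
sketch in plain Python. It assumes the alignment has already been pulled
out as a dict of equal-length gapped strings; the species labels and
sequences are made up for illustration, and p_distance and
distance_matrix are hypothetical helper names, not part of Biopython.
The measure used is the simple p-distance (fraction of mismatched
ungapped columns), which is only one of many possible choices.

```python
def p_distance(seq_a, seq_b):
    """Fraction of mismatched columns, skipping columns with a gap in either row."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_matrix(aligned):
    """Return (names, full symmetric matrix) for a dict of name -> aligned sequence."""
    names = list(aligned)
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i):
            d = p_distance(aligned[names[i]], aligned[names[j]])
            matrix[i][j] = matrix[j][i] = d
    return names, matrix


# Hypothetical toy alignment, for illustration only.
alignment = {
    "hg19": "ACGTACGT",
    "mm9": "ACGTACGA",
    "canFam2": "AC-TTCGA",
}
names, matrix = distance_matrix(alignment)
for name, row in zip(names, matrix):
    print(name, ["%.3f" % value for value in row])
```

A matrix built this way could then, as far as I can tell, be handed to
Bio.Cluster's treecluster through its distancematrix argument.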
Second, for another project, I've been generating distance matrices
based on the shared gene content of bacterial genomes and the
presence or absence of orthologous groups in each. Presently, I ferry
the matrices to a clustering program and then visualize the resulting
trees in yet another tool. Looking into ways of streamlining this
brought me back to Bio.Cluster, Bio.Phylo and the incompatibility of
their tree objects.
I wonder, what would be the most elegant way of bridging the gap?
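
One possible bridge, sketched very roughly, is to serialize the cluster
tree to Newick and let Bio.Phylo parse that. The sketch below uses plain
(left, right, distance) tuples as a stand-in for Bio.Cluster's nodes so
it runs without Biopython. It follows the Bio.Cluster numbering as I
understand it: non-negative indices are leaves, -k refers to the k-th
node, and the last node is the root. The function name is hypothetical,
and treating half the node distance as a node height (to derive branch
lengths) is my own assumption, not something either module prescribes.

```python
def cluster_nodes_to_newick(nodes, names):
    """Convert a list of (left, right, distance) nodes into a Newick string.

    Assumed numbering: index >= 0 is a leaf in `names`; index -k refers to
    nodes[k - 1]; the last node in the list is the root.
    """
    def subtree(index):
        # Returns (newick text, height of this subtree's root).
        if index >= 0:
            return names[index], 0.0
        left, right, distance = nodes[-index - 1]
        height = distance / 2.0  # assumption: ultrametric, UPGMA-style heights
        parts = []
        for child in (left, right):
            text, child_height = subtree(child)
            # Clamp at zero in case the linkage method gives non-monotonic heights.
            branch = max(height - child_height, 0.0)
            parts.append("%s:%g" % (text, branch))
        return "(%s)" % ",".join(parts), height

    text, _ = subtree(-len(nodes))
    return text + ";"


# Hypothetical three-leaf example: A and B join first, then C joins them.
names = ["A", "B", "C"]
nodes = [(0, 1, 0.2), (2, -1, 0.6)]
newick = cluster_nodes_to_newick(nodes, names)
print(newick)

# With Biopython installed, the string could then be parsed with:
#   from io import StringIO
#   from Bio import Phylo
#   tree = Phylo.read(StringIO(newick), "newick")
```

Going through Newick keeps the two modules decoupled, at the cost of a
round trip through text; a direct "from_cluster" constructor would avoid
that but needs a decision on how to derive branch lengths.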
Best,
Andrew
On 04/16/2012 06:15 PM, Eric Talevich wrote:
> On Mon, Apr 16, 2012 at 12:48 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu> wrote:
>> Hi Eric,
>>
>> I was playing with Bio.Cluster recently and noticed that trees generated by
>> that module are not compatible with Bio.Phylo. I think it would be useful if
>> output from Cluster could be manipulated with Phylo.
>>
>> At first glance, it doesn't seem like it would be that tricky to add a
>> method of converting Bio.Cluster tree objects to Bio.Phylo tree objects, and
>> I would be happy to work on this. Before making an attempt, I wanted to get
>> your feedback on whether you think this would be useful and if you had
>> anything similar in the works already.
>>
>>
>> Best,
>> Andrew
>
> Hi Andrew,
>
> Interesting idea. It would be simple enough to add a "from_cluster"
> function or class method to either Phylo/BaseTree.py or
> Phylo/_utils.py. But as every scientist knows, just because we can
> doesn't necessarily mean we should. Do you have a specific use case in
> mind?
>
> If the main idea is to use Bio.Cluster to generate trees based on a
> measure of sequence distance, we could probably do more to support
> that. This code might also be worth posting on the wiki "Phylo cookbook"
> page (http://www.biopython.org/wiki/Phylo_cookbook) to get more eyes
> on it while we consider merging it into the main distribution.
>
> -Eric