[Bioperl-l] Tree refactor? was Re: Bootstrap, root, reroot...
Mark Jensen
maj.fortinbras at gmail.com
Wed Jul 15 23:25:43 UTC 2009
Hey all-
I'm willing to spearhead this. I was thinking of a bioperl-dev module that
concretizes the B:T:TreeI and B:T:NodeI interfaces, to get started. I don't
think we have to spring an edge-based tree object on the unsuspecting masses
all at once, but rather write a Tree class that has all the capabilities defined
by the interface, and then some extras, as Tristan suggests in his post.
We can squeak it back into the core with some node-based->edge-based
conversion utilities, and possibly put the current implementation into a
deprecation cycle (but I'm thinking that's a bit drastic).
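
To make the node-based->edge-based idea concrete, here is a rough sketch,
written in Python rather than Perl purely for brevity. Nothing in it is
BioPerl API: Node, to_edges and from_edges are hypothetical names, and it
assumes every node (internal ones included) has a unique name. The point is
only that once a support value lives on an undirected edge, rerooting cannot
leave it on the wrong node.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Node:
    """Node-based storage: `support` describes the branch to the parent."""
    name: str
    support: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def to_edges(root: Node) -> Dict[FrozenSet[str], Optional[float]]:
    """Move each node's support onto the undirected edge {parent, child}."""
    edges: Dict[FrozenSet[str], Optional[float]] = {}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            edges[frozenset((parent.name, child.name))] = child.support
            stack.append(child)
    return edges


def from_edges(edges: Dict[FrozenSet[str], Optional[float]], new_root: str) -> Node:
    """Rebuild a node-based tree rooted at `new_root`.

    Each node picks up the support of the edge that now leads to its parent,
    so the values follow the branches rather than the old parent/child roles.
    """
    neighbours: Dict[str, List[str]] = {}
    for edge in edges:
        a, b = tuple(edge)
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    def build(name: str, parent_name: Optional[str]) -> Node:
        support = None
        if parent_name is not None:
            support = edges[frozenset((parent_name, name))]
        node = Node(name, support)
        for other in sorted(neighbours.get(name, [])):
            if other != parent_name:
                node.children.append(build(other, name))
        return node

    return build(new_root, None)


# Toy tree ((A,B)X,(C,D)Y)R with made-up support values on the branches
# R-X (90) and R-Y (80), stored node-style on X and Y.
root = Node("R", children=[
    Node("X", 90.0, [Node("A"), Node("B")]),
    Node("Y", 80.0, [Node("C"), Node("D")]),
])
edges = to_edges(root)
rerooted = from_edges(edges, "X")  # X is now the root, R is one of its children
```

After rerooting at X, the value for the R-X branch sits on R (the old parent,
now a child), which is the move a node-based root() has to do by hand.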
MAJ
I have some unformed thoughts about this....
> ----- Original Message ----- From: "Chris Fields" <cjfields at illinois.edu>
> To: "Aidan Budd" <budd at embl-heidelberg.de>
> Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
> Sent: Wednesday, July 15, 2009 6:11 PM
> Subject: [Bioperl-l] Tree refactor? was Re: Bootstrap, root, reroot...
>
>
>
>> On Jul 11, 2009, at 2:52 AM, Aidan Budd wrote:
>>
>> On Thu, 9 Jul 2009, Tristan Lefebure wrote:
>>>
>>> ...
>>>>
>>>> My understanding here is that the problem is linked to the
>>>> well-known difficulty of differentiating node labels from branch
>>>> labels in newick trees. Bootstrap scores are branch
>>>> attributes, not node attributes, but since Bio::TreeI has no
>>>> branch/edge/bipartition object they are attached to a node,
>>>> and in fact reflect the bootstrap score of the ancestral
>>>> branch leading to that node. Trouble naturally comes when
>>>> you are dealing with an unrooted tree or reroot a tree: a
>>>> child can become an ancestor, and, if the bootstrap score
>>>> is not moved from the old child to the new child, it will
>>>> end up attached at the wrong place (i.e. the wrong node).
>>>>
>>>> I see several fixes for that:
>>>>
>>>> 1- incorporate Bank's fix into the root() method. I.e. if
>>>> there are bootstrap scores, after re-rooting, the ones on the
>>>> old-to-new ancestor path should be moved to the right nodes.
>>>>
>>>> 2- Modify the way trees are stored in bioperl to incorporate
>>>> a branch/edge/bipartition object, and move the bootstrap
>>>> scores to them. That won't be easy and will break many
>>>> things...
>>>>
>>>
>>> Just wanted to add that, from my point of view, it would be great if it
>>> were possible to add edge/branch objects as part of the bioperl trees.
>>> Perhaps so that the previous set of methods still behaved as before, but
>>> with some new methods on the trees such as get_splits() or
>>> get_branches() along with associated split/branch/etc. objects...?
>>>
>>> Being a bioperl user but keeping well away from coding objects in perl,
>>> the lack of such methods/objects meant I chose, in the end, not to use a
>>> bioperl solution to work with my trees (going instead for a homemade
>>> clunky python solution, where I'm happier with the OO stuff)
>>>
>>> No idea how difficult/problematic this would be to implement, though -
>>> just my 2 cents worth...
>>>
>>
>> Mark and Tristan have both indicated some of the problems that lie here,
>> so it's worth discussing this on the list. I think the best way to
>> approach this is to suggest what a proposed refactoring of
>> Bio::Tree-related classes would look like (i.e. how it would be done, what
>> is expected of said classes interface-wise, etc), and then come up with
>> data and cases where the current classes don't DTRT, preferably as tests we
>> can incorporate into the test suite.
>>
>> Note this will affect some of the key core classes we now have (seq
>> classes specifically, so memory management will be important). I'll have
>> my hands full with a few other refactors, so anyone out there willing to
>> take the reins on this one?
>>
>> chris