[Bioperl-l] Bio::Tree::Tree -- rerooting & bootstrap problem

Bank Beszteri Bank.Beszteri at awi.de
Mon May 14 13:20:07 UTC 2007


Dear Jason,

thanks for your answer! Sorry about having been ambiguous - it is clear 
that bootstrap values are parsed as ids from newick files, I had no 
problem with that, it was only the first step of the explanation of my 
problem, which was the rerooting issue.

Thanks for your example code as well, it is indeed really useful to 
illustrate the problem. I modified the original tree a bit to make my 
point clearer:

In your example, there are two internal node ids in a four-taxon tree. 
This is not a realistic situation for bootstrap values, because 
bootstrap values are attached to bipartitions of the terminal nodes, 
i.e., to edges / branches of a tree: each value states in what 
proportion of the bootstrap replicates a particular bipartition was 
recovered (an alternative representation of bootstraps, such as the one 
produced by PAUP, is indeed a "taxon bipartition table"). This means 
that in a four-taxon tree we can have at most one bootstrap value, 
corresponding to the single non-trivial bipartition (all other 
bipartitions are trivial, i.e., they separate one terminal node from 
the rest).

So here is an example 4-taxon tree with a bootstrap value:

(A:52,(B:46,C:50)68:11,D:70);

After rerooting at node B (using your example code) it looks like

((B:46,C:50,(A:52,D:70):11)68);

Now there are two problems:
    1) this seems to be a small problem with TreeIO rather than with 
rerooting: there is an extra pair of parentheses around the whole tree;

but more importantly: 
    2) the bootstrap value appears at the root node, which is not 
sensible according to the convention that "each node stores the 
bootstrap value belonging to the branch linking it to its ancestor". You 
would want the bootstrap value to appear at the node connecting A & D 
in this situation, which would look like

(B:46,C:50,(A:52,D:70)68:11);

because in this new situation, this position would correspond to the 
same bipartition as in the original tree [which is (A,D)(B,C)].

In the meantime, I got a mail showing me the solution (thx Daniel!), 
which is in fact pretty simple: after rerooting, go through the nodes 
on the path from the old root to the new root, and for each node take 
the bootstrap value from its (new) ancestor and remove it from that 
ancestor. This leaves the root node without a bootstrap value, which 
is exactly what you want: because the root has no branch connecting it 
to an ancestor, there is no sensible bootstrap value to attach to it.

So this exercise tells me that bootstraps and "real" node ids should be 
handled in different manners when rerooting: real ids should of course 
stay with the nodes, whereas bootstrap values on the path between the 
new and old root should move over to the other end of the corresponding 
branch.
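
The procedure can be sketched in plain Perl. This is only a minimal 
illustration on a hand-rolled, hash-based tree, not the Bio::Tree::Tree 
API: new_node, reroot, shift_support and to_newick are hypothetical 
helpers, branch lengths are left out, and the tree is rerooted at the 
parent of B so that the result has the topology shown above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree node (NOT Bio::Tree::Node): a hash
# with a name, an optional support value, a parent link and children.
sub new_node {
    my ($name, $support, @children) = @_;
    my $node = { name => $name, support => $support,
                 parent => undef, children => [@children] };
    $_->{parent} = $node for @children;
    return $node;
}

# Make $new_root the root by reversing the parent links on the path to
# the old root. Returns the nodes on that path, old root first.
sub reroot {
    my ($new_root) = @_;
    my @path;                                   # new root ... old root
    for (my $n = $new_root; $n; $n = $n->{parent}) { push @path, $n }
    for my $i (reverse 1 .. $#path) {           # start at the old root
        my ($child, $parent) = @path[$i - 1, $i];
        $parent->{children} =
            [ grep { $_ != $child } @{ $parent->{children} } ];
        push @{ $child->{children} }, $parent;
        $parent->{parent} = $child;
    }
    $new_root->{parent} = undef;
    my @old_to_new = reverse @path;
    return @old_to_new;
}

# Walk from the old root to the new root; each node takes the support
# value of its new ancestor. The new root ends up with no value.
sub shift_support {
    my @path = @_;                              # old root first
    for my $node (@path) {
        my $anc = $node->{parent};
        $node->{support} = $anc ? $anc->{support} : undef;
    }
}

# Serialise to Newick without branch lengths.
sub to_newick {
    my ($node) = @_;
    my $s = @{ $node->{children} }
        ? '(' . join(',', map { to_newick($_) } @{ $node->{children} }) . ')'
        : $node->{name};
    $s .= $node->{support} if defined $node->{support};
    return $s;
}

# (A,(B,C)68,D); with the support value stored on the parent of B and C.
my $x    = new_node(undef, 68, new_node('B'), new_node('C'));
my $root = new_node(undef, undef, new_node('A'), $x, new_node('D'));

my $before    = to_newick($root) . ';';   # (A,(B,C)68,D);
my @path      = reroot($x);
my $unshifted = to_newick($x) . ';';      # (B,C,(A,D))68;  value stuck on the root
shift_support(@path);
my $after     = to_newick($x) . ';';      # (B,C,(A,D)68);  value back on the A/D branch
print "$before\n$unshifted\n$after\n";
```

Walking from the old root towards the new root matters on longer 
paths: each node reads its ancestor's value before that ancestor is 
overwritten in the next step.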

Best wishes,

Bank

Jason Stajich wrote:
>
> On May 10, 2007, at 9:13 AM, Bank Beszteri wrote:
>
>> Dear Bioperl folks,
>>
>> I'm trying to use Bio::Tree::Tree for manipulating phylogenetic trees, 
>> but in some things it did not behave as I expected it to, so I had to 
>> look inside a bit.
>> In particular, I had problems with mixed up bootstrap values after 
>> re-rooting. After looking into the Bio::Tree::Tree data structures, it 
>> seems that
>>
>> a) bootstrap values are stored as attributes of nodes of the tree [to my 
>> understanding, they should rather be attributes of branches but 
>> Bio::Tree::Tree apparently tries to simplify away branches]; each node 
>> stores the bootstrap value belonging to the branch that connects it to 
>> its ancestor node (I'm reading in trees from Newick strings, and 
>> bootstrap values arrive in the id fields of internal branches)
>
> Please feel free to suggest an alternative implementation if you don't 
> agree with the object model.    It has worked quite well in our hands 
> so I'd be all ears for someone wanting to get in and do some more work 
> on it.
>
> We have answered the question as to why bootstrap values are internal 
> ids many times on this list and I believe on the wiki -- the parser 
> can't tell the difference between a node id and a bootstrap value 
> because nexus uses the same slot for both.  If you know you have 
> bootstrap values in the internal nodes it is trivial to process your 
> tree and copy the values over.
>
>
> for my $node ( grep { ! $_->is_Leaf } $tree->get_all_nodes ) {
>  $node->bootstrap($node->id); 
>  $node->id('');
> }
>
> I just added this as a method to TreeFunctionI so that it can be 
> easily called now to help satisfy everyone who hopes that the toolkit 
> can guess whether the internal nodes are bootstraps or identifiers.
>
>
>>
>> b) when re-rooting a tree, bootstrap values stay with the same node 
>> where they were before. Because the node that used to be the ancestor of 
>> a particular node in the original tree might have become its descendant 
>> after re-rooting, the bootstrap values are being mixed up.
>>
>> Can you confirm my conclusion? Whether yes or no, have you got an easy 
>> workaround or alternative solution to re-rooting trees (without having 
>> to touch the reroot method) or any other hints that could be useful for 
>> me to deal with this issue?
>>
>
> I think you are right, but I am not clear what the value should be for 
> the internal node attached to the root now.
>
> Note that it is always helpful to provide example code illustrating your 
> problem.  Here is an example which I think illustrates your problem.
>
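> use strict;
> use warnings;
> use Bio::TreeIO;
>
> # Read a Newick tree from the DATA section below.
> my $in = Bio::TreeIO->new(-format => 'newick',
>                           -fh     => \*DATA);
> my $out = Bio::TreeIO->new(-format => 'newick');
> while ( my $t = $in->next_tree ) {
>     # $a is reserved for sort(), so use a descriptive name instead.
>     my ($leaf_a) = $t->find_node(-id => 'A');
>     $out->write_tree($t);    # tree as parsed
>     $t->reroot($leaf_a);
>     $out->write_tree($t);    # tree after rerooting at A
> }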
> __DATA__
> (((A:5,B:5)90:2,C:4)25:3,D:10);
>
>
>> Cheers,
>>
>> Bank
>>
>>
>>
>> --
>> Dr. Bánk Beszteri
>> Alfred Wegener Institute for Polar and Marine Research
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>




