[Bioperl-l] Bootstrap, root, reroot...

Mark A. Jensen maj at fortinbras.us
Thu Jul 9 18:02:01 UTC 2009


Hi Tristan--
Would you enter this in bugzilla? I did an overhaul of the root/reroot a while
back, and maybe you're running into some stuff I need to check out. 
Thanks a lot-
Mark
----- Original Message ----- 
From: "Tristan Lefebure" <tristan.lefebure at gmail.com>
To: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Thursday, July 09, 2009 11:50 AM
Subject: [Bioperl-l] Bootstrap, root, reroot...


> Hello,
> 
> I have been bumping into problems while rerooting trees that 
> contain bootstrap scores. Basically, after re-rooting the 
> tree, some scores end up at the wrong place (i.e. node) and 
> some nodes lose their score. I found this thread from Bank 
> Beszteri, back in 2007, that explains exactly the same 
> problem:
> 
> http://lists.open-bio.org/pipermail/bioperl-l/2007-May/025599.html
> 
> I attach a script that reproduces the bug and implements the 
> fix that Bank described (at least this is my understanding, 
> and it works on this example):
> 
> 
> #! /usr/bin/perl
> 
> use strict;
> use warnings;
> use Bio::TreeIO;
> 
> 
> my $in = Bio::TreeIO->new(-format => 'newick',
>    -fh => \*DATA,
>    -internal_node_id => 'bootstrap');
>    
> my $out = Bio::TreeIO->new(-format => 'newick', -file => ">out.tree");
> 
> while( my $t = $in->next_tree ){
>    my $old_root = $t->get_root_node();
>    my ($b) = $t->find_node(-id =>"B");
>    my $b_anc = $b->ancestor;
>    $out->write_tree($t);
> 
>    # reroot with B -> wrong, and the tree is kind of weird
>    $t->reroot($b);
>    $out->write_tree($t);
> 
>    # reroot with B's ancestor -> wrong
>    $t->reroot($b_anc);
>    $out->write_tree($t);
>    
>    # a fix, following Bank Beszteri's description: walk from the old
>    # root up to the new root, moving each score to the new child
>    my $node = $old_root;
>    while (my $anc_node = $node->ancestor) {
>       $node->bootstrap($anc_node->bootstrap());
>       $anc_node->bootstrap('');
>       $node = $anc_node;
>    }
>    $out->write_tree($t); # -> good this time
> }
> 
> 
> __DATA__
> (A:52,(B:46,C:50)68:11,D:70);
> 
> 
> Here is the output:
> 
> (A:52,(B:46,C:50)68:11,D:70);
> ((C:50,(A:52,D:70):11)68:46)B;
> (B:46,C:50,(A:52,D:70):11)68;
> (B:46,C:50,(A:52,D:70)68:11);
> 
> 
> Trees #2 and #3 have the score 68 moved to the wrong node, 
> while tree #4 is OK. (BTW, tree #2 is really weird: unless 
> B is the real ancestor (a fossil?), it does not make much 
> sense to me.)
> 
> My understanding is that the problem is linked to the 
> well-known difficulty of distinguishing node labels from 
> branch labels in newick trees. Bootstrap scores are branch 
> attributes, not node attributes, but since Bio::TreeI has no 
> branch/edge/bipartition object they are attached to a node, 
> and in fact reflect the bootstrap score of the ancestral 
> branch leading to that node. Trouble naturally comes when 
> you deal with an unrooted tree or reroot a tree: a 
> child can become an ancestor and, if the bootstrap score 
> is not moved from the old child to the new child, it will 
> end up attached to the wrong node.
> 
> I see several fixes for that:
> 
> 1- incorporate Bank's fix into the reroot() method, i.e. if 
> there are bootstrap scores, after re-rooting, the ones on the 
> path from the old root to the new root should be moved to 
> the right nodes.
> 
> 2- modify the way trees are stored in bioperl to incorporate 
> a branch/edge/bipartition object, and move the bootstrap 
> scores to it. That won't be easy and will break many 
> things...
> 
> 
> What do you think?
> 
> --Tristan
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
>
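
For reference, fix 1 from the quoted message can be sketched without BioPerl at all. This is only a toy model under stated assumptions, not the actual Bio::Tree reroot() code: nodes are plain hashes, and node(), build_tree(), to_newick(), reroot() and the $move_support flag are hypothetical names made up for this illustration. 'len' and 'support' are treated as attributes of the branch leading to a node's parent, so when a parent link is flipped they have to move to the new child.

```perl
use strict;
use warnings;

# Toy node (NOT a BioPerl object): 'len' and 'support' describe the
# branch that connects this node to its parent.
sub node {
    my (%args) = @_;
    my $n = { id => '', len => undef, support => '', kids => [], parent => undef, %args };
    $_->{parent} = $n for @{ $n->{kids} };
    return $n;
}

# Build (A:52,(B:46,C:50)68:11,D:70); returns (root, ancestor of B).
sub build_tree {
    my $b_anc = node(len => 11, support => 68,
        kids => [ node(id => 'B', len => 46), node(id => 'C', len => 50) ]);
    my $root = node(kids => [ node(id => 'A', len => 52), $b_anc, node(id => 'D', len => 70) ]);
    return ($root, $b_anc);
}

sub to_newick {
    my ($n) = @_;
    my $s = @{ $n->{kids} }
        ? '(' . join(',', map { to_newick($_) } @{ $n->{kids} }) . ')'
        : '';
    $s .= $n->{id} . $n->{support};
    $s .= ':' . $n->{len} if defined $n->{len};
    return $s;
}

# Reroot by flipping every parent link between the new root and the old root.
# Branch lengths always travel with the flipped branch; support values do so
# only when $move_support is true (the behaviour proposed as fix 1).
sub reroot {
    my ($new_root, $move_support) = @_;
    my @path = ($new_root);
    push @path, $path[-1]{parent} while $path[-1]{parent};

    # Start at the old root and work down, so each child's values are
    # still intact when they are copied to its former parent.
    for my $i (reverse 1 .. $#path) {
        my ($child, $parent) = @path[ $i - 1, $i ];
        $parent->{kids} = [ grep { $_ != $child } @{ $parent->{kids} } ];
        push @{ $child->{kids} }, $parent;
        $parent->{parent}  = $child;
        $parent->{len}     = $child->{len};
        $parent->{support} = $child->{support} if $move_support;
    }
    $new_root->{parent}  = undef;
    $new_root->{len}     = undef;
    $new_root->{support} = '' if $move_support;
    return $new_root;
}

my $original = to_newick((build_tree())[0]) . ';';
my $naive    = to_newick(reroot((build_tree())[1], 0)) . ';';
my $fixed    = to_newick(reroot((build_tree())[1], 1)) . ';';
print "$_\n" for $original, $naive, $fixed;
```

With the support left on its node, the toy model reproduces tree #3 from the quoted output (68 stranded on the new root). With the support moved along the flipped branch it reproduces tree #4.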


