[Bioperl-l] Bootstrap, root, reroot...

Tristan Lefebure tristan.lefebure at gmail.com
Thu Jul 9 19:18:39 UTC 2009


I just had a quick look at the reroot() function of TreeFunctionsI, and it
looks like what should be done for the bootstrap scores is what is
already done for the branch lengths. See this loop starting at line 954:

    # reverse the ancestor & children pointers
    my $former_anc = $tmp_node->ancestor;
    my @path_from_oldroot = ($self->get_lineage_nodes($tmp_node), $tmp_node);
    for (my $i = 0; $i < @path_from_oldroot - 1; $i++) {
        my $current = $path_from_oldroot[$i];
        my $next = $path_from_oldroot[$i + 1];
        $current->remove_Descendent($next);
        $current->branch_length($next->branch_length);
        $next->add_Descendent($current);
    }

It makes sense to me to treat bootstrap and branch length in a similar way:
the branch lengths are stored inside the node object but, like the bootstrap
scores, they really are branch attributes... No?
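
To make the idea concrete, here is a minimal sketch on a toy tree built from
plain Perl hashes. Nothing in it is BioPerl code: new_node, add_child,
remove_child, reroot_toy and to_newick are hypothetical helpers written only
for this illustration. The one point it tries to show is that the loop can
copy the bootstrap score along the reversed path exactly as it copies the
branch length.

```perl
use strict;
use warnings;

# Toy node: a plain hash, NOT a Bio::Tree::Node (assumption for illustration).
sub new_node {
    my (%args) = @_;
    return { id => undef, branch_length => undef, bootstrap => undef,
             parent => undef, children => [], %args };
}

sub add_child {
    my ($parent, $child) = @_;
    push @{ $parent->{children} }, $child;
    $child->{parent} = $parent;
    return $child;
}

sub remove_child {
    my ($parent, $child) = @_;
    @{ $parent->{children} } = grep { $_ != $child } @{ $parent->{children} };
    $child->{parent} = undef;
}

# Make $new_root the root by reversing every edge on the path from the old root.
sub reroot_toy {
    my ($new_root) = @_;
    my @path = ($new_root);                  # will hold: old root ... new root
    while (my $p = $path[0]{parent}) { unshift @path, $p }
    for my $i (0 .. $#path - 1) {
        my ($current, $next) = @path[$i, $i + 1];
        remove_child($current, $next);
        # Both values describe the edge current--next, so both move to the
        # end of that edge that is about to become the child.
        $current->{$_} = $next->{$_} for qw(branch_length bootstrap);
        add_child($next, $current);
    }
    # The new root has no ancestral branch left.
    $new_root->{$_} = undef for qw(branch_length bootstrap);
    return $new_root;
}

# Minimal newick writer: internal label is the bootstrap, leaf label is the id.
sub to_newick {
    my ($node) = @_;
    my @kids = @{ $node->{children} };
    my $s = @kids ? '(' . join(',', map { to_newick($_) } @kids) . ')' : '';
    $s .= $node->{bootstrap} // $node->{id} // '';
    $s .= ':' . $node->{branch_length} if defined $node->{branch_length};
    return $s;
}

# Same tree as in the test script: (A:52,(B:46,C:50)68:11,D:70);
my $root = new_node();
add_child($root, new_node(id => 'A', branch_length => 52));
my $bc = add_child($root, new_node(bootstrap => 68, branch_length => 11));
add_child($bc, new_node(id => 'B', branch_length => 46));
add_child($bc, new_node(id => 'C', branch_length => 50));
add_child($root, new_node(id => 'D', branch_length => 70));

my $before = to_newick($root) . ';';
reroot_toy($bc);                             # reroot on the ancestor of B
my $after = to_newick($bc) . ';';
print "$before\n$after\n";
```

On this toy tree the second line printed is (B:46,C:50,(A:52,D:70)68:11);
that is, the 68 stays on the branch separating (B,C) from (A,D), which is
what tree #4 in the script quoted below shows. Whether the same two-line
change is enough inside the real reroot() would need checking against the
BioPerl code.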

-Tristan

On Thu, Jul 9, 2009 at 2:30 PM, Tristan Lefebure <tristan.lefebure at gmail.com> wrote:

> Done. bug #2877.
> -Tristan
>
> On Thursday 09 July 2009 14:02:01 Mark A. Jensen wrote:
> > Hi Tristan--
> > Would you enter this in bugzilla? I did an overhaul of
> > the root/reroot a while back, and maybe you're running
> > into some stuff I need to check out. Thanks a lot-
> > Mark
> > ----- Original Message -----
> > From: "Tristan Lefebure" <tristan.lefebure at gmail.com>
> > To: "BioPerl List" <bioperl-l at lists.open-bio.org>
> > Sent: Thursday, July 09, 2009 11:50 AM
> > Subject: [Bioperl-l] Bootstrap, root, reroot...
> >
> > > Hello,
> > >
> > > I have been running into problems while rerooting trees
> > > that contain bootstrap scores. Basically, after
> > > rerooting the tree, some scores end up at the wrong
> > > place (i.e. node) and some nodes lose their score. I
> > > found this thread from Bank Beszteri, back in 2007, that
> > > describes exactly the same problem:
> > >
> > > http://lists.open-bio.org/pipermail/bioperl-l/2007-May/025599.html
> > >
> > > I attach a script that reproduces the bug and
> > > implements the fix that Bank described (at least this
> > > is my understanding, and it works on this example):
> > >
> > >
> > > #! /usr/bin/perl
> > >
> > > use strict;
> > > use warnings;
> > > use Bio::TreeIO;
> > >
> > >
> > > my $in = Bio::TreeIO->new(-format => 'newick',
> > >    -fh => \*DATA,
> > >    -internal_node_id => 'bootstrap');
> > >
> > > my $out = Bio::TreeIO->new(-format => 'newick', -file => ">out.tree");
> > >
> > > while( my $t = $in->next_tree ){
> > >    my $old_root = $t->get_root_node();
> > >    my ($b) = $t->find_node(-id =>"B");
> > >    my $b_anc = $b->ancestor;
> > >    $out->write_tree($t);
> > >
> > >    # reroot with B -> wrong, and the tree is kind of weird
> > >    $t->reroot($b);
> > >    $out->write_tree($t);
> > >
> > >    # reroot with B's ancestor -> wrong
> > >    $t->reroot($b_anc);
> > >    $out->write_tree($t);
> > >
> > >    # a fix, following Bank Beszteri's description: starting at the
> > >    # old root, copy each ancestor's score onto the node below it
> > >    my $node = $old_root;
> > >    while (my $anc_node = $node->ancestor) {
> > >        $node->bootstrap($anc_node->bootstrap());
> > >        $anc_node->bootstrap('');
> > >        $node = $anc_node;
> > >    }
> > >    $out->write_tree($t); # -> good this time
> > > }
> > >
> > >
> > > __DATA__
> > > (A:52,(B:46,C:50)68:11,D:70);
> > >
> > >
> > > Here is the output:
> > >
> > > (A:52,(B:46,C:50)68:11,D:70);
> > > ((C:50,(A:52,D:70):11)68:46)B;
> > > (B:46,C:50,(A:52,D:70):11)68;
> > > (B:46,C:50,(A:52,D:70)68:11);
> > >
> > >
> > > Trees #2 and #3 have the score 68 moved to the wrong
> > > node, while tree #4 is OK. (BTW, tree #2 is really
> > > weird: unless B is the real ancestor (a fossil?),
> > > it does not make much sense to me.)
> > >
> > > My understanding is that the problem is linked to
> > > the well-known difficulty of telling node labels from
> > > branch labels in newick trees. Bootstrap scores are
> > > branch attributes, not node attributes, but since
> > > Bio::TreeI has no branch/edge/bipartition object they
> > > are attached to a node, and in fact reflect the
> > > bootstrap score of the ancestral branch leading to that
> > > node. Trouble naturally comes when you are dealing with
> > > an unrooted tree or rerooting a tree: a child can become
> > > an ancestor and, if the bootstrap score is not moved
> > > from the old child to the new child, it will end up
> > > attached at the wrong place (i.e. the wrong node).
> > >
> > > I see several possible fixes:
> > >
> > > 1- Incorporate Bank's fix into the reroot() method, i.e.
> > > if there are bootstrap scores, after rerooting, the ones
> > > on the path from the old to the new root should be
> > > moved to the right nodes.
> > >
> > > 2- Modify the way trees are stored in bioperl to
> > > incorporate a branch/edge/bipartition object, and move
> > > the bootstrap scores to it. That won't be easy and
> > > will break many things...
> > >
> > >
> > > What do you think?
> > >
> > > --Tristan