[Bioperl-l] Unwise elimination of nodes in B:T:Node::remove_Descendent?

Hilmar Lapp hlapp at gmx.net
Fri Feb 6 14:21:31 UTC 2009


On Feb 6, 2009, at 12:49 AM, Mark A. Jensen wrote:

> A   B   C   D   E
> |5  |5  |4  |4  |
> \   /   \   /   /
>  x       y    /
>  |2      |1  /
>  \       /  /
>   \    _/  / 10
>    \  /   /
>     z    /
>     |3  /
>      \ /
>      ROOT
>
>> [...]
>> __DATA__
>> (((A:5,B:5)x:2,(C:4,D:4)y:1)z:3,E:10);
>> [...]
>
> The first problem was that the "complementary approaches" didn't give
> the same answer, and one threw an error. This is the bug in the
> code above. If you look at the tree, the nodes he really wants to  
> keep are
> the leaves [A,B,E], *plus* the internal nodes [x,y,z]; that is...
>
> $tree->splice(-keep_id=>[A,B,E,x,'y',z])
> $tree->total_branch_length
>
> doesn't throw and returns 25, which is correct.

If you replace 'y' with the root in the above, yes. Otherwise no.

> The second problem is that removing [C, D] also gives him 25, which is
> what he wanted, but is not correct.

Correct.

> [...]The problem arises in this code at the end of remove_Descendent:
>
>  # remove unecessary nodes if we have removed the part |||Node.pm  
> LINE 272
>  # which branches.
>    my $a1 = $self->ancestor;
>    if( $a1 ) {
>        my $bl = $self->branch_length || 0;
>        my @d = $self->each_Descendent;
>        if (scalar @d == 1) {
>     $d[0]->branch_length($bl + ($d[0]->branch_length || 0));
>     $a1->add_Descendent($d[0]);
>     $a1->remove_Descendent($self);
>        }
>    }
>
> When node C is removed from the example tree, the node 'y' is removed
> by this code apparently as a convenience.

I guess I'm confused by the above code snippet. If @d are the  
descendants of $a1, then I don't understand what the purpose of adding  
$d[0] as a descendant of $a1 is (after altering its branch length).  
Furthermore, if $a1 is the ancestor of $self, and $a1 has only one  
descendant, wouldn't that mean that $d[0] == $self?

What am I missing?

I also think it's a bad idea to simplify the tree (or collapse  
internal nodes of degree 1) as an implicit operation. It should be  
explicit (and I believe there is a method simplify() or something  
similar, isn't there? Ah - I see you quote contract_linear_paths()).

Furthermore, I think it's also a bad idea to remove descendant leaf  
nodes if you remove an internal node. What if you really just wanted  
to remove the internal node because, for example, its branching point  
isn't well supported? So removing node 'y' should make z a node of  
degree 3, but not remove C and D unless you ask to remove the subtree  
beginning at 'y'.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================






More information about the Bioperl-l mailing list