[Bioperl-l] Bio::DB::Taxonomy, Bio::Tree, and how to combine trees

Tristan Lefebure tristan.lefebure at gmail.com
Fri Jan 25 03:07:52 UTC 2008


Hi,

I'm just starting to play with Bio::DB::Taxonomy and Bio::Tree, and I would
like to merge several single-leaf taxonomic trees into one taxonomic tree with
several leaves. For example:

#####BEGINNING#####
#! /usr/bin/perl

use strict;
use warnings;
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;   # used directly below, so load it explicitly
use Bio::TreeIO;

# The taxonomic database
# You might want to switch to a different flatfile or to Entrez
my $dbh = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                 -directory => '/tmp',
                                 -nodesfile => '/home/tristan/Documents/db/NCBI/taxonomy/nodes.dmp',
                                 -namesfile => '/home/tristan/Documents/db/NCBI/taxonomy/names.dmp');

# Fetch 4 taxa for the example
my $tax_decapoda =  $dbh->get_taxon(-name => 'Decapoda');
my $tax_heteroptera =  $dbh->get_taxon(-name => 'Heteroptera');
my $tax_coleoptera =  $dbh->get_taxon(-name => 'Coleoptera');
my $tax_copepoda =  $dbh->get_taxon(-name => 'Copepoda');

# Wrap each taxon in a tree object (one lineage per tree)
my $decapoda_tree    = Bio::Tree::Tree->new(-node => $tax_decapoda);
my $heteroptera_tree = Bio::Tree::Tree->new(-node => $tax_heteroptera);
my $coleoptera_tree  = Bio::Tree::Tree->new(-node => $tax_coleoptera);
my $copepoda_tree    = Bio::Tree::Tree->new(-node => $tax_copepoda);

# Reduce the number of nodes to the following ranks
my @ranks = qw(kingdom phylum subphylum superclass class subclass superorder 
order family);

$decapoda_tree->splice(-keep_rank => \@ranks);
$heteroptera_tree->splice(-keep_rank => \@ranks);
$coleoptera_tree->splice(-keep_rank => \@ranks);
$copepoda_tree->splice(-keep_rank => \@ranks);

# Write the trees to four.tree in Newick format
my $out = Bio::TreeIO->new(-format => 'newick',
                           -file   => ">four.tree");
$out->write_tree($decapoda_tree);
$out->write_tree($heteroptera_tree);
$out->write_tree($coleoptera_tree);
$out->write_tree($copepoda_tree);

#####END#######

This gives the following "trees":
(((((7524)33340)50557)6960)6656)33208;
(((((7041)33340)50557)6960)6656)33208;
((((((6683)6682)72041)6681)6657)6656)33208;
((((6830)72037)6657)6656)33208;

They are rather special trees, since each one contains only a single leaf. I
would like to combine them and remove the 'unused' nodes (those left with only
one child) to obtain something like this:

((7524,7041)33340,(6683,6830)6657)6656;

or even better:

((Hemiptera,Coleoptera)Neoptera,(Decapoda,Copepoda)Crustacea)Arthropoda;
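
To make the goal concrete, here is a rough sketch of the operation in plain
Perl, with no BioPerl calls at all. It assumes each tree has already been
flattened into a root-to-leaf list of taxon IDs (copied by hand from the four
trees above). merge_lineages and to_newick are hypothetical helper names, not
BioPerl methods, and the ID-to-name table is simply read off the two target
trees above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Root-to-leaf lineages, copied by hand from the four single-leaf trees.
my @lineages = (
    [qw(33208 6656 6960 50557 33340 7524)],
    [qw(33208 6656 6960 50557 33340 7041)],
    [qw(33208 6656 6657 6681 72041 6682 6683)],
    [qw(33208 6656 6657 72037 6830)],
);

# Merge the lineages into one nested structure. Each node keeps its
# children in a hash plus a list recording the order they were first seen,
# so shared prefixes collapse into a single path.
sub merge_lineages {
    my @paths = @_;
    my $root = { kids => {}, order => [] };   # virtual node above all roots
    for my $path (@paths) {
        my $node = $root;
        for my $id (@$path) {
            unless ($node->{kids}{$id}) {
                $node->{kids}{$id} = { kids => {}, order => [] };
                push @{ $node->{order} }, $id;
            }
            $node = $node->{kids}{$id};
        }
    }
    return $root;
}

# Serialise to Newick. A node with exactly one child is skipped, which
# removes the 'unused' nodes; leaves and branching nodes keep their label.
sub to_newick {
    my ($label, $node) = @_;
    my @ids = @{ $node->{order} };
    return $label unless @ids;                                       # leaf
    return to_newick($ids[0], $node->{kids}{$ids[0]}) if @ids == 1;  # skip
    return '(' . join(',', map { to_newick($_, $node->{kids}{$_}) } @ids)
         . ')' . $label;
}

my $merged = merge_lineages(@lineages);
my $newick = to_newick('', $merged) . ';';
print "$newick\n";   # ((7524,7041)33340,(6683,6830)6657)6656;

# Hypothetical lookup table; a real script would ask the taxonomy database.
my %name_of = (
    7524 => 'Hemiptera', 7041 => 'Coleoptera', 33340 => 'Neoptera',
    6683 => 'Decapoda',  6830 => 'Copepoda',   6657  => 'Crustacea',
    6656 => 'Arthropoda',
);
(my $named = $newick) =~ s/(\d+)/$name_of{$1} || $1/ge;
print "$named\n";    # ((Hemiptera,Coleoptera)Neoptera,(Decapoda,Copepoda)Crustacea)Arthropoda;
```

This only shows the shape of the result. It works on bare ID lists rather
than Bio::Tree::Tree objects, so it does not answer how to do the same thing
with the tree objects themselves.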

Any suggestions?

Thanks!

-Tristan



