[Bioperl-l] Bio::*Taxonomy* changes
Hilmar Lapp
hlapp at gmx.net
Tue Jul 25 09:47:47 EDT 2006
On Jul 25, 2006, at 3:05 AM, Sendu Bala wrote:
> [...]
> ## the fully-manual way
> my $species = new Bio::Species;
> my $node = new Bio::Taxonomy::Node(-name => 'Saccharomyces cerevisiae',
>                                    -rank => 'species', -object_id => 1,
>                                    -parent_id => 2);
If this is meant as an example for the use cases I enumerated, then
you wouldn't have the parent_id from a Genbank file. However, you
didn't have that before either, so no problem.
> my $n2 = new Bio::Taxonomy::Node(-name => 'Saccharomyces',
> -object_id => 2, -parent_id => 3);
> # (no assumption that 'Saccharomyces' is the genus, so rank()
> # undefined)
I think in a confident parse you want to assign 'genus' when there is
little doubt, as with 'Saccharomyces cerevisiae'. I am not sure
whether there are odd viruses whose names look innocuous but in
reality do not follow the binomial convention.
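A conservative version of such a check could be sketched as follows. This
is plain Perl for illustration only, not BioPerl code: the helper name
guess_genus_rank and its rules (exactly two words, a capitalised first
word equal to the parent node's name, a lowercase epithet) are
assumptions, and real-world names will certainly include cases it
handles poorly.

```perl
use strict;
use warnings;

# Hypothetical helper: return 'genus' only when the species name looks like
# a plain binomial whose first word is the parent node's name. In every
# other case return nothing, so that rank() would stay undefined.
sub guess_genus_rank {
    my ($species_name, $parent_name) = @_;
    my @words = split ' ', $species_name;
    return unless @words == 2;                   # rejects 'Tobacco mosaic virus'
    return unless $words[0] eq $parent_name;     # parent must be the first word
    return unless $words[0] =~ /^[A-Z][a-z]+$/;  # capitalised genus
    return unless $words[1] =~ /^[a-z]+$/;       # lowercase epithet
    return 'genus';
}

my $rank = guess_genus_rank('Saccharomyces cerevisiae', 'Saccharomyces');
print defined $rank ? "$rank\n" : "undefined\n";
```

Anything that fails the check is left without a rank rather than being
guessed at, which matches the "little doubt" condition above.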
> my $n3 = [etc]
> $species->add_node($node);
> $species->add_node($n2);
I know why you are doing this, but people seeing this will hit a
mental snag. You should take Chris' refusal to see the sense in this
as an indication that many people down the road won't see the sense
either.
So instead, make the logical model in your design more obvious, which
I think ultimately will help maintainability as well. For example:
    my $taxonomy = Bio::Taxonomy->new();
    my $node = new Bio::Taxonomy::Node(-name => 'Saccharomyces cerevisiae',
                                       -rank => 'species', -object_id => 1,
                                       -parent_id => 2);
    my $n2 = new Bio::Taxonomy::Node(-name => 'Saccharomyces',
                                     -object_id => 2, -parent_id => 3);
    $taxonomy->add_node($node);
    $taxonomy->add_node($n2);
    my $species = Bio::Species->new(-lineage => $taxonomy);
    print $species->binomial();
    print $species->genus();
    # this may trigger a lookup if a taxonomy db handle has been set, e.g.:
    # $taxonomy->db_handle(Bio::DB::Taxonomy->new(-source => 'entrez'));
    print $species->classification();
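
To make the division of labour concrete, here is a toy sketch in plain
Perl with no BioPerl dependency. The packages Toy::Taxonomy and
Toy::Species and all of their methods are hypothetical stand-ins, not
the proposed Bio::Taxonomy or Bio::Species API. In particular, passing
the species' own node id to the constructor is an assumption of the
sketch. The only point is that the taxonomy owns the nodes and the
parent links, while the species object merely asks it questions.

```perl
use strict;
use warnings;

# Toy stand-in for the taxonomy: owns the nodes, indexed by id.
package Toy::Taxonomy;

sub new {
    my ($class) = @_;
    my $self = { nodes => {} };
    return bless $self, $class;
}

sub add_node {
    my ($self, %node) = @_;    # expects id and name; rank, parent_id optional
    $self->{nodes}{ $node{id} } = {%node};
    return $self;
}

# Walk parent_id links upwards from the given id. Stops at the first node
# that has not been added (this sketch does no database lookup) and guards
# against cycles.
sub lineage {
    my ($self, $id) = @_;
    my (@names, %seen);
    while (defined $id && !$seen{$id}++) {
        my $node = $self->{nodes}{$id} or last;
        push @names, $node->{name};
        $id = $node->{parent_id};
    }
    return @names;
}

# Toy stand-in for the species: holds a taxonomy plus the id of its own node.
package Toy::Species;

sub new {
    my ($class, %args) = @_;
    return bless { taxonomy => $args{taxonomy}, id => $args{id} }, $class;
}

sub classification {
    my ($self) = @_;
    return $self->{taxonomy}->lineage($self->{id});
}

package main;

my $taxonomy = Toy::Taxonomy->new;
$taxonomy->add_node(id => 1, name => 'Saccharomyces cerevisiae',
                    rank => 'species', parent_id => 2);
$taxonomy->add_node(id => 2, name => 'Saccharomyces', parent_id => 3);

my $species = Toy::Species->new(taxonomy => $taxonomy, id => 1);
print join('; ', $species->classification), "\n";
```

With only two nodes added, the walk returns the two names and stops at
the missing parent with id 3, which is where a real implementation
could fall back to a db handle.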
> [etc]
>
> ## Using a factory without db access
> # assume that Bio::Taxonomy::GenbankFactory implements
> # some modified Bio::Taxonomy::FactoryI
> my $factory = Bio::Taxonomy::GenbankFactory->new();
> my $species = $factory->generate(-classification =>
>     ['Saccharomyces cerevisiae', 'Saccharomyces',
>      'Saccharomycetaceae' ...]);
> # the generate() method above just does the fully-manual way for you
Except the method name would be create_object(), the parameter would
be a hash ref, and the return value would be a Bio::TaxonomyI
compliant object:
    my $taxonomy = $factory->create_object({-classification =>
        ['Saccharomyces cerevisiae', 'Saccharomyces',
         'Saccharomycetaceae' ...]});
    my $species = Bio::Species->new(-lineage => $taxonomy);
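
What such a create_object() might do internally can be sketched in plain
Perl. This is a guess for illustration, not the Bio::Taxonomy::FactoryI
contract: the function here is a free-standing stand-in, the returned
hash ref of nodes merely plays the role of a Bio::TaxonomyI object, and
the sequential ids and the assumption that the first name is the species
are made up.

```perl
use strict;
use warnings;

# Hypothetical factory routine: takes a hash ref with a -classification
# array ref (most specific name first) and returns a hash ref of nodes
# keyed by made-up sequential ids.
sub create_object {
    my ($params) = @_;
    my @names = @{ $params->{-classification} || [] };
    my %nodes;
    for my $i (0 .. $#names) {
        my $id = $i + 1;
        $nodes{$id} = {
            name      => $names[$i],
            # each name's parent is the next name; the last one has none
            parent_id => $i < $#names ? $id + 1 : undef,
        };
    }
    # assumption of this sketch: the first entry is the species
    $nodes{1}{rank} = 'species' if @names;
    return \%nodes;
}

my $taxonomy = create_object({
    -classification => ['Saccharomyces cerevisiae', 'Saccharomyces',
                        'Saccharomycetaceae'],
});
printf "%d: %s\n", $_, $taxonomy->{$_}{name}
    for sort { $a <=> $b } keys %$taxonomy;
```

The caller never sees ids or parent links being assigned, which is all
that "does the fully-manual way for you" needs to mean.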
>
> ## Using a factory with db access
> # assume that Bio::Taxonomy::EntrezFactory implements some
> # modified Bio::Taxonomy::FactoryI and uses Bio::DB::Taxonomy::entrez
> # to get the nodes
> my $factory = Bio::Taxonomy::EntrezFactory->new();
The logic for deciding where to do a lookup should not be duplicated
here. It belongs only under Bio::DB::Taxonomy::*.
> my $species = $factory->fetch(-scientific_name =>
>     'Saccharomyces cerevisiae');
Likewise, use the methods defined in Bio::DB::Taxonomy, and again,
the return type is Bio::Taxonomy, which you would pass to
Bio::Species->new().
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================