[Bioperl-l] Bio::*Taxonomy* changes

Hilmar Lapp hlapp at gmx.net
Mon Jul 24 15:56:02 EDT 2006


On Jul 24, 2006, at 3:24 PM, Chris Fields wrote:

>
>> Hilmar Lapp wrote:
>>> Sounds good to me, except there is no Bio::TaxonomyI yet,
>>
>> Indeed, I propose making one.
>
> So, Node would implement this, correct?

No -

> Naming it Bio::TaxonomyI makes me think that Bio::Taxonomy implements
> TaxonomyI, not that Bio::Taxonomy::Node implements it.

I'd suppose so.

>> Yes, which is why Bio::Taxonomy is appropriate here. Assuming that
>> Bio::Species isa Bio::TaxonomyI:
>>
>> ...
>> SOURCE      Saccharomyces cerevisiae (baker's yeast)
>>     ORGANISM  Saccharomyces cerevisiae
>>               Eukaryota; Fungi; Ascomycota; Saccharomycotina;
>>               Saccharomycetes; Saccharomycetales; Saccharomycetaceae;
>>               Saccharomyces.
>>
>> ...
>>
>> ## the fully-manual way
>> my $species = new Bio::Species;
>> my $node = new Bio::Taxonomy::Node(-name => 'Saccharomyces cerevisiae',
>>                                     -rank => 'species',
>>                                     -object_id => 1,
>>                                     -parent_id => 2);
>> my $n2 = new Bio::Taxonomy::Node(-name => 'Saccharomyces',
>>                                   -object_id => 2, -parent_id => 3);
>> # (no assumption that 'Saccharomyces' is the genus, so rank() undefined)
>> my $n3 = [etc]
>> $species->add_node($node);
>> $species->add_node($n2);
>> [etc]
>
>
> Hrmm... why would you add multiple nodes to a species object?  A
> Species is-a Node, not a full Bio::Taxonomy.

No. See above: Bio::Species is-a Bio::Taxonomy.

> Taxonomy has-a Node (hence the add_node() method).  So, you should be
> able to add a NodeI-implementing object to a Taxonomy object (either a
> Node or a Species).

Let's keep Bio::Species and Taxonomy::Node separate. They look as if
they represent something similar, but once you look at the Bio::Species
API (and a GenBank record) you realize they do not. Bio::Species is
more like an entire lineage and the species node, all flattened out
into one.

I'm not sure Bio::Species would need to implement a Bio::TaxonomyI
interface; it might just as well use an implementation of it
internally. I'm not sure how Sendu wants to design this, but for sure
Bio::Taxonomy::Node should not be a Bio::Species, and the reverse
is better avoided too.
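
A rough sketch of what "use an implementation internally" could look
like. The My::* packages and their methods are made up for illustration
and are not the real Bio::Species or Bio::Taxonomy API; the point is only
that the species object has-a lineage container rather than being one.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a lineage container (not Bio::Taxonomy itself).
package My::Taxonomy;
sub new      { my ($class) = @_; return bless { nodes => [] }, $class; }
sub add_node { my ($self, $node) = @_; push @{ $self->{nodes} }, $node; return $self; }
sub nodes    { my ($self) = @_; return @{ $self->{nodes} }; }

# Hypothetical species class: it holds a taxonomy internally and
# delegates to it, instead of inheriting from it.
package My::Species;
sub new { my ($class) = @_; return bless { taxonomy => My::Taxonomy->new }, $class; }

# Legacy-style accessor. Setting it discards the old lineage and
# rebuilds the internal container from the given names.
sub classification {
    my ($self, @names) = @_;
    if (@names) {
        $self->{taxonomy} = My::Taxonomy->new;
        $self->{taxonomy}->add_node({ name => $_ }) for @names;
    }
    return map { $_->{name} } $self->{taxonomy}->nodes;
}

package main;
my $species = My::Species->new;
$species->classification('cerevisiae', 'Saccharomyces', 'Saccharomycetaceae');
print join('; ', $species->classification), "\n";
```

Clients keep calling classification() as before, and the internal
container can be swapped out without touching the legacy interface.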


>> [..]
>> The way to do it is to have the Bio::DB::Taxonomy* modules return only
>> the information that a Bio::Taxonomy::FactoryI would need to make a
>> NodeI. The specific Factory that you use could generate whatever type
>> of Node you wanted.
>
> Yes, using an object factory here makes a lot of sense, returning the
> correct object type based on the rank.

Well, I don't think you'd want to create instances of different node  
classes depending on the rank of the node. However, a particular  
factory implementation may of course be free to do exactly that.
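
To illustrate the distinction, a minimal sketch with two factories
behind the same create_node() call: one always builds the same node
class, the other chooses a class by rank. All My::* names and the
create_node() method are assumptions for this example, not an existing
Bio::Taxonomy::FactoryI interface.

```perl
use strict;
use warnings;

# Hypothetical plain node class.
package My::Node;
sub new  { my ($class, %args) = @_; return bless {%args}, $class; }
sub name { $_[0]{name} }
sub rank { $_[0]{rank} }

# Hypothetical specialised node, used only by the rank-aware factory.
package My::SpeciesNode;
our @ISA = ('My::Node');

# Default factory: every record becomes a My::Node, whatever its rank.
package My::NodeFactory;
sub new         { my ($class) = @_; return bless {}, $class; }
sub create_node { my ($self, %data) = @_; return My::Node->new(%data); }

# Alternative factory: free to pick a different class for species rank.
package My::RankAwareFactory;
our @ISA = ('My::NodeFactory');
sub create_node {
    my ($self, %data) = @_;
    my $is_species = defined $data{rank} && $data{rank} eq 'species';
    my $class = $is_species ? 'My::SpeciesNode' : 'My::Node';
    return $class->new(%data);
}

package main;
my %record = (name => 'Saccharomyces cerevisiae', rank => 'species');
for my $factory (My::NodeFactory->new, My::RankAwareFactory->new) {
    my $node = $factory->create_node(%record);
    print ref($factory), ' -> ', ref($node), "\n";
}
```

The database module only hands over the raw record; which class comes
back is entirely the factory's decision.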

> ...
>> Bio::Species differs from Bio::Taxonomy only in that it contains all
>> the legacy method names that Bio::Species currently has, for backward
>> compatibility. Setting $species->classification() would delete all
>> nodes of self, use a GenbankFactory to make a new Bio::Species, then
>> pull out all its Nodes and add them to self.
>
> The idea is to replace Bio::Species with something that works well, so
> having it implement a Node-like interface works since it is-a Node.
> Having it implement a Taxonomy-like interface, though, doesn't make a
> lot of sense as a species is-not-a Taxonomy.  It should act just like a
> fancier node object.

No, I'd really recommend against muddling up a taxonomy node model  
with the Bio::Species legacy model.

Bio::Species is not a node at all. You may argue it's not a taxonomy
either. This is just one more reason to contain the contagious
Bio::Species disease of conflating disjoint concepts into one.

>
> Using a factory in Bio::DB::Taxonomy should solve any issues about what
> object type is returned, since that could simply be made based on the
> rank itself (species rank or below == Bio::Taxonomy::Species, genus and
> above == Bio::Taxonomy::Node).

Bio::Taxonomy::Species was an invention of mine and - if created -
should not be used for anything other than representing a taxonomy
node as a Bio::Species object iff necessary (i.e., if the client
really wants a Bio::Species object).
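
As a sketch of that narrow role, an adapter could wrap a node and
flatten its lineage on demand. The My::* classes, the parent() link and
the classification() method here are hypothetical and only show the
shape of the idea, not how Bio::Taxonomy::Species would actually be
written.

```perl
use strict;
use warnings;

# Hypothetical node with a link to its parent node.
package My::Node;
sub new    { my ($class, %args) = @_; return bless {%args}, $class; }
sub name   { $_[0]{name} }
sub parent { $_[0]{parent} }

# Hypothetical adapter: presents one node as a flattened, species-style
# lineage. The node model itself stays free of any species methods.
package My::NodeAsSpecies;
sub new { my ($class, $node) = @_; return bless { node => $node }, $class; }

# Walk from the wrapped node up through its parents, collecting names.
sub classification {
    my ($self) = @_;
    my @names;
    for (my $n = $self->{node}; defined $n; $n = $n->parent) {
        push @names, $n->name;
    }
    return @names;
}

package main;
my $genus = My::Node->new(name => 'Saccharomyces');
my $sp    = My::Node->new(name => 'Saccharomyces cerevisiae', parent => $genus);
my @lineage = My::NodeAsSpecies->new($sp)->classification;
print join('; ', @lineage), "\n";
```

Only clients that ask for the species-style view ever see the adapter;
everything else keeps working with plain nodes.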

I'd actually like to see what Sendu would come up with. It sounds at  
the very minimum like an excellent start.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================






