[Bioperl-l] Bio::*Taxonomy* changes
Sendu Bala
bix at sendu.me.uk
Tue Jul 18 03:27:49 EDT 2006

Hilmar Lapp wrote:
> I don't think we should differ from NCBI in places where the  
> connection between a method name and the NCBI data file is obvious or  
> otherwise we will confuse people and send them into traps.
> 
> $node->scientific_name() should simply report what NCBI reports. For  
> simple species this will be identical to what $node->binomial()  
> returns, but for others it may not, e.g., strains, varieties, etc or  
> the weird world of viri and bacteria.

OK, this certainly seems to be the consensus, so I'll abide by it.

> This will also absolve us from retaining the business logic for how  
> to construct the scientific name from genus, species, and possibly  
> strain or whatever.

What about the existing genus(), species(), sub_species() and variant()
methods? There would no longer be any need for logic to join things
together, but I would still like to be able to get just 'sapiens' from
somewhere. Can I use species() for that purpose (though, again, the
species is strictly 'Homo sapiens')? Likewise, sub_species() and
variant() could hold the remaining non-redundant names. Or should all of
these be deprecated because they don't really have a place in a generic
Node class?

What about node_name()? Should it be yet another synonym of
scientific_name()? (Right now it grabs the common name(s).) Ugh.

What should I do with the classification array? Should it hold the raw
ScientificName, like:
    join(', ', $node->classification) eq 'Homo sapiens, Homo, Homo/Pan/Gorilla group [...]'
Or should it be like:
    join(', ', $node->classification) eq 'sapiens, Homo, Homo/Pan/Gorilla group [...]'
The latter is how it currently works (when it works correctly); I would 
rather fix it than lose the logic completely, but if we're staying true 
to proper classification (vs. what a programmer might expect), I guess I 
must use the raw ScientificName?
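
To make the two options concrete, here is a minimal sketch. It does not use the real Bio::Taxonomy::Node internals: the @lineage structure, the rank values, and the helper names classification_raw() and classification_stripped() are all hypothetical, chosen only to show what each form of the array would contain.

```perl
use strict;
use warnings;

# Hypothetical lineage, leaf first. The hash layout and the rank values
# are assumptions for illustration only.
my @lineage = (
    { rank => 'species', scientific_name => 'Homo sapiens' },
    { rank => 'genus',   scientific_name => 'Homo' },
    { rank => 'no rank', scientific_name => 'Homo/Pan/Gorilla group' },
);

# Option 1: report the scientific name of every node unchanged.
sub classification_raw {
    my @nodes = @_;
    return map { $_->{scientific_name} } @nodes;
}

# Option 2: strip the genus prefix from the species-level name so that
# the first element is just the epithet ('sapiens').
sub classification_stripped {
    my @nodes = @_;
    my ($genus) = map  { $_->{scientific_name} }
                  grep { $_->{rank} eq 'genus' } @nodes;
    return map {
        my $name = $_->{scientific_name};
        $name =~ s/^\Q$genus\E\s+//
            if defined $genus && $_->{rank} eq 'species';
        $name;
    } @nodes;
}

print join(', ', classification_raw(@lineage)),      "\n";
print join(', ', classification_stripped(@lineage)), "\n";
```

The second form needs to know which node is the genus, which is exactly the kind of business logic the raw form would let us drop.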
> binomial() isn't part of the NCBI taxonomy definition, so you have  
> freedom there to report what suits you.

I don't think binomial() would serve any useful purpose now, however. I
can either deprecate it, make it a synonym of scientific_name(), or
both. Alternatively, binomial() could be a version of scientific_name()
that complains if you use it on a rank higher or lower than species. As
for species() et al., they may have no place in a generic Node class.
Thoughts?
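
A rough sketch of the "complaining" variant of binomial(), to show what I have in mind. The package My::TaxonNode and its constructor are hypothetical stand-ins, not the actual Node class, and using carp() for the complaint is just one possible choice.

```perl
use strict;
use warnings;

package My::TaxonNode;    # hypothetical stand-in for the real Node class
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub rank            { $_[0]{rank} }
sub scientific_name { $_[0]{scientific_name} }

# Return the scientific name only for species-rank nodes; otherwise
# warn and return nothing.
sub binomial {
    my $self = shift;
    my $rank = $self->rank;
    unless (defined $rank && $rank eq 'species') {
        carp "binomial() called on a node of rank '"
            . (defined $rank ? $rank : 'unknown') . "'";
        return;
    }
    return $self->scientific_name;
}

package main;

my $human = My::TaxonNode->new(rank => 'species', scientific_name => 'Homo sapiens');
my $genus = My::TaxonNode->new(rank => 'genus',   scientific_name => 'Homo');

print $human->binomial, "\n";
```

Calling $genus->binomial here would emit a warning and return an empty list (undef in scalar context), while scientific_name() would keep working for any rank.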
    
    