[Bioperl-l] getting/setting species names with Bio::Species

Mark A. Jensen maj at fortinbras.us
Fri Jan 15 11:10:02 EST 2010


excellent summary--thanks!!
----- Original Message ----- 
From: "Chris Fields" <cjfields at illinois.edu>
To: "Mark A. Jensen" <maj at fortinbras.us>
Cc: "BioPerl List" <bioperl-l at lists.open-bio.org>
Sent: Friday, January 15, 2010 11:00 AM
Subject: Re: [Bioperl-l] getting/setting species names with Bio::Species


>> FWIW, I'd prefer "binomial" = "genus" . "species"
>
>
> That's the way Bio::Species is supposed to work, at least when it was 
> refactored by Sendu.  But just a note: Bio::Species was considered deprecated 
> (scheduled for the 1.7 release IIRC) for many very good reasons in favor of 
> Bio::Taxon.  First and foremost among these is the fact we cannot consistently 
> parse out the genus/species/strain/variant/etc for every organism in GenBank 
> w/o knowing its full lineage, which means including some taxonomic 
> information.  And even then it's highly problematic.
>
> We've had several heated discussions on list about how to handle this in a 
> somewhat backwards-compatible way, and the main solution was to forgo 
> compatibility altogether and eventually deprecate Bio::Species 
> in favor of Bio::Taxon, a class that doesn't make the same 
> assumptions.  Bio::Species, in the interim, is-a Bio::Taxon.  You'll note that 
> a minimal Bio::DB::Taxonomy instance is constructed from the classification 
> scheme in some instances, but if one had a proper DB link one could link to 
> Entrez Taxonomy or a local flat-file index DB and grab the info.  Bio::Taxon 
> (correct me if I'm wrong on this Sendu, if you're out there) eschews various 
> methods (species, etc) for simpler consistent ones based on Taxonomy, and 
> doesn't force us to handle every exception to getting the genus/species out of 
> a name.  That is left up to the user, at their peril.
>
> For either one, if you are reproducing the fully qualified name, you probably 
> should use something like node_name() for consistency.  Bio::Species also has 
> scientific_name().  With a true Bio::Taxon one would need to check that this 
> is performed on the species node.
>
> chris
>
> On Jan 15, 2010, at 9:31 AM, Mark A. Jensen wrote:
>
>> I'm not that familiar with Bio::Species either, but this looks
>> like conflicting semantics between Bio::Species and Bio::SeqIO.
>> Bio::SeqIO sets the species accessor to the 'species' element of
>> the lineage array, I believe.
>> FWIW, I'd prefer "binomial" = "genus" . "species"
>> MAJ
>> ----- Original Message ----- 
>> From: "Dave Messina" <David.Messina at sbc.su.se>
>> To: "BioPerl List" <bioperl-l at lists.open-bio.org>
>> Sent: Friday, January 15, 2010 10:17 AM
>> Subject: [Bioperl-l] getting/setting species names with Bio::Species
>>
>>
>>> Hi everybody,
>>>
>>> I'm having a little trouble with names in Bio::Species objects.
>>>
>>> According to the Bio::Species documentation, if I have a species name as a 
>>> string, like "Homo sapiens", I can get and set that using the species 
>>> method:
>>>
>>> use Bio::Species;
>>>
>>> my $my_species_obj = Bio::Species->new();
>>> $my_species_obj->species('Homo sapiens');
>>>
>>> print $my_species_obj->species;     # 'Homo sapiens'
>>>
>>>
>>> That works fine if I create the Bio::Species object myself.
>>>
>>> But if I try to get that string back out from a Bio::Species object created 
>>> by SeqIO from a GenBank file, I get just 'sapiens' back:
>>>
>>> use Bio::SeqIO;
>>>
>>> my $io = Bio::SeqIO->new('-format' => 'genbank',
>>>                        '-file'   => 'hoxa2.gb');
>>> my $seq_obj = $io->next_seq;
>>> my $io_species_obj = $seq_obj->species;
>>>
>>> print $io_species_obj->species;     # 'sapiens'
>>>
>>>
>>> I think that happens because GenBank records have more taxonomic info about 
>>> the species name, like the genus (and in fact the whole taxonomic 
>>> categorization: kingdom, phylum, order, etc.). So the genus is stored 
>>> separately.
>>>
>>> Poking around a bit more in Bio::Species, I turned up the method 'binomial', 
>>> which appears to do the right thing, returning genus and species in both 
>>> cases. Except, as you can see, the space is stripped out for my 
>>> species-name-is-just-a-string object:
>>>
>>> print $my_species_obj->binomial;    # 'Homosapiens'
>>> print $io_species_obj->binomial;    # 'Homo sapiens'
>>>
>>>
>>> I'm not very familiar with Bio::Species (and its parent Bio::Taxon); am I 
>>> using it correctly above, or is there a better way?
>>>
>>> If not, this kinda looks like a bug to me. I've got a patch which works and 
>>> passes the BioPerl test suite.
>>>
>>>
>>> Thanks,
>>> Dave
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
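
For archive readers, the binomial = genus + species convention discussed above can be sketched in plain Perl. This is only an illustration under stated assumptions: binomial_from_parts() is a hypothetical helper, not part of Bio::Species or Bio::Taxon, and it assumes the genus and species arrive as simple strings. It joins the two with a single space and leaves a species string alone when it already begins with the genus, so both cases from Dave's message come out the same.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT a BioPerl method): build "Genus species" from
# separately stored parts, tolerating a species string that already
# carries the genus.
sub binomial_from_parts {
    my ($genus, $species) = @_;
    $genus   = '' unless defined $genus;
    $species = '' unless defined $species;

    # No genus set, or the species field already holds the full binomial:
    # return the species string unchanged.
    return $species if $genus eq '' || $species =~ /^\Q$genus\E\s+\S/;

    # Otherwise join genus and species with a single space.
    return $species eq '' ? $genus : "$genus $species";
}

print binomial_from_parts('Homo', 'sapiens'), "\n";       # parts stored separately
print binomial_from_parts('Homo', 'Homo sapiens'), "\n";  # species already a binomial
print binomial_from_parts(undef,  'Homo sapiens'), "\n";  # no genus set
```

All three calls print the same string, "Homo sapiens". As Chris notes, real GenBank organism names (strains, variants, and so on) will not always split this cleanly, so a helper like this is no substitute for node_name() or scientific_name() on the species node.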
