[Bioperl-l] Species name validation problem

Hilmar Lapp hlapp at gmx.net
Mon Mar 27 18:29:40 UTC 2006


I agree. Can you file this on Bugzilla as a feature request, basically
copying and pasting your email below?

On Mar 27, 2006, at 10:24 AM, David Waner wrote:

> Yes, I meant to type Bio::Species, not Bio::Seq. Sorry for the
> confusion.
>
> My problem is that I am not calling $species->classification()
> directly; I am calling Bio::Species->new(), which in turn calls
> classification(), which calls validate_species_name(), which then
> throws an exception on some species names.  As far as I can see, there
> is no way to turn off this (over-aggressive) validation in the Species
> constructor.
>
> I guess that instead of this:
>
> 	$species = Bio::Species->new(-classification => \@classificationArray);
>
> I could do this:
>
> 	$species = Bio::Species->new();
> 	$species->classification(\@classificationArray, 'no validation');
>
> but it would make a nicer interface to have a validation option in the
> Species constructor.
>
> - David
>
> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp at gmx.net]
> Sent: Friday, March 24, 2006 9:42 PM
> To: David Waner
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Species name validation problem
>
>
> The option would be in Bio::Species, not Bio::Seq. You can circumvent
> the name validation by passing an array ref to
> $species->classification() and anything that evaluates to true as the
> second argument. This is for instance what the genbank parser does
> (which doesn't mean that it is always correct); supposedly the
> swissprot parser ought to do the same.
>
>    -hilmar
>
> On 3/24/06, David Waner <dwaner at scitegic.com> wrote:
>> I have found that Bio::Seq->new() throws exceptions on some "species"
>> names containing special characters, or consisting of a single letter,
>> e.g.:
>>
>>         SwissProt: POLN_ONNVG   O'nyong-nyong virus
>>         SwissProt: FIBP_ADE1H   Human adenovirus 15/H9
>>         SwissProt: POLG_FMDVZ   Foot-and-mouth disease virus (strain A22/550 Azerbaijan 65)
>>         SwissProt: RIR1_BHV1C   Bovine herpesvirus 1.1
>>         SwissProt: SODF_METJ    Methylomonas J
>>         GenBank: AJ416726               Stylosanthes aff. calcicola
>>
>> It seems that the regex in validate_species_name() is too restrictive,
>> but I can't find a way to turn off validation without editing bioperl
>> modules.  There has been some recent discussion of this issue on the
>> mailing list (see below).  Does anyone know if or when a
>> -validate_species option to Bio::Seq->new() will be added? Or should I
>> just propose the code change?
>>
>> Thanks,
>>   David Waner
>>
>>
>>> Stefan Kirov skirov at utk.edu
>>> Wed Sep 21 08:46:05 EDT 2005
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Thanks for the great answer, Hilmar!
>>> I would prefer to have some kind of a check if the user wishes so.
>>> For example, the Entrezgene file contains HTML tags in the species
>>> names of some entries, which is good to know.
>>> I will put an option -validate_species in the constructor to turn
>>> the check on and off. Maybe a species filter can be of some use as
>>> well, though you can just select the correct file from the NCBI
>>> site.... Thanks again! Stefan
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
>
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------
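
For reference, the constructor option discussed in this thread could look
roughly like the sketch below. It is a self-contained illustration and does
not use or reproduce BioPerl code: SpeciesSketch is a hypothetical stand-in
class, the strict pattern is an invented placeholder rather than the regex
in validate_species_name(), and the names in the example array are
illustrative only. The option name -validate_species is the one Stefan
proposed; the true second argument to classification() mirrors the
workaround Hilmar described.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; this is NOT Bio::Species.
package SpeciesSketch;

# Deliberately strict placeholder pattern (an assumption, not BioPerl's
# regex): letters only, at least two of them.
my $STRICT_NAME = qr/^[A-Za-z]{2,}$/;

sub new {
    my ($class, %args) = @_;
    my $self = bless { classification => [] }, $class;

    # Validation stays on unless the caller explicitly passes a false value.
    my $validate = exists $args{-validate_species}
                 ? $args{-validate_species}
                 : 1;

    if (my $names = $args{-classification}) {
        $self->classification($names, !$validate);
    }
    return $self;
}

# Mirrors the two-argument call from the thread: a true second
# argument skips the name check.
sub classification {
    my ($self, $names, $force) = @_;
    if ($names) {
        if (!$force) {
            for my $name (@$names) {
                die "Invalid species name '$name'\n"
                    unless $name =~ $STRICT_NAME;
            }
        }
        $self->{classification} = [@$names];
    }
    return @{ $self->{classification} };
}

package main;

my @classification = ("O'nyong-nyong virus", 'Alphavirus', 'Togaviridae');

# Default behaviour: the apostrophe, hyphen and space trip the strict pattern.
my $strict = eval { SpeciesSketch->new(-classification => \@classification) };
print "strict:  ", ($strict ? "ok\n" : "rejected: $@");

# With the proposed option the same names are accepted unchanged.
my $lenient = SpeciesSketch->new(
    -classification   => \@classification,
    -validate_species => 0,
);
print "lenient: ", join(', ', $lenient->classification), "\n";
```

Defaulting the option to "on" keeps existing callers unaffected, while
parsers that already trust their input can opt out in the constructor
instead of making a second classification() call.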