[Bioperl-l] problem while parsing UniProt(ltaxon.pm)

Chris Fields cjfields at uiuc.edu
Thu Mar 29 14:18:42 UTC 2007


On Mar 29, 2007, at 8:41 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Here's one accession which reproduces this: Q7Y720.  There is an  
>> additional component to the error that I find:
>> Use of uninitialized value in pattern match (m//) at /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm line 1060, <GEN0> line 13.
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia, Heteroconchia, Veneroida, Veneroidea, Veneridae, Venerupis, Ruditapes, Venerupis' had two non-consecutive nodes with the same name. Can't cope!
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/bioperl-live/Bio/Root/Root.pm:359
>> STACK: Bio::DB::Taxonomy::list::add_lineage /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:157
>> STACK: Bio::DB::Taxonomy::list::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:94
>> STACK: Bio::DB::Taxonomy::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy.pm:103
>> STACK: Bio::Species::classification /Users/cjfields/src/bioperl-live/Bio/Species.pm:180
>> STACK: Bio::SeqIO::swiss::_read_swissprot_Species /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:1073
>> STACK: Bio::SeqIO::swiss::next_seq /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:247
>> STACK: tax.pl:11
>> -----------------------------------------------------------
>> The problem appears to be with the OS (source organism) line in
>> swiss files, which looks like it is being parsed incorrectly for
>> these entries.  Here is the relevant section:
>> OS   Venerupis (Ruditapes) philippinarum.
>> OG   Mitochondrion.
>> A UniProt query limited to taxonomy using 'Venerupis' produces  
>> several more.  This only affects swissprot; embl and genbank files  
>> with similar source lines do not have the same problem.
>
> Thanks. I've made a tentative fix to swiss.pm. The only problem
> might be that common names/descriptions don't get caught on some
> strange OS lines. I don't have enough experience of OS lines to
> know what they might look like.
>
> Still, at least there won't be thrown exceptions, which some users  
> may prefer ;)
>
> I'll add tests later if and when Ambrose/yourself confirm all is
> well.

I'm getting it to parse but there is a '.' appended to the  
scientific_name():

Venerupis (Ruditapes) philippinarum.

which appears in the classification:

Venerupis (Ruditapes) philippinarum.; Ruditapes; Venerupis;  
Veneridae; Veneroidea; Veneroida; Heteroconchia; Bivalvia; Mollusca;  
Metazoa; Eukaryota;
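
For illustration only, here is a minimal sketch of stripping the terminating period from an OS line before the name is used. It is written in Python rather than Perl and is not the actual swiss.pm code; the helper name parse_os_line is hypothetical, and it assumes a single-line OS entry like the one above (multi-line OS entries and common-name handling are not covered).

```python
import re

def parse_os_line(line):
    """Return the organism name from a single Swiss-Prot OS line.

    Hypothetical helper for illustration; not part of BioPerl.
    """
    # Drop the 'OS' line code and the whitespace that follows it.
    name = re.sub(r"^OS\s+", "", line.strip())
    # Remove the terminating period so it does not end up in the name.
    return re.sub(r"\.\s*$", "", name)

print(parse_os_line("OS   Venerupis (Ruditapes) philippinarum."))
```

With the period removed at this stage, the name would reach the classification without the trailing '.'.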

chris




