[Bioperl-l] problem while parsing UniProt(ltaxon.pm)

Chris Fields cjfields at uiuc.edu
Thu Mar 29 12:12:43 UTC 2007


>> Here you are with the error message
>>
>> Q0QAY1_9DIPT
>> Q0QAY7_9DIPT
>> Q0QB51_9DIPT
>> Q0QB52_9DIPT
>> Q0QB62_9DIPT
>> Q0QB63_9DIPT
>>
>> ------------- EXCEPTION  -------------
>> MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia,
>> Heteroconchia, Veneroida, Veneroidea, Veneridae, Venerupis,
>> Ruditapes, Venerupis' had two non-consecutive nodes with the same
>> name. Can't cope!
>> STACK Bio::DB::Taxonomy::list::add_lineage
>> /usr/local/ActivePerl/site/lib/Bio/DB/Taxonomy/list.pm:157
>
> Please send me the actual record that causes the exception and I'll
> see what I can do about fixing the problem.

Sendu,

Here's one accession that reproduces this: Q7Y720.  I also see a
warning just before the exception:

Use of uninitialized value in pattern match (m//) at
/Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm line 1060, <GEN0> line 13.

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The lineage 'Eukaryota, Metazoa, Mollusca, Bivalvia,
Heteroconchia, Veneroida, Veneroidea, Veneridae, Venerupis,
Ruditapes, Venerupis' had two non-consecutive nodes with the same
name. Can't cope!
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/bioperl-live/Bio/Root/Root.pm:359
STACK: Bio::DB::Taxonomy::list::add_lineage /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:157
STACK: Bio::DB::Taxonomy::list::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy/list.pm:94
STACK: Bio::DB::Taxonomy::new /Users/cjfields/src/bioperl-live/Bio/DB/Taxonomy.pm:103
STACK: Bio::Species::classification /Users/cjfields/src/bioperl-live/Bio/Species.pm:180
STACK: Bio::SeqIO::swiss::_read_swissprot_Species /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:1073
STACK: Bio::SeqIO::swiss::next_seq /Users/cjfields/src/bioperl-live/Bio/SeqIO/swiss.pm:247
STACK: tax.pl:11
-----------------------------------------------------------
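
To make the condition in the exception message concrete, here is a
minimal Python sketch that looks for a name occurring at two
non-adjacent positions in a lineage. The function name
non_consecutive_duplicates is hypothetical, and this is not the code in
Bio::DB::Taxonomy::list; it only illustrates the kind of check the
message describes, using the lineage string from the exception above.

```python
def non_consecutive_duplicates(lineage):
    """Return names that appear at two or more non-adjacent positions.

    Hypothetical helper for illustration only; not BioPerl's implementation.
    """
    last_seen = {}   # name -> index of its most recent occurrence
    clashes = []
    for index, name in enumerate(lineage):
        previous = last_seen.get(name)
        # Adjacent repeats (index - previous == 1) are not reported.
        if previous is not None and index - previous > 1 and name not in clashes:
            clashes.append(name)
        last_seen[name] = index
    return clashes


# Lineage copied from the exception message.
LINEAGE = ("Eukaryota, Metazoa, Mollusca, Bivalvia, Heteroconchia, "
           "Veneroida, Veneroidea, Veneridae, Venerupis, Ruditapes, Venerupis")
names = [part.strip() for part in LINEAGE.split(",")]

print(non_consecutive_duplicates(names))
```

In this lineage 'Venerupis' sits on both sides of 'Ruditapes', which is
the pattern the exception complains about.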

The problem appears to be with the OS (source organism) line in swiss
files, which looks like it is being parsed incorrectly for these
records.  Here is the relevant section:

OS   Venerupis (Ruditapes) philippinarum.
OG   Mitochondrion.
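
The OS value here has a parenthesized name between the genus and the
species, which is how a subgenus is conventionally written. As a rough
Python sketch of one way such a value could be split into its parts
(the pattern and the name split_os_name are hypothetical, and this is
not what swiss.pm does), assuming a plain "Genus (Subgenus) species."
layout with no trailing common name in parentheses:

```python
import re

# Hypothetical pattern: genus, optional "(Subgenus)", then species epithet.
OS_PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?:\((?P<subgenus>[A-Z][a-z]+)\)\s+)?"
    r"(?P<species>[a-z-]+)\.?$"
)


def split_os_name(os_value):
    """Split an OS value into (genus, subgenus, species).

    Returns None when the value does not fit the assumed layout;
    subgenus is None when no parenthesized name is present.
    """
    match = OS_PATTERN.match(os_value.strip())
    if match is None:
        return None
    return match.group("genus"), match.group("subgenus"), match.group("species")


print(split_os_name("Venerupis (Ruditapes) philippinarum."))
```

Values that carry a common name after the species, or strain and
isolate details, would need a more general pattern than this one.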

A UniProt query limited to taxonomy using 'Venerupis' produces
several more such records.  This only affects swissprot; embl and
genbank files with similar source lines do not have the same problem.

chris





