[BioSQL-l] Re: gene ontology questions (bug)

Raphael A. Bauer Raphael.Bauer at informatik.hu-berlin.de
Thu Sep 18 08:36:13 EDT 2003


Hi,
I have the same problem as Marc, and I wonder whether there is a solution yet.

Command is:
perl load_ontology.pl --host localhost --dbname bioseqdbspgo --dbuser rb \
  --driver Pg --namespace "Gene Ontology" --format goflat \
  --fmtargs "-defs_file,GO.defs" \
  function.ontology process.ontology component.ontology

Output is:
Parsing input ...
Loading ontology Gene Ontology:
         ... terms
Could not store GO:0001529 (elastin):

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be 
found by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK (eval) load_ontology.pl:489
STACK toplevel load_ontology.pl:471

--------------------------------------


Quite strange...
My Bio* packages are all the latest releases (BioPerl 1.2.2).
For GO I use the files released September 16, 2003.

I think the problem is in the GO.defs file:

term: elastin
goid: GO:0001528
definition: OBSOLETE. A major structural protein of mammalian connective 
tissues; composed of one third glycine, and also rich in proline, 
alanine, and valine. Chains are cross-linked together via lysine residues.
definition_reference: ISBN:0198506732
comment: This term was made obsolete because it represents a gene 
product. To update annotations, use the molecular function term 
'extracellular matrix constituent conferring elasticity activity ; 
GO:0030023'.

term: elastin
goid: GO:0001529
definition: OBSOLETE (was not defined before being made obsolete).
definition_reference: GO:mah
comment: This term was made obsolete because it represents a gene 
product. To update annotations, use the molecular function term 
'extracellular matrix constituent conferring elasticity activity ; 
GO:0030023'.

The name "elastin" appears twice here. (Many other term names seem to be
duplicated as well, for example collagen.)
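To see how many names are affected, something like the following could list every term name that occurs under more than one GO id. This is only a rough sketch: it assumes each record in GO.defs has a "term:" line followed by a "goid:" line, as in the excerpt above, and the function name find_duplicate_term_names is made up for illustration.

```python
from collections import defaultdict


def find_duplicate_term_names(lines):
    """Map each term name that occurs more than once to the list of its GO ids."""
    ids_by_name = defaultdict(list)
    current_name = None
    for line in lines:
        line = line.strip()
        if line.startswith("term:"):
            # Remember the name until the matching goid: line shows up.
            current_name = line[len("term:"):].strip()
        elif line.startswith("goid:") and current_name is not None:
            ids_by_name[current_name].append(line[len("goid:"):].strip())
            current_name = None
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


# Shortened copy of the two records quoted above.
sample = """\
term: elastin
goid: GO:0001528
definition: OBSOLETE.

term: elastin
goid: GO:0001529
definition: OBSOLETE (was not defined before being made obsolete).
""".splitlines()

duplicates = find_duplicate_term_names(sample)
for name, goids in sorted(duplicates.items()):
    print(name, goids)
```

For the real file, the lines could be read with open("GO.defs") instead of the inline sample.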

And the definition of the term table forbids two terms with the same name in
the same ontology (term_name_key is unique):

Indexes: term_pkey primary key btree (term_id),
          term_identifier_key unique btree (identifier),
          term_name_key unique btree (name, ontology_id),
          term_ont btree (ontology_id)
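The effect of term_name_key can be reproduced in isolation. The sketch below uses an in-memory SQLite database instead of PostgreSQL, and a cut-down term table with only the columns named in the indexes above, so it is an illustration of the constraint and not the real BioSQL schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the term table: only the indexed columns.
conn.execute(
    "CREATE TABLE term ("
    " term_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " identifier TEXT,"
    " ontology_id INTEGER NOT NULL,"
    " UNIQUE (identifier),"
    " UNIQUE (name, ontology_id))"
)

insert = "INSERT INTO term (name, identifier, ontology_id) VALUES (?, ?, ?)"
conn.execute(insert, ("elastin", "GO:0001528", 1))

try:
    # Same name, same ontology, different identifier: violates (name, ontology_id).
    conn.execute(insert, ("elastin", "GO:0001529", 1))
    second_insert_failed = False
except sqlite3.IntegrityError as exc:
    second_insert_failed = True
    print("second insert rejected:", exc)

stored_rows = conn.execute("SELECT COUNT(*) FROM term").fetchone()[0]
```

The identifiers differ, so term_identifier_key is satisfied; it is the (name, ontology_id) pair that blocks the second row.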

(Marc already mentioned this...)

A dirty workaround would be to rename terms in GO.defs wherever two names
are identical (keep one as elastin and rename the other to "elastin CHANGED"
or similar).
But is there any recommendation on how to handle the problem safely?
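For what it is worth, the renaming could be scripted roughly as below. This is a sketch of the dirty workaround only, not a recommended fix: it assumes "term:" is always followed by "goid:" within a record, it appends the GO id to every repeated name so that the result stays unique, and the function name disambiguate_term_names is made up. Whether renaming in GO.defs alone is enough depends on how the goflat parser matches definitions to the terms in the .ontology files, which this does not address.

```python
def disambiguate_term_names(lines):
    """Return a copy of lines in which repeated term names get their GO id appended."""
    out = list(lines)
    seen = set()
    term_index = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("term:"):
            # Remember where the name is; the id only appears on the next goid: line.
            term_index = i
        elif stripped.startswith("goid:") and term_index is not None:
            name = lines[term_index].strip()[len("term:"):].strip()
            goid = stripped[len("goid:"):].strip()
            if name in seen:
                # Second or later occurrence: make the name unique.
                out[term_index] = f"term: {name} ({goid})"
            seen.add(name)
            term_index = None
    return out


records = [
    "term: elastin",
    "goid: GO:0001528",
    "",
    "term: elastin",
    "goid: GO:0001529",
]

renamed = disambiguate_term_names(records)
print("\n".join(renamed))
```

The first occurrence keeps its name, so which of the two obsolete terms ends up as plain "elastin" depends only on file order.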

Thanks a lot...

Raphael


