[BioSQL-l] gene ontology questions revisited

Hilmar Lapp hlapp at gnf.org
Fri Sep 19 13:37:28 EDT 2003


On 9/19/03 5:51 AM, "Daniel Lang" <daniel.lang at biologie.uni-freiburg.de>
wrote:

> But another one occurred while loading the data:
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::DBLinkAdaptor (driver) failed, values were
> ("MetaCyc","2-PYRONE-4\,6-DICARBOXYLATE-LACTONASE-RXN","0") FKs ()
> ERROR:  value too long for type character varying(40)
> ---------------------------------------------------

The problem here is that the references for GO terms are modeled as DBXrefs
with dbname and accession. This sometimes applies quite well, but often the
reference in the GO.defs file is used in a far wider sense. In the example
above, for instance, the reference is in fact to a term in another ontology
(MetaCyc), so it should be a term relationship rather than a reference.

So, what you're seeing is the result of deficiencies in the flat file
representation (a term reference can be any of literature reference, dbxref,
or ontology term) and consequently in the parser (which doesn't try to be
smarter than the flat file representation).

Unfortunately that assessment doesn't help you much. What I did locally (I
obviously ran into the same problem) was to widen the accession column in
dbxref to 64 chars, which I thought was a somewhat reasonable compromise. (You
don't want to open it up completely and water down the relational model just
because a certain flat file format is deficient in its expressivity.) This
doesn't fix the problem that something ends up as a dbxref when it should
rather be a term relationship.
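
To see ahead of loading which references would hit the column limit, a check
along these lines might help. This is only a sketch under assumptions: the
helper name oversized_accessions and the sample list are made up for
illustration, the width of 40 is taken from the error message above, and 64 is
the locally widened value. Only the MetaCyc entry comes from the warning; the
EC entry is an illustrative short accession, not taken from GO.defs.

```python
# Assumed column widths: 40 from the error message, 64 from the local change.
ORIGINAL_WIDTH = 40
WIDENED_WIDTH = 64


def oversized_accessions(dbxrefs, width):
    """Return the (dbname, accession) pairs whose accession exceeds width."""
    return [(db, acc) for db, acc in dbxrefs if len(acc) > width]


# Hypothetical sample. The MetaCyc accession is copied as it appears in the
# warning, including the backslash escape before the comma.
sample = [
    ("MetaCyc", r"2-PYRONE-4\,6-DICARBOXYLATE-LACTONASE-RXN"),
    ("EC", "1.1.1.1"),
]

too_long_before = oversized_accessions(sample, ORIGINAL_WIDTH)
too_long_after = oversized_accessions(sample, WIDENED_WIDTH)

for db, acc in too_long_before:
    print(f"{db}:{acc} exceeds {ORIGINAL_WIDTH} chars ({len(acc)})")
```

Note that a check like this only tells you which values would not fit. It says
nothing about whether an entry is really a dbxref or a term in another
ontology, which is the underlying modeling problem.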

Anyone else got a good idea here? I'm cc'ing the bioperl list since this is
rather an issue of the object-space representation than one of the schema.

    -hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------




