[BioSQL-l] more consistency

Hilmar Lapp hlapp at gnf.org
Wed Mar 12 01:15:11 EST 2003


On Wednesday, March 12, 2003, at 12:51  AM, Yves Bastide wrote:

> By the way, a rationale for the Singapore changes would be great for 
> us users.

Aaron has written a nice document describing the schema and some of the 
ideas behind it for the audience of a 'small lab' that wants to store 
and manage their sequences. I don't think he emphasized the Singapore 
changes and the rationale behind them though. Aaron?

> E. g., why the split between ontology and (ontology_)term?

Category being a loop back to term was poor design that just happened 
to work nicely despite being poor :)

Ontology is the namespace for a term, which really is not a term, even 
though sometimes it resembles one (and you could even think about an 
ontology of ontologies). We collectively decided that creating a new 
table instead of re-using bionamespace was the 'right' thing to do.

>   Why do reference use dbxref (as a one to one relationship, so one 
> cannot store both Pubmed and Medline ids)? Etc.
>

Good point actually. The reason they now have a FK to dbxref was that 
upon discussing what would be the proper generic name for the document 
database ID we concluded that in fact this is just a dbxref as any 
other, so why not make it one then. I think this was a good decision. 
The reason it is also a UK is that document_id (or medline_id) was a UK 
before. If you want multiple dbxrefs per reference, you'd need an 
association table, which means the UK constraint goes out the window 
(i.e., it is not straightforward [=impossible in MySQL] to enforce 
that for one medline ID there is only one reference entry). Possible, 
but makes me wary.
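
A rough sketch of the FK-plus-UK arrangement described above, for 
illustration only. The table and column names are simplified 
assumptions, not the actual BioSQL DDL; the accession is made up; and 
SQLite (through Python's sqlite3 module) stands in for MySQL. With the 
unique key on the dbxref column, a second reference entry pointing at 
the same Medline ID is rejected by the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Simplified, hypothetical tables; not the real BioSQL definitions.
conn.executescript("""
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    UNIQUE (dbname, accession)
);
CREATE TABLE reference (
    reference_id INTEGER PRIMARY KEY,
    -- FK to dbxref, and also a unique key: one reference per document ID
    dbxref_id    INTEGER UNIQUE REFERENCES dbxref (dbxref_id),
    title        TEXT
);
""")

# Made-up accession, for illustration only.
conn.execute("INSERT INTO dbxref VALUES (1, 'MEDLINE', '12345678')")
conn.execute("INSERT INTO reference VALUES (1, 1, 'first entry')")


def insert_duplicate(connection):
    """Try to add a second reference for the same dbxref; report success."""
    try:
        connection.execute(
            "INSERT INTO reference VALUES (2, 1, 'second entry')"
        )
    except sqlite3.IntegrityError:
        return False
    return True


duplicate_accepted = insert_duplicate(conn)
reference_count = conn.execute(
    "SELECT COUNT(*) FROM reference"
).fetchone()[0]
```

If the link were moved into a separate association table to allow 
several dbxrefs per reference, this single-column unique key would no 
longer be available on the reference table, which is the trade-off 
mentioned above.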

What the present schema does allow without any problem is an arbitrary, 
but specified, database associated with the document ID (and even a 
version if you wanted to). So, you can store either a medline ID *or* a 
pubmed ID, and you would easily know which one you chose (which is 
different from, and richer than, before).
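
To illustrate, here is a small sketch under the same caveats: the 
table layout is a simplified assumption rather than the real BioSQL 
schema, the accessions and titles are invented, and SQLite via Python 
stands in for the actual database. Each reference carries exactly one 
dbxref, and the dbname on that dbxref tells you which kind of ID was 
stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical tables; not the real BioSQL definitions.
conn.executescript("""
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    version   INTEGER NOT NULL DEFAULT 0,
    UNIQUE (dbname, accession, version)
);
CREATE TABLE reference (
    reference_id INTEGER PRIMARY KEY,
    dbxref_id    INTEGER UNIQUE REFERENCES dbxref (dbxref_id),
    title        TEXT
);
""")

# Invented accessions: one reference keyed by Medline, one by PubMed.
conn.executemany(
    "INSERT INTO dbxref (dbxref_id, dbname, accession) VALUES (?, ?, ?)",
    [(1, "MEDLINE", "11111111"), (2, "PUBMED", "22222222")],
)
conn.executemany(
    "INSERT INTO reference VALUES (?, ?, ?)",
    [(1, 1, "paper A"), (2, 2, "paper B")],
)

# The join recovers which database each document ID belongs to.
rows = conn.execute("""
    SELECT r.title, x.dbname, x.accession
    FROM reference r
    JOIN dbxref x ON x.dbxref_id = r.dbxref_id
    ORDER BY r.reference_id
""").fetchall()
```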

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


