[BioSQL-l] database extensions
Hilmar Lapp
hlapp at gmx.net
Sun Aug 6 18:32:43 UTC 2006
Hi Angel, sorry for the belated response, I was at BOSC. See my
comments below.
On Aug 3, 2006, at 2:28 PM, Angel Pizarro wrote:
> Hello,
>
> Relatively new to biosql, but I was wondering about a few aspects
> of the
> schema/project.
>
> First, about the ontology tables, what is the preferred way to map
> ontology annotations to bioentries? via a seqfeature? Currently I just
> added a new table to map GO associations with the evidence code from
> GOA. Not optimal as there may be multiple lines of evidence for an
> association, as in the godatabase schema.
You link ontology terms to bioentries through the
bioentry_qualifier_value table, i.e., as a value-less term association.
If you want to capture the evidence code for GO associations,
then you can use the value field in bioentry_qualifier_value to hold
the code. This indeed won't work very well if there are multiple
evidence codes.
You could collapse them into one delimited string, but that will
impair your ability to constrain searches by evidence code. However,
using a LIKE constraint instead of string equality may not make a big
difference, since the value column typically isn't indexed anyway
because it may hold big values. At any rate, if you do have
multiple evidence codes and you do want to constrain searches by
evidence code, then there needs to be a better solution.
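
As a rough sketch of the delimited-string approach described above: the snippet below uses SQLite through Python only so that it runs anywhere. The table layouts are simplified assumptions rather than the full BioSQL DDL, and the accessions, the '|' delimiter and the helper name accessions_with_evidence are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the BioSQL tables (assumed layout, not the full DDL).
CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, accession TEXT NOT NULL);
CREATE TABLE term (term_id INTEGER PRIMARY KEY, identifier TEXT, name TEXT NOT NULL);
CREATE TABLE bioentry_qualifier_value (
    bioentry_id INTEGER NOT NULL REFERENCES bioentry (bioentry_id),
    term_id     INTEGER NOT NULL REFERENCES term (term_id),
    value       TEXT,
    rank        INTEGER NOT NULL DEFAULT 0,
    UNIQUE (bioentry_id, term_id, rank)
);
""")
conn.execute("INSERT INTO bioentry VALUES (1, 'P12345')")
conn.execute("INSERT INTO bioentry VALUES (2, 'Q67890')")
conn.execute("INSERT INTO term VALUES (10, 'GO:0005524', 'ATP binding')")

# Evidence codes go into the value column, wrapped in '|' so that every
# code can be matched unambiguously, whether there is one code or several.
conn.execute(
    "INSERT INTO bioentry_qualifier_value (bioentry_id, term_id, value) "
    "VALUES (1, 10, '|IEA|')"
)
conn.execute(
    "INSERT INTO bioentry_qualifier_value (bioentry_id, term_id, value) "
    "VALUES (2, 10, '|IDA|TAS|')"
)


def accessions_with_evidence(code, term_identifier="GO:0005524"):
    """Return accessions annotated with the term under the given evidence code."""
    rows = conn.execute(
        """SELECT b.accession
             FROM bioentry b
             JOIN bioentry_qualifier_value q ON q.bioentry_id = b.bioentry_id
             JOIN term t ON t.term_id = q.term_id
            WHERE t.identifier = ?
              AND q.value LIKE ?
            ORDER BY b.accession""",
        (term_identifier, f"%|{code}|%"),
    ).fetchall()
    return [row[0] for row in rows]


print(accessions_with_evidence("TAS"))
```

The LIKE pattern with a leading wildcard cannot use an ordinary index, which is the trade-off mentioned above; it only costs little as long as the value column is unindexed anyway.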
>
> Second, are primary keys up for discussion any time soon? I realize
> that
> a lot of external projects rely on this schema, so it has to remain
> stable, but the inconsistent use of UID, compound keys or even lack
> of a
> key really put a hindrance on the use of off-the-shelf ORMs.
Can you elaborate? By now most tables do have a surrogate key.
The only ones that do not are association tables that aren't
themselves referenced by foreign key (and only very few association
tables are); they still have a unique key constraint, though.
Just to make sure: you're looking at the CVS check-out version, not
at 0.1 or something?
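
To illustrate the distinction, here is a minimal sketch (SQLite through Python, with simplified and partly assumed table layouts, not the actual BioSQL DDL) of two entity tables with surrogate keys and an association table that has no surrogate key but still carries a unique key constraint. The helper try_link is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Entity tables: each has a surrogate primary key.
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL UNIQUE
);
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    dbname    TEXT NOT NULL,
    accession TEXT NOT NULL,
    UNIQUE (dbname, accession)
);
-- Association table: no surrogate key, because nothing references it,
-- but the pair of foreign keys is still constrained to be unique.
CREATE TABLE bioentry_dbxref (
    bioentry_id INTEGER NOT NULL REFERENCES bioentry (bioentry_id),
    dbxref_id   INTEGER NOT NULL REFERENCES dbxref (dbxref_id),
    UNIQUE (bioentry_id, dbxref_id)
);
""")
conn.execute("INSERT INTO bioentry VALUES (1, 'P12345')")
conn.execute("INSERT INTO dbxref VALUES (7, 'GO', 'GO:0005524')")
conn.execute("INSERT INTO dbxref VALUES (8, 'GO', 'GO:0016301')")
conn.execute("INSERT INTO bioentry_dbxref VALUES (1, 7)")


def try_link(bioentry_id, dbxref_id):
    """Insert an association row; return False if the pair already exists."""
    try:
        conn.execute(
            "INSERT INTO bioentry_dbxref (bioentry_id, dbxref_id) VALUES (?, ?)",
            (bioentry_id, dbxref_id),
        )
    except sqlite3.IntegrityError:
        return False
    return True


print(try_link(1, 7))  # the pair (1, 7) was inserted above
```

An ORM that insists on a single-column key would have to treat the pair (bioentry_id, dbxref_id) as a composite key for such a table.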
>
> Third, how does one go about submitting proposals for schema
> extensions?
> I am wanting to extend the schema with a few modules, mainly ripped
> out
> of either GUS and/or chado, as well as adding a module for
> proteomics data.
You would send those to the list, ideally accompanied by some
comments on the motivation and on why the existing tables can't deal
with the data the new entities are supposed to capture. That would give
people a chance to comment.
I enthusiastically welcome proposals for additions, especially if
those help to promote the utility of BioSQL.
>
> Fourth, is the current practice for representation of biological
> pathways and interactions to use the bioentryrelationship table?
Yes, that was my plan when I worked on the Symgene project. I never
got to implement that, though, so I don't know how well it would
really work.
I did implement bioentry graphs with the bioentry_relationship table,
and I had to add an evidence table to accomplish my goals. With that
it worked very well, though.
This is the evidence table; I'll add it in the 1.1 version.
CREATE TABLE Evidence (
    Evidence_Id              INTEGER NOT NULL,
    Score                    NUMBER NULL,
    Last_Modified            DATE DEFAULT SYSDATE NOT NULL,
    Bioentry_Relationship_Id INTEGER NOT NULL,
    Term_Id                  INTEGER NOT NULL,
    DBXref_Id                INTEGER NULL,
    PRIMARY KEY (Evidence_Id),
    UNIQUE (Bioentry_Relationship_Id, Term_Id, DBXref_Id)
);
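
A minimal sketch of how such an evidence table might hang off bioentry_relationship, with several lines of evidence per edge. It is translated to SQLite (via Python) so that it runs without Oracle; the bioentry_relationship layout is simplified, and all IDs, scores and the helper names evidence_for and edges_with_min_score are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified edge table between two bioentries (assumed layout).
CREATE TABLE bioentry_relationship (
    bioentry_relationship_id INTEGER PRIMARY KEY,
    object_bioentry_id       INTEGER NOT NULL,
    subject_bioentry_id      INTEGER NOT NULL,
    term_id                  INTEGER NOT NULL
);
-- SQLite rendering of the Evidence table above.
CREATE TABLE evidence (
    evidence_id              INTEGER PRIMARY KEY,
    score                    REAL,
    last_modified            TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    bioentry_relationship_id INTEGER NOT NULL
        REFERENCES bioentry_relationship (bioentry_relationship_id),
    term_id                  INTEGER NOT NULL,
    dbxref_id                INTEGER,
    UNIQUE (bioentry_relationship_id, term_id, dbxref_id)
);
""")

# One edge between bioentries 100 and 200; term 5 stands for the relationship type.
conn.execute("INSERT INTO bioentry_relationship VALUES (1, 100, 200, 5)")
# Two independent lines of evidence for that edge: term_id names the evidence
# type, dbxref_id points at the supporting record.
conn.execute(
    "INSERT INTO evidence (score, bioentry_relationship_id, term_id, dbxref_id) "
    "VALUES (0.8, 1, 20, 300)"
)
conn.execute(
    "INSERT INTO evidence (score, bioentry_relationship_id, term_id, dbxref_id) "
    "VALUES (0.95, 1, 21, 301)"
)


def evidence_for(relationship_id):
    """Return (term_id, score) pairs for one edge, best score first."""
    return conn.execute(
        "SELECT term_id, score FROM evidence "
        "WHERE bioentry_relationship_id = ? ORDER BY score DESC",
        (relationship_id,),
    ).fetchall()


def edges_with_min_score(min_score):
    """Return ids of edges that have at least one evidence row at or above min_score."""
    rows = conn.execute(
        "SELECT DISTINCT bioentry_relationship_id FROM evidence "
        "WHERE score >= ? ORDER BY bioentry_relationship_id",
        (min_score,),
    ).fetchall()
    return [row[0] for row in rows]


print(evidence_for(1))
```

Because each line of evidence is its own row, searches can be constrained by evidence type or score with plain equality and range predicates.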
>
> Many thanks.
You're most welcome.
-hilmar
>
> --
> Angel Pizarro
> Director, Bioinformatics Facility
> Institute for Translational Medicine and Therapeutics
> University of Pennsylvania
> 806 BRB II/III
> 421 Curie Blvd.
> Philadelphia, PA 19104-6160
>
> P: 215-573-3736
> F: 215-573-9004
> E: angel at mail.med.upenn.edu
>
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================