[BioSQL-l] BioSQL and ontology "standards".

Fri Nov 28 18:57:40 UTC 2008

Hi all,

The BioSQL schema allows multiple ontologies, so that things like
entries in seqfeature_qualifier_value can say what they mean by
"locus_tag".

Currently BioPerl and Biopython (and I assume the other projects but
haven't checked) use a couple of ad-hoc ontology names for storing
annotation.  In particular, if there is no predefined entry for a
novel ontology term, it gets added on the fly.  This is very
convenient as it means a BioSQL database can be used without first
importing a predefined ontology.  However, there are downsides: for
example, spelling errors in the keys of a GenBank file get treated as
ontology entries.
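
To make the downside concrete, here is a minimal sketch of the
"add on the fly" behaviour.  It is not the actual BioPerl or Biopython
code: the two tables are a cut-down subset of the BioSQL ontology and
term tables, the helper names are hypothetical, and SQLite is used only
so the example runs on its own.

```python
import sqlite3

# Simplified subset of the BioSQL ontology/term tables (assumption: the
# real schema has more columns and constraints than shown here).
SCHEMA = """
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    ontology_id INTEGER NOT NULL REFERENCES ontology(ontology_id),
    UNIQUE (name, ontology_id)
);
"""


def get_or_create_ontology(conn, name):
    """Return the ontology_id for name, inserting a row if it is missing."""
    row = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    return conn.execute(
        "INSERT INTO ontology (name) VALUES (?)", (name,)
    ).lastrowid


def get_or_create_term(conn, term_name, ontology_name):
    """Return the term_id, silently adding the term if it is novel."""
    ontology_id = get_or_create_ontology(conn, ontology_name)
    row = conn.execute(
        "SELECT term_id FROM term WHERE name = ? AND ontology_id = ?",
        (term_name, ontology_id),
    ).fetchone()
    if row:
        return row[0]
    return conn.execute(
        "INSERT INTO term (name, ontology_id) VALUES (?, ?)",
        (term_name, ontology_id),
    ).lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

good = get_or_create_term(conn, "locus_tag", "Annotation Tags")
# A misspelt key is accepted just as readily as the correct one.
typo = get_or_create_term(conn, "locus_tga", "Annotation Tags")
term_count = conn.execute("SELECT COUNT(*) FROM term").fetchone()[0]
```

After these two calls the term table holds both "locus_tag" and the
misspelt "locus_tga" as separate terms, and nothing flags the second
one as suspicious.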

Have these ad-hoc ontologies ever been defined?  For example, for terms
used in the bioentry_qualifier_value table, which ad-hoc ontology name
should be used?  Biopython uses ad-hoc ontologies named 'SeqFeature Keys',
'SeqFeature Sources' and 'Annotation Tags' for the various tables
(which I believe is the same for BioPerl).

On a related point, it might make more sense to use a predefined
ontology, like SOFA or SO from http://www.sequenceontology.org/ where
a novel term is treated as an error (or perhaps falls back on the
ad-hoc ontology).  How do the various Bio* projects cope with
annotations in the database for different or multiple ontologies?  Or
has this not been considered?
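
As a rough illustration of the "error, or fall back" idea, here is a
sketch of a strict lookup against a preloaded ontology with an optional
ad-hoc fallback.  Everything in it is an assumption for discussion: the
tables are a cut-down subset of the BioSQL ontology and term tables,
the function names and the stored ontology name "SOFA" are
hypothetical, and only two terms are loaded by hand rather than
imported from the real SOFA file.

```python
import sqlite3

# Simplified subset of the BioSQL ontology/term tables (assumption).
SCHEMA = """
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    ontology_id INTEGER NOT NULL REFERENCES ontology(ontology_id),
    UNIQUE (name, ontology_id)
);
"""


class UnknownTermError(LookupError):
    """Raised when a term is not in the predefined ontology."""


def find_term(conn, term_name, ontology_name):
    """Return the term_id, or None if the term is not in that ontology."""
    row = conn.execute(
        "SELECT t.term_id FROM term t "
        "JOIN ontology o ON o.ontology_id = t.ontology_id "
        "WHERE t.name = ? AND o.name = ?",
        (term_name, ontology_name),
    ).fetchone()
    return row[0] if row else None


def add_term(conn, term_name, ontology_name):
    """Insert the term (and its ontology if needed); return the term_id."""
    conn.execute(
        "INSERT OR IGNORE INTO ontology (name) VALUES (?)", (ontology_name,)
    )
    ontology_id = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (ontology_name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO term (name, ontology_id) VALUES (?, ?)",
        (term_name, ontology_id),
    )
    return find_term(conn, term_name, ontology_name)


def resolve_term(conn, term_name, predefined, fallback=None):
    """Look the term up in the predefined ontology first.

    With no fallback a novel term is an error; with a fallback it is
    stored in that ad-hoc ontology instead.
    """
    term_id = find_term(conn, term_name, predefined)
    if term_id is not None:
        return term_id
    if fallback is None:
        raise UnknownTermError(f"{term_name!r} is not in {predefined!r}")
    existing = find_term(conn, term_name, fallback)
    return existing if existing is not None else add_term(conn, term_name, fallback)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
gene_id = add_term(conn, "gene", "SOFA")  # stand-in for a real import
cds_id = add_term(conn, "CDS", "SOFA")

# Known term: found in the predefined ontology.
known = resolve_term(conn, "gene", "SOFA")
# Novel term with a fallback: stored under the ad-hoc ontology.
adhoc = resolve_term(conn, "my_custom_key", "SOFA", fallback="SeqFeature Keys")
```

Calling resolve_term(conn, "my_custom_key", "SOFA") with no fallback
raises UnknownTermError, which is the "treated as an error" variant.
Keeping the fallback terms in a separate ontology at least makes it
easy to list everything that did not match the predefined one.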

Thanks,

Peter