RDF Schema (was: [Biocorba-l] Annotations)

Thomas Down td2@sanger.ac.uk
Wed, 6 Jun 2001 11:51:54 +0100


On Wed, Jun 06, 2001 at 10:46:48AM +0100, Juha Muilu wrote:
> Thomas Down wrote:
> > 
> > On Tue, Jun 05, 2001 at 10:35:08AM +0100, Martin Senger wrote:
> > > >
> > > > > last couple of weeks, I've been looking at RDF Schema and DAML as mechanisms
> > > > > which could be used to define annotation schemas.  This ought to retrofit
> > > > > quite cleanly on top of the current Annotatable/Annotation API.
> > > >
> 
> Thanks Thomas for taking this up! Any ideas how this retrofitting
> can be done?
> 
> By quickly looking, the annotation mechanism in the RDF schema seems to
> be much richer than what we have.  It contains more information about the
> annotation itself (creator, date, link to related info, type of
> annotation, content of annotation, reference between annotation and
> annotatable and context )

My vision of the schema mechanism is just to use it to
define the types of annotations that you might expect to
find associated with a given type of bio-entity (sequence,
feature, whatever).  Potentially (and this is the really
exciting bit, in some ways) we can also use the same kinds
of mechanism to link into ontologies which define the
relationships between the entity and its annotations
(and hence make it easier to interwork and query annotated
data from different sources).

The basic RDF mechanism is a set of triples looking like:


    resource_a    --------------------->   resource_b
                      predicate

You can look at this relationship as defining some kind
of `property' of resource_a.

This model is compatible with BioJava Annotations, and the
AnnotationHolder stuff in your IDL.  The annotated entity
(Annotatable, to use the BioJava terminology) is resource_a,
the annotation key is the predicate, and the annotation
value is resource_b (or perhaps a pointer to resource_b -- 
a URI, for instance).
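
To make that mapping concrete, here is a minimal Python sketch.  It is
not BioJava or IDL code: the `Triple` type, the `annotation_to_triples`
helper and the example URIs are all made up for illustration, and the
annotation is assumed to be a plain key/value map.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str    # resource_a: the annotated entity (the Annotatable)
    predicate: str  # the annotation key
    obj: str        # resource_b: the annotation value, or a URI pointing at it


def annotation_to_triples(entity_uri: str, annotation: Dict[str, str]) -> List[Triple]:
    """Emit one triple per key/value pair in the annotation."""
    return [Triple(entity_uri, key, value) for key, value in annotation.items()]


# Hypothetical example data, not taken from any real database.
triples = annotation_to_triples(
    "urn:example:seq/AB000001",
    {"organism": "Homo sapiens", "structure": "urn:example:pdb/1ABC"},
)
```

Each key/value pair becomes exactly one triple, so nothing in the
existing annotation bundle has to change shape to be viewed this way.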

RDF Schema and DAML (I need to re-read some specifications
to get the division between these two clearer -- I'm more
familiar with DAML, which builds upon and extends RDF Schema)
give mechanisms to define the set of properties you would
expect a given type of resource to have.

Retrofitting into current interfaces:

My point about this was mainly that the current interfaces
model data that can be looked at in the RDF model, and can
hence be described using an RDF schema.  The way I see things
going:

  - Simple object model to describe the schemas.
  
  - Add one extra property on Annotation (or perhaps Annotatable)
    which links into the schema model.
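
As a rough illustration of those two steps, here is a small Python
sketch.  It assumes a schema is nothing more than a named set of
expected properties; the class names (`PropertyDef`, `AnnotationSchema`,
`Annotation`) and the `problems` check are hypothetical and are not
part of BioJava, RDF Schema or DAML.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PropertyDef:
    """One property a given type of resource is expected to have."""
    name: str               # annotation key, i.e. the RDF predicate
    required: bool = False


@dataclass
class AnnotationSchema:
    """The set of properties expected for one type of bio-entity."""
    entity_type: str
    properties: Dict[str, PropertyDef] = field(default_factory=dict)

    def add(self, prop: PropertyDef) -> None:
        self.properties[prop.name] = prop

    def problems(self, values: Dict[str, str]) -> List[str]:
        """List missing required keys and keys the schema does not declare."""
        missing = [f"missing: {p.name}" for p in self.properties.values()
                   if p.required and p.name not in values]
        unknown = [f"undeclared: {key}" for key in values
                   if key not in self.properties]
        return sorted(missing + unknown)


@dataclass
class Annotation:
    values: Dict[str, str] = field(default_factory=dict)
    schema: Optional[AnnotationSchema] = None   # the one extra property


# Hypothetical schema for sequence annotations.
seq_schema = AnnotationSchema("sequence")
seq_schema.add(PropertyDef("organism", required=True))
seq_schema.add(PropertyDef("description"))

ann = Annotation({"description": "test clone", "colour": "red"}, schema=seq_schema)
report = ann.schema.problems(ann.values)
```

The point of the sketch is only that the annotation data itself stays
untouched; the schema hangs off it as one optional link.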

> How does all this relate to the DAS ?

[Maybe not quite the target topic for this list, but it's an
interesting question, and I think relevant, so...]


Currently, the types of data that can be communicated in DAS
are rather restricted, and are defined entirely by the DAS
specification.  Basically just `coloured-box' features and
links out to web pages.

I'm looking to see a richer DAS.  We're currently looking at
the XFF feature table format:

  http://www.biojava.org/thomasd/XFF/

as a cornerstone for the next generation of DAS.  If you look
at this, it has `strongly typed' core feature data structures.
To these, you can add `detail' elements (weakly bound data).
Once again, the relationship between a feature and its details
can be looked at as an RDF triple.  This means we can link
schemas in, just as above.  Which could become particularly
important when we start building clients with GUI interfaces
for querying the DAS.

In the DAS/1.9xx there's also an extra concept of annotations
services, which offer extra information about a given feature
(e.g. if there is a features service offering a set of gene
predictions, somebody else could set up an annotations service
which adds links from the `gene' features to PDB structures of
gene products [where known].  Or expression profiles.  Or functional
annotation.  Or anything, really).  The current plan is that
elements from an annotations service are treated equivalently
to details elements embedded into the original feature table.
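
A client-side view of that plan might look something like the Python
sketch below.  This is only an in-memory illustration: it says nothing
about the DAS wire format, and the `merged_details` helper, the
`(type, value)` detail shape and the example identifiers are all
assumptions made for the example.

```python
from typing import Dict, List, Tuple

Detail = Tuple[str, str]   # (detail type, value); hypothetical shape


def merged_details(feature_id: str,
                   embedded: Dict[str, List[Detail]],
                   *services: Dict[str, List[Detail]]) -> List[Detail]:
    """Return embedded details followed by details from each annotations
    service, with no distinction made between the two kinds."""
    details = list(embedded.get(feature_id, []))
    for service in services:
        details.extend(service.get(feature_id, []))
    return details


# Hypothetical data: one features service and two annotations services.
features = {"gene1": [("note", "predicted by tool X")]}
structures = {"gene1": [("structure", "urn:example:pdb/1ABC")]}
expression = {"gene2": [("expression", "urn:example:expr/42")]}

merged = merged_details("gene1", features, structures, expression)
```

Once merged, every detail is just another (feature, type, value)
triple, so the same schema linkage applies wherever it came from.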


[This is all quite new at the moment, and really needs some testing
and experimentation.  I've got a design for
a prototype implementation, and will hopefully get it implemented
over the next couple of weeks.]

Anyone else interested in this?

  Thomas.