[Biocorba-l] Annotations

Ewan Birney birney@ebi.ac.uk
Mon, 4 Jun 2001 15:11:55 +0100 (BST)


On Mon, 4 Jun 2001, Juha Muilu wrote:

> 
> How do people feel about separating the Annotation holder from the 
> Sequence (Annotatable)?
> 
> + It can be useful if we later need new get/set methods for the
> annotations and sequence features. The Sequence "line" may become over
> exploited if we start to extend it because of new annotation methods.
> 
> - It is one more level of indirection.
> 
> Do we need composite annotations? For those we can have a new annotation
> interface which inherits from both Annotation and AnnotationHolder.
> 
> From a quick look, the GO annotations, for example, can be expressed
> using composite annotations. Does this also work in practice? There was
> recently a lot of discussion about the GO stuff on the bioPerl mailing
> list. Did you reach a consensus?

This is something I would like to take on at BOSC. My feeling:

   - Annotation describes the association of comments, literature,
other database references and indeed anything else dreamt up by a
curator associated with "something" definite (often Genes, often
Sequences).

     SeqFeatures *are not* by default Annotations. SeqFeatures need to be
as light as possible, as we generate, store and otherwise handle millions
of them.



   - we need one level of indirection - but this is possibly already done
by the annotation object. I think composition rather than multiple
inheritance is fine, i.e.

    seq has-a annotation object which is a rather generic holder of
annotations. In that case I think the annotation holder becomes equivalent
to the annotation object.
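
    A minimal sketch of the has-a arrangement, written in Python rather
than Perl purely for brevity. The class and method names (Annotation,
Sequence, add, get) are hypothetical and are not taken from Bioperl or the
BioCORBA IDL:

```python
class Annotation:
    """Generic holder of annotations, keyed by a type name (hypothetical API)."""

    def __init__(self):
        self._by_type = {}

    def add(self, type_name, value):
        # Any number of values may be stored under one type name.
        self._by_type.setdefault(type_name, []).append(value)

    def get(self, type_name):
        # Return a copy so callers cannot mutate the holder's internal list.
        return list(self._by_type.get(type_name, []))


class Sequence:
    """A sequence that has-a Annotation instead of inheriting from one."""

    def __init__(self, seq_id, residues):
        self.seq_id = seq_id
        self.residues = residues
        # The single level of indirection: new annotation methods go on
        # Annotation and leave the Sequence interface untouched.
        self.annotation = Annotation()


seq = Sequence("example_id", "ATGGCC")  # illustrative values only
seq.annotation.add("comment", "checked by a curator")
print(seq.annotation.get("comment"))
```

    The point of the sketch is only that extending Annotation later does
not change Sequence.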


  - Annotation objects should be very run-time query-able, something like


    @objects = $annotation->get_Annotation('Disease');

    - this is the sort of future extensibility which was kicked around on
Bioperl. Problems:
      
         (a) do we constrain objects at all? Or do we go more like


    @objects = $annotation->get_Annotation_type('Disease','string');

    to allow clients to request types here.

        (b) Just simple "type" queries? Or something richer? (NB - this is
not a seqfeature problem, which is a separate querying task...)

        (c) naming. get_Annotation on an Annotation object. Sounds v. bad.
   

    Basically we are in a sticky area here. There are probably a number of
basic design patterns around this. Any suggestions?
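
    To make options (a) and (b) concrete, here is one possible shape, again
sketched in Python and not a proposal for the actual interface. The names
(Annotation, DbLink, get_annotation, get_annotation_type) and the stored
values are made up for illustration; the second method mirrors the
get_Annotation_type('Disease','string') idea by letting the client name
the type it is prepared to handle:

```python
from dataclasses import dataclass


@dataclass
class DbLink:
    """Hypothetical reference to an entry in another database."""
    database: str
    primary_id: str


class Annotation:
    """Run-time query-able annotation holder (hypothetical API)."""

    def __init__(self):
        self._by_key = {}

    def add(self, key, value):
        self._by_key.setdefault(key, []).append(value)

    def get_annotation(self, key):
        # Unconstrained query: returns every object stored under the key,
        # whatever its type.
        return list(self._by_key.get(key, []))

    def get_annotation_type(self, key, wanted_type):
        # Constrained query: only objects of the type the client asked for.
        return [v for v in self.get_annotation(key) if isinstance(v, wanted_type)]


annotation = Annotation()
annotation.add("Disease", "free-text disease note")       # illustrative value
annotation.add("Disease", DbLink("EXAMPLE_DB", "0001"))   # illustrative value

everything = annotation.get_annotation("Disease")
strings_only = annotation.get_annotation_type("Disease", str)
print(len(everything), strings_only)
```

    The unconstrained call leaves the client to cope with mixed object
types; the constrained call pushes that filtering into the holder.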


  - We need an explicit extension of SeqFeature which has-a annotation.

   (or mix-in of SeqFeature and Annotation? Hmmmm)

     - most sequence features (>95%) *do not* have annotations in real
life (believe me, I know) but certain ones have heavy annotation (e.g.
Genes).
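
     A rough sketch of the explicit-extension option, in Python, assuming
a plain SeqFeature that carries only coordinates and a tag, plus a subclass
that adds the annotation slot. All names here (SeqFeature,
AnnotatedSeqFeature, primary_tag) are illustrative, and a plain dict stands
in for a real annotation object:

```python
class SeqFeature:
    """Lightweight feature: fixed slots, no per-instance dict."""

    __slots__ = ("start", "end", "strand", "primary_tag")

    def __init__(self, start, end, strand, primary_tag):
        self.start = start
        self.end = end
        self.strand = strand
        self.primary_tag = primary_tag


class AnnotatedSeqFeature(SeqFeature):
    """Explicit extension for the few features that carry annotation."""

    __slots__ = ("annotation",)

    def __init__(self, start, end, strand, primary_tag):
        super().__init__(start, end, strand, primary_tag)
        # type name -> list of values; a stand-in for an annotation object
        self.annotation = {}


plain = SeqFeature(100, 200, 1, "exon")            # the common, light case
gene = AnnotatedSeqFeature(50, 5000, 1, "gene")    # the rare, heavy case
gene.annotation.setdefault("comment", []).append("illustrative comment")

print(hasattr(plain, "annotation"), hasattr(gene, "annotation"))
```

     Only the annotated subclass pays for the extra slot, so the millions
of plain features stay small.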

     

Lots of ideas to kick around. We don't do this well in Bioperl, Ensembl,
Biopython or Biojava in my view (Brad/Jason/Matt/Thomas - thoughts?).


e.

> 
> -- 
>  +--------------------------------------------------------------------+
>  |Juha Muilu, Ph.D., EMBL Outstation| Email:  muilu@ebi.ac.uk         |
>  |European Bioinformatics Institute | Phone:  +44 (0)1223 494 624     |
>  |Wellcome Trust Genome Campus      | Fax:    +44 (0)1223 494 468     |
>  |Hinxton, Cambridge CB10 1SD, UK   | http://industry.ebi.ac.uk/~muilu|
>  +--------------------------------------------------------------------+

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------