[Biocorba-l] biocorba and the OMG spec

Juha Muilu muilu@ebi.ac.uk
Fri, 01 Sep 2000 14:19:39 +0000


Hi All,

Thanks Jason and Brad for the emails. It would be really nice to make
the BSA and Biocorba specs more compatible, at least to the point where
the mapping between them becomes easy and unambiguous.

Unfortunately it may no longer be possible to make any major changes to
the BSA spec. Extensions are of course possible. (Can somebody who
knows the OMG procedures comment on this?)

Ideally, the interfaces used in Biocorba and BSA would inherit from
common general base classes, with more specialized functionality, such
as extensions for GNOME Bonobo and life-cycle support, added where
needed. However, handling the value-type objects in a uniform manner
may be tricky.
Any idea whether we will ever get OBV (Objects by Value) for Perl?

From a quick look, I found the following possible mapping problems in the specs:

- First is the use of accession numbers and identifiers:

  The ID string in the BSA follows the syntax convention specified in the
  CORBA naming service (CosNaming::StringName). Basically the ID consists
  of "/"-separated components (starting from the data source and so on,
  down to the accession number) so that it becomes unique. The version
  is given by the "kind" part, which follows a "." at the end of the ID.
  The spec is rather loose, and it may not be possible to map these IDs
  to Biocorba ids.

  Also, the ID can contain only the characters 'a-z', '0-9', '$', '-',
  '_' and '.'; '.' is discouraged and must be escaped. I hope Biocorba
  uses these as well.

- BioSequence and Annotation objects also contain information about
  the basis of an object, i.e. whether it is "computational" (like a
  consensus sequence) or "experimental" (or both, or "not known"). This
  is missing in Biocorba.


- SeqFeatures may not be possible to map at all. First, BSA uses
  name-value pairs as specified in the CORBA property service, where the
  value is of type CORBA Any. Biocorba uses string lists.

  Interpreting long name-value lists is also difficult, and in the BSA
  spec the qualifiers are in a separate field. I think the purpose is to
  provide a proper hiding place for the EMBL/GenBank feature table info
  (I may be wrong).

  In the BSA there is only a one-way relationship between SeqFeature and
  BioSequence (from sequence to feature). Wrong or right?

  Strand types also include, in addition to PLUS and MINUS, BOTH (!),
  NOT KNOWN and NOT APPLICABLE.

  A sequence region can also be a recursive composite region.

- In BSA, SeqFeature (actually SeqAnnotation) is a subclass of Annotation.
  An Annotation applies to the whole sequence and a SeqAnnotation to a
  certain region. Annotation is perhaps not clearly specified in Biocorba.
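
As a rough illustration of the ID convention in the first point, here is
a minimal Python sketch of how such an ID might be split into its
components and version. It is not taken from either spec. The function
name parse_bsa_id, the ParsedId type, the use of backslash as the escape
character (as in CosNaming stringified names) and the choice to read the
"kind" only from the last component are all assumptions made for the
example.

```python
import re
from typing import List, NamedTuple, Optional

# Character set listed in the message above. '.' is in the set but must be
# escaped when it is meant literally (assumed escape character: backslash).
_ALLOWED = re.compile(r"[a-z0-9$\-_.]*")


class ParsedId(NamedTuple):
    components: List[str]    # data source ... accession number
    version: Optional[str]   # "kind" part of the last component, if any


def _split_unescaped(text: str, sep: str) -> List[str]:
    """Split on sep, leaving backslash-escaped characters untouched."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape for later passes
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def _unescape(text: str) -> str:
    """Drop the backslash from every escaped character."""
    return re.sub(r"\\(.)", r"\1", text)


def parse_bsa_id(raw: str) -> ParsedId:
    """Hypothetical parser: '/'-separated components, optional '.version'."""
    components = _split_unescaped(raw, "/")
    id_kind = _split_unescaped(components[-1], ".")
    if len(id_kind) > 2:
        raise ValueError("more than one unescaped '.' in the last component")
    version = _unescape(id_kind[1]) if len(id_kind) == 2 else None
    names = [_unescape(c) for c in components[:-1]] + [_unescape(id_kind[0])]
    for name in names + ([version] if version is not None else []):
        if not _ALLOWED.fullmatch(name):
            raise ValueError(f"character outside the allowed set in {name!r}")
    return ParsedId(names, version)
```

With a made-up ID such as "embl/x12345.2", this sketch yields the
components "embl" and "x12345" and the version "2". An ID with upper-case
letters is rejected, because the character list above allows only 'a-z'.
Whether Biocorba ids could be produced from the parsed parts is exactly
the open question raised in that point.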


OK, this list may not be complete... but I hope it initiates more discussion.

Thanks very much for setting up the mailing list and the biocorba-web site.
Good work !! 

Please have a look at our work at http://industry.ebi.ac.uk/openBSA
We have the analysis server already working, and some other things are at
a more or less alpha stage, especially the documentation... but we are
working on it.

Best regards, Juha


-- 
 +--------------------------------------------------------------------+
 |Juha Muilu, Ph.D., EMBL Outstation| Email:  muilu@ebi.ac.uk         |
 |European Bioinformatics Institute | Phone:  +44 (0)1223 494 624     |
 |Wellcome Trust Genome Campus      | Fax:    +44 (0)1223 494 468     |
 |Hinxton, Cambridge CB10 1SD, UK   | http://industry.ebi.ac.uk/~muilu|
 +--------------------------------------------------------------------+