[Biocorba-l] Biocorba-0.2.0 - A proposal

Alan Robinson alan@ebi.ac.uk
Thu, 9 Nov 2000 00:43:07 +0000 (GMT Standard Time)


Hi,

Thanks for this feedback.


> My small comments/concerns are below:
> 
> >  This IDL will
> > work with ORBacus & JavaORB for Java, ORBit for perl.
> 
> *sigh* no mention of python anywhere. How sad for us poor python
> people. If you want to add it, this will work with omniORB, Fnorb and
> ILU for python (although I probably won't mess with ILU unless someone 
> specifically asks me to).

I was being honest. I don't use Python and hadn't implemented the IDL with
it, so I didn't include it. If you knock up a client or server in Python,
then this should be added to the list immediately. (There is, of course,
nothing in this IDL that prevents a Python client/server - I just didn't
want to pre-empt the statement). Perhaps we should change the statement to
the effect that the IDL is CORBA 2.2 compliant, and hence workable with the
following ORBs,...


> >  5) Seq and SeqFeature now have the concept of sub-SeqFeatures. Thus
> >     the methods to return the SeqFeature objects of a Seq may specify
> >     whether all the SeqFeatures are to be returned, or just the top
> >     ones.
...

> SeqFeatureVector all_SeqFeatures(in boolean sub_seqfeatures);
> 
> What would people think about getting rid of this boolean and always
> returning all Sub-SeqFeatures? At least for me, this is the behavior I
> believe I would want.

Probably someone authoritative from the bioperl community should comment on
this. The EMBL database does not have the concept of sub-SeqFeatures.
However, I could imagine it would be useful to be able to step through
different layers of sub_SeqFeatures individually (boolean=false), as well
as getting them all at once (boolean=true). The latter is useful for
understanding the hierarchical relationship between features and their
sub-features.

There's also an error in the IDL - either both or neither of the
sub_SeqFeature methods in the Seq and SeqFeature interfaces should have the
boolean.
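
To make the two behaviours of the boolean concrete, here is a rough sketch
in Python. It is only an illustration: the SeqFeature class and the
all_seq_features() function below are made-up local stand-ins, not the
IDL-generated stubs, and the depth-first ordering is my assumption rather
than anything the IDL specifies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SeqFeature:
    """Hypothetical stand-in for a SeqFeature; not the real CORBA stub."""
    name: str
    sub_features: List["SeqFeature"] = field(default_factory=list)


def all_seq_features(top_level, sub_seqfeatures):
    """Return only the top-level features (False) or every feature,
    parents before their children, depth-first (True)."""
    if not sub_seqfeatures:
        return list(top_level)
    collected = []
    for feature in top_level:
        collected.append(feature)
        # Recurse so that sub-sub-features are picked up as well.
        collected.extend(all_seq_features(feature.sub_features, True))
    return collected


# A gene with two exons as sub-features, plus an unrelated repeat.
gene = SeqFeature("gene", [SeqFeature("exon1"), SeqFeature("exon2")])
repeat = SeqFeature("repeat")
top = [gene, repeat]

top_only = [f.name for f in all_seq_features(top, False)]
everything = [f.name for f in all_seq_features(top, True)]
print(top_only)    # ['gene', 'repeat']
print(everything)  # ['gene', 'exon1', 'exon2', 'repeat']
```

With boolean=false a client can walk one layer at a time by asking each
returned feature for its own sub-features in turn.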


> >   6) All version information (e.g. for PrimarySeq and PrimarySeqDB) is
> >      defined as a 'long' data type, rather than a 'string' data
> >      type. This decision was made on the basis that using a 'string'
> >      to specify the version makes it much less convenient to determine
> >      the most recent version of a database.
> 
> Oh, maybe everyone else isn't using such a nice language to program in 
> :-).

Hmmm. I can do it fairly easily with strings in Java for dates and numbers,
though not as easily as in Python and Perl.

One to discuss...


> Seriously, I am really for using strings over longs, even though it is 
> less convenient, just because I don't think it makes much sense to
> version local databases with version numbers like 0.1. I would like to 
> version them with dates like '2000-01-03'. At least with python, I can 
> order these just as easily as version numbers, plus it actually makes
> some sense when I'm trying to go back later and figure out what file
> databases need updating or when a file database is from.

Ewan, Matt and I did discuss this very issue - we'd thought about
suggesting Unix timestamps, or alternatively, '2000-01-03' becomes
'20000103'.
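
As a quick sketch of why the two forms are roughly equivalent for finding
the most recent version (Python here, and the helper names are made up for
illustration): zero-padded 'YYYY-MM-DD' strings sort in date order as plain
strings, and dropping the dashes gives an integer that sorts the same way
and still fits in a 32-bit CORBA 'long'.

```python
from datetime import date


def date_string_to_long(version):
    """Turn 'YYYY-MM-DD' into the integer YYYYMMDD (hypothetical helper)."""
    return int(version.replace("-", ""))


def long_to_date(version):
    """Turn the integer YYYYMMDD back into a date (hypothetical helper)."""
    text = str(version)
    return date(int(text[:4]), int(text[4:6]), int(text[6:8]))


versions = ["2000-01-03", "1999-12-25", "2000-11-09"]

# Zero-padded ISO-style dates compare correctly as plain strings.
latest_string = max(versions)

# The same dates as longs compare correctly as plain integers.
latest_long = max(date_string_to_long(v) for v in versions)

print(latest_string)              # 2000-11-09
print(latest_long)                # 20001109
print(long_to_date(latest_long))  # 2000-11-09
```

Either way the ordering only holds if everyone sticks to the same padded
format.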

A potential backdoor solution is that each SeqDB object has a 'string'
description field (which would be kind of useful anyway). This could hold
your stringified date.

I am concerned that with a string version, people may misuse the version.

We should discuss further.


> 7) Can we move the memory counting stuff to an interface inside of
> org.biocorba.seqcore? I think it is good to acknowledge that we got
> things from Gnome work, but it also makes more sense to me to have
> everything inside the same interface, and then just have a little
> acknowledgement to the Gnome folks. I would also like to drop the
> query_interface() function, since we don't use it for anything. What
> do people think about this?

I don't bother to implement this, as the EMBL server already has an object
evictor implemented. In terms of re-use, the current approach is a good
thing. However, you are right about the mismatch with the
query_interface() method.


cheers,
alan.