[Bioperl-l] computation object

Tue, 05 Dec 2000 18:01:40 +0100

Hi

I'm rather new to this, so if I say strange things, or seem to behave in an
improper way, please let me know. I like the bioperl/bioxml initiative a lot
and hope I can contribute.

My goal is to use bioperl/bioxml in a bigger database system to communicate
with a diverse set of tools. I've discussed this with Bradley Marshal, and
we seem to agree that although the seqFeature object can store most of the
data needed, it would be nice to extend it so that it is more suited to
holding the results of a computation and make import and export (to game)
easier.

The new computation object would contain

* computation_id (for game)

* A way of storing several sets of sub_seqfeatures, much like the current
'tags' system, but returning an array of seqfeatures. This is to make it
easier to parse and separate subsets of sub_seqfeatures. The sub_seqfeature
method could stay intact and just return all sets.
An advantage of this structure would be that if somebody inherits from this
object and stores seqfeatures, or children of seqfeatures, in this
structure, it would still be parsable without the parser having to know
exactly which subsets are there.

* A set of specific tags more geared toward a computation
 - computation_date
 - program_name, program_version, program_date, program_url
 - database_name, database_version, database_date, database_url

* score related data
This would look like the tags structure, but would be dedicated to storing
scores. It might be a good idea to have a small score object which can also
store the range of values which the score can have, but that might be over
the top.
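
To make the list above a bit more concrete, here is a rough sketch of the
structure. It is written in Python only to keep it short; it is not Bioperl
code and does not reflect any existing Bioperl API. The SeqFeature stand-in,
the Score class, the add_sub_seqfeature and subset_names methods, the
"default" subset name, and all example values are assumptions made purely
for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SeqFeature:
    """Hypothetical stand-in for a sequence feature: a located, typed region."""
    start: int
    end: int
    primary_tag: str


@dataclass
class Score:
    """A score value plus, optionally, the range of values it can take."""
    value: float
    value_range: Optional[Tuple[float, float]] = None


@dataclass
class Computation(SeqFeature):
    """A seqfeature extended to hold the results of one computation."""
    computation_id: str = ""
    computation_date: str = ""
    # Expected keys (by convention only): name, version, date, url.
    program: Dict[str, str] = field(default_factory=dict)
    database: Dict[str, str] = field(default_factory=dict)
    # Like the tags structure, but dedicated to scores.
    scores: Dict[str, Score] = field(default_factory=dict)
    # Named sets of sub-features, keyed by subset name.
    subsets: Dict[str, List[SeqFeature]] = field(default_factory=dict)

    def add_sub_seqfeature(self, feature: SeqFeature, subset: str = "default") -> None:
        """Store a sub-feature in the named subset, creating it if needed."""
        self.subsets.setdefault(subset, []).append(feature)

    def subset_names(self) -> List[str]:
        """List the subsets present, so a parser need not know them in advance."""
        return list(self.subsets)

    def sub_seqfeature(self, subset: Optional[str] = None) -> List[SeqFeature]:
        """Return one subset, or all sub-features from all sets if none is named."""
        if subset is not None:
            return list(self.subsets.get(subset, []))
        return [f for feats in self.subsets.values() for f in feats]


# Example with made-up values: one prediction holding two subsets and a score.
hit = Computation(start=1, end=900, primary_tag="prediction", computation_id="c1")
hit.program["name"] = "genscan"
hit.add_sub_seqfeature(SeqFeature(10, 200, "exon"), subset="exons")
hit.add_sub_seqfeature(SeqFeature(850, 856, "polyA"), subset="signals")
hit.scores["probability"] = Score(0.93, (0.0, 1.0))
```

The point of the sketch is that sub_seqfeature() with no argument still
returns everything, so existing callers keep working, while subset_names()
lets an exporter walk the sets without knowing beforehand which ones exist.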

I am aware that this is slightly in conflict with the way the genscan module
works now with the Gene object, but I see an advantage to a general way of
handling data like this. If we choose to take this path, it would not be an
enormous problem to have the Gene object inherit from computation, I think.

I would have time to write things like this, by the way.

Tell me if it is a good idea or, if not, how to store the results of a
diverse set of computations in a consistent way.

Mark Fiers