[Bioperl-l] Re: Pipeline Input/Output refactoring plan

Elia Stupka elia@fugu-sg.org
Wed, 13 Mar 2002 23:06:40 +0800 (SGT)


> We really like the IOAdaptor thing.  I'm not convinced about storing it in
> the database like that though.  We were thinking more of having it in the
> analysisprocess table (one input_adaptor column and one output_adaptor
> column).  Presumably you want to run the same analysis on differently
> shaped inputs which may come from different databases is that right?

(will try to make sense, I am falling asleep)

Ah, makes sense actually, I had just started a reply, but now I see what
you mean and it makes total absolute sense, doh!

> Now this has us foxed as we (well me especially) don't really understand
> the biodb stuff.  Can you explain the reasoning behind this.

Ah, this is just to have maximum flexibility. In many cases some of the
columns could be null (redundant), but a case where they are all filled in
would be:

db_locator: hostuserblablabla-comparadb
dbadaptor_module: Bio::EnsEMBL::Compara::DBAdaptor
biodbadaptor_module: Bio::EnsEMBL::Compara::GenomeDB
biodbname: homo_sapiens_3_26-protein
IO_adaptor: Bio::EnsEMBL::Compara::DnaFragAdaptor
IO_adaptor_method: "fetch_by_name"

The same could apply when fetching stuff from bioperl-db. Does it make more
sense now?
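
To make the lookup concrete, here is a rough sketch of how one such row
could drive input fetching with nothing hard coded in the RunnableDB. It is
written in Python purely for illustration; the real thing would be Perl.
The IOAdaptorRow class, the REGISTRY lookup, the fetch_input function and
the stand-in DnaFragAdaptor (including its constructor arguments) are all
assumptions made up for this example, not existing pipeline or ensembl code.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class IOAdaptorRow:
    """One row of the proposed IO adaptor table; most columns may be None."""
    db_locator: Optional[str]
    dbadaptor_module: Optional[str]
    biodbadaptor_module: Optional[str]
    biodbname: Optional[str]
    io_adaptor: str
    io_adaptor_method: str


class DnaFragAdaptor:
    """Hypothetical stand-in for Bio::EnsEMBL::Compara::DnaFragAdaptor."""

    def __init__(self, db_locator: Optional[str], biodbname: Optional[str]):
        self.db_locator = db_locator
        self.biodbname = biodbname

    def fetch_by_name(self, name: str) -> dict:
        # A real adaptor would query the database; this only echoes its inputs.
        return {"name": name, "source": self.biodbname}


# Hypothetical registry standing in for dynamic module loading by name.
REGISTRY = {"Bio::EnsEMBL::Compara::DnaFragAdaptor": DnaFragAdaptor}


def fetch_input(row: IOAdaptorRow, input_id: str) -> Any:
    """Build the adaptor named in the row and call the method named in the row."""
    adaptor_class = REGISTRY[row.io_adaptor]
    adaptor = adaptor_class(row.db_locator, row.biodbname)
    method = getattr(adaptor, row.io_adaptor_method)
    return method(input_id)
```

The point is that both the adaptor and the method come from the row, so the
same runnable can be pointed at a different source by changing data only.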

> >The idea is not to leave anything hard coded in the runnable about how it
> >should fetch its input and write its output.
> 
> Agreed - you mean the RunnableDB yes?

Yes.

> I was thinking more of one adaptor per input type i.e. there is no choice
> of methods.  The number of input types we have is very small.

Nah... here is where we would really like the new design to be radically
different. You could have a Family object as input and then run an
aligner runnable on it, or even weirder stuff. This ties in with making
it flexible enough for non-genome-wide applications, design tools, etc.

> I'm worried about having to set up even more stuff in a database.  People
> have enough trouble loading up an analysisprocess table as it is.  I would
> like people to take a read-only ensembl/biosql database build a runnableDB
> and point it at that database.  Actually we're ok here thinking about it -
> apart from the IO_Adaptor table which I don't understand.

I was thinking about that, and I think we could provide conf files or
built-in types so that the tables could get filled, so we have flexibility
and user-friendliness at the same time. Also, for user-friendliness type
stuff we are developing the Java pipeline design stuff, which would work
fine as long as we have this flexible system.
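
As a very rough idea of the conf file route, the sketch below (again Python,
only for illustration) reads named built-in types from an INI-style file and
expands each into a full table row, leaving unset columns as None (null).
The file layout, the section name dnafrag_by_name and the rows_from_conf
function are assumptions invented for this example; no such format exists
yet.

```python
import configparser

# Column names taken from the example row earlier in this mail.
COLUMNS = (
    "db_locator",
    "dbadaptor_module",
    "biodbadaptor_module",
    "biodbname",
    "IO_adaptor",
    "IO_adaptor_method",
)

# Hypothetical conf file: one section per built-in input type.
CONF_TEXT = """
[dnafrag_by_name]
biodbname = homo_sapiens_3_26-protein
IO_adaptor = Bio::EnsEMBL::Compara::DnaFragAdaptor
IO_adaptor_method = fetch_by_name
"""


def rows_from_conf(text: str) -> dict:
    """Turn each conf section into a row dict; missing columns become None."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the original case of the column names
    parser.read_string(text)
    return {
        section: {col: parser[section].get(col) for col in COLUMNS}
        for section in parser.sections()
    }
```

Each resulting dict could then be inserted into the table as is, so users
would only ever edit the conf file.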

> We have GeneAdaptors and FeatureAdaptors and PredictionAdaptors already
> and these can be reused.

Ah yah, what I meant was that if we want to store genes in bioperl rather
than ensembl we need to do something about it, but it's no big deal.

> There are strong noises for branching from some corners here :-)  

That brings me to the other mail... bioperl or ensembl...

Elia

-- 
********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 874 1467         *
* mobile: +65 90307613         *
* fax:    +65 777 0402         *
********************************