[Bioperl-l] Re: [Biojava-l] BioInformatics toolbox.

Alex Rolfe arolfe@genome.wi.mit.edu
10 Apr 2002 16:16:24 -0400


A number of people have pointed out that several GUIs exist for
connecting components into pipelines (and I'll add my own: the
biojava-lims code that I've been working on) and that the existing
bio{perl,java} classes could probably be extended or wrapped to fit into
these frameworks.  But I don't think that the combination would yield a
viable system to let users create their own programs.

If you extended the bio{perl,java} classes to get components for these
GUIs, you'd end up with two types: data and actions.  Data components
(like Java beans) would be able to describe their properties.  Action
components would need to describe the format/requirements for their
inputs and outputs.  Action components would get their inputs from:
- the outputs of other action components
- user inputs
- parameters you specify for the program.
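
To make the data/action split concrete, here is a rough sketch in Python
(chosen only for brevity).  Every name in it (Sequence, Action,
gather_inputs, relabel) is made up for illustration and is not taken from
bioperl or biojava.  It assumes each action declares its inputs by name and
type, and that inputs are looked up first in upstream outputs, then in user
input, then in program parameters.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Sequence:
    """Hypothetical data component: it only describes its properties."""
    accession: str
    residues: str


@dataclass
class Action:
    """Hypothetical action component: declares what it consumes and produces."""
    name: str
    inputs: Dict[str, type]   # input name -> required type
    output: type              # type of the single output
    run: Callable[..., Any]   # the work itself

    def describe(self) -> str:
        ins = ", ".join(
            f"{key}: {typ.__name__}" for key, typ in sorted(self.inputs.items())
        )
        return f"{self.name}({ins}) -> {self.output.__name__}"


def gather_inputs(action, upstream, user, params):
    """Fill each declared input from upstream outputs, user input, or parameters."""
    values = {}
    for name, required in action.inputs.items():
        for source in (upstream, user, params):
            if name in source:
                value = source[name]
                if not isinstance(value, required):
                    raise TypeError(f"{name!r} must be {required.__name__}")
                values[name] = value
                break
        else:
            # No source supplied this input, so the action cannot run.
            raise KeyError(f"no source supplies input {name!r}")
    return values


# An assumed example action: give a sequence a new label.
relabel = Action(
    name="relabel",
    inputs={"seq": Sequence, "label": str},
    output=Sequence,
    run=lambda seq, label: Sequence(label, seq.residues),
)
```

A GUI could show relabel.describe() to the user and call gather_inputs()
before running the action, so a missing or mistyped connection is caught
before any work is done.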

The first problem is how to format the outputs of one action as inputs
for another.  The bio* projects solve this by providing standard
interfaces that everything uses.
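
A minimal illustration of the standard-interface idea, again in Python and
again with invented names (SequenceLike, FastaRecord, DatabaseRow,
gc_fraction are assumptions, not bio* classes): two unrelated producers
expose the same two methods, so a consumer written against the interface
accepts output from either.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SequenceLike(Protocol):
    """Hypothetical shared interface that every sequence source implements."""
    def name(self) -> str: ...
    def residues(self) -> str: ...


class FastaRecord:
    """Sequence parsed from FASTA text (header line plus body)."""
    def __init__(self, header: str, body: str):
        self._header = header
        self._body = body

    def name(self) -> str:
        return self._header.lstrip(">").split()[0]

    def residues(self) -> str:
        return self._body.replace("\n", "")


class DatabaseRow:
    """Sequence backed by a row fetched from some database."""
    def __init__(self, row: dict):
        self._row = row

    def name(self) -> str:
        return self._row["acc"]

    def residues(self) -> str:
        return self._row["seq"]


def gc_fraction(seq: SequenceLike) -> float:
    """Consumer that depends only on the interface, not on the producer."""
    s = seq.residues().upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0
```

Neither producer knows about the other, and gc_fraction() needs no
conversion step between them.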

The second, and I think harder, problem is that you end up with too many
types of data objects running around and too many types of actions.

Consider a pipeline where you start with genbank accession numbers,
fetch the sequences, blast the sequences against a local database, and
do something with the sequences based on the output.  The first part is
easy to specify.  Input is a list of strings, output is a Sequence
object.  The sequence object goes to the blast component.  But having a
GUI specify how to process the blast output is hard because there are
lots of possibilities.  Trying to specify what should happen through a
GUI seems like it would be either very confusing (e.g. a long list of
options) or very limiting (a short list of options).
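
The pipeline above could be sketched roughly like this.  The fetch and
blast steps are canned stand-ins (FAKE_GENBANK and FAKE_HITS are invented
data, not real GenBank or BLAST calls), and all the names are hypothetical.
The point is the last stage: the first two steps have one obvious shape,
while the post-BLAST step is an arbitrary function, and the two shown here
are only two of many plausible choices.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Seq:
    accession: str
    residues: str


@dataclass
class Hit:
    query: str
    subject: str
    evalue: float


# Stand-in for a GenBank lookup; a real version would query a sequence database.
FAKE_GENBANK = {"ACC1": "ACGTACGT", "ACC2": "TTTTGGGG"}

# Stand-in for BLAST against a local database; returns canned hits.
FAKE_HITS = {
    "ACC1": [Hit("ACC1", "dbA", 1e-30), Hit("ACC1", "dbB", 0.5)],
    "ACC2": [],
}


def fetch(accessions: List[str]) -> List[Seq]:
    """Easy step: a list of strings in, Seq objects out."""
    return [Seq(acc, FAKE_GENBANK[acc]) for acc in accessions]


def blast(seqs: List[Seq]) -> Dict[str, List[Hit]]:
    """Easy step: Seq objects in, hits per query out."""
    return {s.accession: FAKE_HITS[s.accession] for s in seqs}


# Hard step: what to do with the sequences given the hits.
def keep_with_significant_hit(seqs, hits, cutoff=1e-5):
    return [s for s in seqs if any(h.evalue <= cutoff for h in hits[s.accession])]


def keep_without_hits(seqs, hits):
    return [s for s in seqs if not hits[s.accession]]


def run_pipeline(accessions: List[str], post_blast: Callable) -> List[Seq]:
    seqs = fetch(accessions)
    hits = blast(seqs)
    return post_blast(seqs, hits)
```

Swapping keep_with_significant_hit for keep_without_hits changes what the
pipeline means, and a GUI would need a way to express each such variant
(thresholds, best hit only, grouping by subject, and so on).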

I think the best way to start on a toolbox that users could use is to
build a toolbox for programmers that provides useful components and a
GUI.  Hopefully you would have to write less and less code as time goes
by, to the point where users could design their own process without any
coding.

Alex