[Biojava-l] BioInformatics toolbox.
Alex Rolfe
arolfe@genome.wi.mit.edu
10 Apr 2002 16:16:24 -0400
A number of people have pointed out that several GUIs exist for
connecting components into pipelines (and I'll add my own: the
biojava-lims code that I've been working on) and that the existing
bio{perl,java} classes could probably be extended or wrapped to fit into
these frameworks. But I don't think that the combination would yield a
viable system to let users create their own programs.
When you extend bio{perl,java} classes to get components for these
GUIs, you end up with two types: data and actions. Data components
(like java beans) would be able to describe their properties. Action
components would need to describe the format/requirements for their
inputs and outputs. Action components would get their inputs from
- the outputs of other action components
- user inputs
- parameters you specify for the program.
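
To make the data/action split concrete, here is a rough Java sketch of
what an action component's self-description might look like. None of
these names (ActionComponent, UpperCase, Length, Wiring) exist in
bioperl or biojava; they are made up for illustration, and the toy
actions work on plain strings rather than real sequence objects. The
inputs map is assumed to be filled from any of the three sources listed
above.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical interface, not part of bioperl or biojava: an action
// declares its named inputs, its output type, and how to run.
interface ActionComponent {
    Map<String, Class<?>> inputTypes();
    Class<?> outputType();
    Object run(Map<String, Object> inputs);
}

// Toy action: upper-cases the string supplied on its "text" input.
final class UpperCase implements ActionComponent {
    public Map<String, Class<?>> inputTypes() {
        return Collections.<String, Class<?>>singletonMap("text", String.class);
    }
    public Class<?> outputType() { return String.class; }
    public Object run(Map<String, Object> inputs) {
        return ((String) inputs.get("text")).toUpperCase();
    }
}

// Toy action: reports the length of the string on its "text" input.
final class Length implements ActionComponent {
    public Map<String, Class<?>> inputTypes() {
        return Collections.<String, Class<?>>singletonMap("text", String.class);
    }
    public Class<?> outputType() { return Integer.class; }
    public Object run(Map<String, Object> inputs) {
        return Integer.valueOf(((String) inputs.get("text")).length());
    }
}

final class Wiring {
    // True if upstream's declared output type can feed the named input
    // of downstream. This is the check a GUI would make before it lets
    // the user draw a connection between two boxes.
    static boolean canConnect(ActionComponent upstream,
                              ActionComponent downstream,
                              String inputName) {
        Class<?> wanted = downstream.inputTypes().get(inputName);
        return wanted != null && wanted.isAssignableFrom(upstream.outputType());
    }
}
```

With these toy classes, Wiring.canConnect(new UpperCase(), new Length(),
"text") is true, while the reverse connection is rejected because an
Integer cannot feed a String input.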
The first problem is how to format the outputs of one action as inputs
for another. The bio* projects solve this by providing standard
interfaces that everything uses.
The second, and I think harder, problem is that you end up with too many
types of data objects running around and too many types of actions.
Consider a pipeline where you start with GenBank accession numbers,
fetch the sequences, BLAST the sequences against a local database, and
do something with the sequences based on the output. The first part is
easy to specify. Input is a list of strings, output is a Sequence
object. The sequence object goes to the BLAST component. But having a
GUI specify how to process the BLAST output is hard because there are
lots of possibilities. Trying to specify what should happen through a
GUI seems like it would either be very confusing (e.g. a long list of
options) or very limiting (a short list of options).
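
Here is a small Java sketch of why that last step is awkward for a GUI.
Hit and Pipeline are hypothetical stand-ins I made up for this example,
not the biojava BLAST classes, and the fields on Hit are only an
assumption about what a hit might carry. In code, the rule for "do
something based on the output" is just an arbitrary predicate; a GUI
would have to enumerate every rule a user might want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for one BLAST hit (not the biojava API).
final class Hit {
    final String queryId;
    final String subjectId;
    final double eValue;

    Hit(String queryId, String subjectId, double eValue) {
        this.queryId = queryId;
        this.subjectId = subjectId;
        this.eValue = eValue;
    }
}

final class Pipeline {
    // Returns the IDs of the queries that have at least one hit
    // accepted by the caller's rule, in first-seen order and without
    // duplicates.
    static List<String> selectQueries(List<Hit> hits, Predicate<Hit> rule) {
        List<String> kept = new ArrayList<>();
        for (Hit h : hits) {
            if (rule.test(h) && !kept.contains(h.queryId)) {
                kept.add(h.queryId);
            }
        }
        return kept;
    }
}
```

A programmer can write Pipeline.selectQueries(hits, h -> h.eValue <
1e-10) or any other rule (by subject ID, by a combination of fields, and
so on) in one line. Offering the same freedom through menus is where the
long list of options comes from.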
I think the best way to start on a toolbox that users could use is to
build a toolbox for programmers that provides useful components and a
GUI. Hopefully you would have to write less and less code as time goes
by, to the point where users could design their own process without any
coding.
Alex