[Bioperl-l] remote blast job queuing

Jason Stajich jason@cgt.mc.duke.edu
Thu, 20 Dec 2001 10:37:41 -0500 (EST)


Steve -

Happy to see your plan here - I had imagined that reports would be written
by a write_report() method in the Bio::SearchIO implementations, just as we
have a write_seq() method in SeqIO.  So blast.pm, blastxml.pm, fasta.pm,
etc. could each provide their own implementation for outputting a data
stream.  A Writer helper class is fine, but I am hoping we can
make bioperl as intuitive as possible and if all our IO systems can
function in similar ways it makes learning the objects much easier.  Just
a thought but feel free to convince me otherwise.
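
As a rough sketch of that idea: each format driver supplies its own
write_report(), and the caller picks the driver by format name, much as
SeqIO does for write_seq().  The Sketch::* packages, the report hash layout
and the two output layouts are hypothetical stand-ins for illustration, not
existing Bio::SearchIO code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SearchIO-style front end (not the real module).
package Sketch::SearchIO;

# Pick the format-specific driver by name, similar in spirit to SeqIO.
sub new {
    my ($class, %args) = @_;
    my $format = lc($args{'-format'} || 'blast');
    my $driver = "Sketch::SearchIO::$format";
    die "unknown format '$format'\n" unless $driver->can('write_report');
    return bless { out => $args{'-fh'} }, $driver;
}

# Driver that writes a (made-up) blast-like text layout.
package Sketch::SearchIO::blast;
our @ISA = ('Sketch::SearchIO');

sub write_report {
    my ($self, $report) = @_;
    my $fh = $self->{out};
    print {$fh} "Query= $report->{query}\n";
    print {$fh} "  $_->{name}  $_->{score}\n" for @{ $report->{hits} };
    return 1;
}

# Driver that writes the same data as tab-separated lines.
package Sketch::SearchIO::tab;
our @ISA = ('Sketch::SearchIO');

sub write_report {
    my ($self, $report) = @_;
    my $fh = $self->{out};
    print {$fh} join("\t", $report->{query}, $_->{name}, $_->{score}), "\n"
        for @{ $report->{hits} };
    return 1;
}

package main;

# Write one report in the requested format and return the text.
sub render {
    my ($format, $report) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    Sketch::SearchIO->new(-format => $format, -fh => $fh)->write_report($report);
    close $fh;
    return $buffer;
}

my $report = {
    query => 'query1',
    hits  => [ { name => 'hitA', score => 250 }, { name => 'hitB', score => 90 } ],
};
print render($_, $report) for qw(blast tab);
```

The calling code is identical for every format; only the driver differs,
which is the property that makes the IO systems easy to learn together.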

> Another thing that's not in Jason's or my Bio::Search:: stuff is any sort
> of functionality related to running new search jobs. As Ewan mentioned before
> (http://bioperl.org/pipermail/bioperl-l/2001-July/006020.html), it would
> be good to tie into a novella-like system. (I believe Martin mentioned to
> me the possibility of contributing novella to Bioperl. Don't know what the
> current thinking here is).
>
> Steve

I'd like to propose keeping anything that runs a job in the Tools::Run
directory if possible - these would pass back a Bio::SearchIO stream, in
much the same way that the search tools currently in that directory return
a parser object (Run::StandAloneBlast returns a BPlite or Tools::Blast
object).

What I think we need to do is flesh out the Factory::ApplicationFactoryI
that Heikki started for EMBOSS and then provide a specific factory for the
apps we are interested in.  All the current ways one can run blast
(Tools::Run::StandAloneBlast and Tools::Run::RemoteBlast) can comply with
this interface.  The same goes for implementations of Pise and Novella
clients for remote analysis and EMBOSS clients for local analysis (remote
EMBOSS would likely be mediated through one of the remote clients).
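
A minimal sketch of what such a shared interface could look like: every
runner implements run() and hands back a stream with a next_result()
iterator, so the calling code does not care whether the job ran locally or
remotely.  All package and method names below are hypothetical; this is not
the actual Factory::ApplicationFactoryI, and the runners only fake their
results instead of launching anything.

```perl
use strict;
use warnings;

# Hypothetical interface: subclasses must override run().
package Sketch::ApplicationFactoryI;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub run {
    my $self = shift;
    die ref($self) . " must implement run()\n";
}

# Tiny result stream standing in for a SearchIO-style stream.
package Sketch::ResultStream;

sub new {
    my ($class, @results) = @_;
    return bless { queue => [@results] }, $class;
}

# Return the next result, or undef when the stream is exhausted.
sub next_result {
    my $self = shift;
    return shift @{ $self->{queue} };
}

# A real local runner would launch an executable here; this one fakes it.
package Sketch::Run::Local;
our @ISA = ('Sketch::ApplicationFactoryI');

sub run {
    my ($self, @queries) = @_;
    return Sketch::ResultStream->new(
        map { join(':', 'local', $self->{program}, $_) } @queries);
}

# A real remote runner would submit to a server and poll; this one fakes it.
package Sketch::Run::Remote;
our @ISA = ('Sketch::ApplicationFactoryI');

sub run {
    my ($self, @queries) = @_;
    return Sketch::ResultStream->new(
        map { join(':', 'remote', $self->{program}, $_) } @queries);
}

package main;

# Works with any runner that follows the interface.
sub collect {
    my ($runner, @queries) = @_;
    my $stream = $runner->run(@queries);
    my @seen;
    while (defined(my $result = $stream->next_result)) {
        push @seen, $result;
    }
    return @seen;
}

for my $runner (Sketch::Run::Local->new(program => 'blastp'),
                Sketch::Run::Remote->new(program => 'blastp')) {
    print "$_\n" for collect($runner, 'seq1', 'seq2');
}
```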

It would make sense (in my mind) to try to have all the implementations
for running apps in a common namespace.  My hope is that some of the hard
work in defining the necessary interfaces that has been done with Pise and
BSA/NOVELLA could be used to build the interfaces in bioperl.

So there is some work to do to unify these things - but it would be great to have an
interchangeable system for analysis programs and compute queues with the
concept of the data object, the parser, and the analysis data stream
all decoupled much the same way we have done for sequence objects,
parsers, and databases.

Maybe I am thinking too big here?

-jason
-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu