[Bioperl-l] Pise Factory and Application classes

Catherine Letondal pise@pasteur.fr
Mon, 20 May 2002 14:44:12 +0200


Hi,

I have adapted Pise/Bioperl to the previously discussed remote
submission interface.
ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/Bioperl-5.a.tar.gz
contains the files for the Bio/Factory and Bio/Tools/Run bioperl
directories (it also contains files for generating the program-specific .pm
classes and some other files).

There is a factory class to create application objects:
	my $factory = Bio::Factory::Pise->new(-email => $email);
	my $genscan = $factory->program('genscan',
				-parameter_file => "Arabidopsis.smat"
				);

Parameters can be set at application object creation (as above), or later:
	$genscan->seq($seq);
	my $job = $genscan->run();
or:
	my $job = $genscan->run(-seq => $seq);

Input data for sequences, alignments, etc. (InFile and Sequence parameters)
may be given as Bio::Seq or Bio::SimpleAlign objects, strings, or
just files.

The remote address (CGI URL) has a default value (for now at Pasteur...), but
you can also set it:
       1) at factory creation
           my $factory = Bio::Factory::Pise->new(-remote => 'http://somewhere/Pise/cgi-bin',
						 -email => 'me@myhome');
       2) at program creation:
           my $program = $factory->program('water', 
					   -remote => 'http://somewhere/Pise/cgi-bin/water.pl'
					   );
       3) at any time before running:
           $program->remote('http://somewhere/Pise/cgi-bin/water.pl');
           $job = $program->run();

       4) when running:
	   $job = $program->run(-remote => 'http://somewhere/Pise/cgi-bin/water.pl');


The job object returned by ->run (Bio::Tools::Run::PiseJob) can test for
termination or errors, return or save results, give you a filehandle on a
result, pipe its results into another program, etc.

You can create a job object from a previously run job with:
	my $job = Bio::Factory::Pise->job($url);
This URL may be obtained at job submission with:
	my $url = $job->jobid;

You can also submit asynchronously:
	$job = $program->submit();
and wait for termination ($job->terminated). This lets you submit to
different hosts in parallel, for long computations for instance.
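
A minimal sketch of the polling loop this implies, assuming only that each
job object answers ->terminated as described above. MockJob, its name and
polls arguments, and wait_for_all are hypothetical stand-ins so that the
example runs without a Pise server; they are not part of the Pise classes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Pise job: it only mimics ->terminated and
# reports completion after a fixed number of polls.
package MockJob;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, polls_left => $args{polls} }, $class;
}

sub name { return $_[0]{name} }

sub terminated {
    my ($self) = @_;
    return 1 if $self->{polls_left} <= 0;
    $self->{polls_left}--;
    return 0;
}

package main;

# Poll every job until all of them report termination. Returns the job
# names in the order in which they were seen to finish.
sub wait_for_all {
    my ($interval, @jobs) = @_;
    my @finished;
    while (@jobs) {
        my @pending;
        for my $job (@jobs) {
            if ($job->terminated) { push @finished, $job->name }
            else                  { push @pending,  $job }
        }
        @jobs = @pending;
        sleep $interval if @jobs && $interval;   # pause between rounds
    }
    return @finished;
}

my @order = wait_for_all(0,
    MockJob->new(name => 'genscan', polls => 2),
    MockJob->new(name => 'water',   polls => 0),
);
print join(',', @order), "\n";
```

With real jobs, the list passed to wait_for_all would be the objects
returned by ->submit, and the interval would be a few seconds rather than 0.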

We haven't discussed everything about error handling yet. Sometimes
you get an exception, sometimes you have to check $job->error yourself.
This can be changed; I looked at how the standalone blast and EMBOSS
application classes handle errors, and I don't know what to decide.
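
Until that is settled, calling code could fold both cases into one check.
The following is only a sketch of that idea, not a proposal for the final
interface: run_checked, MockProgram and MockJob are hypothetical names, and
the mocks merely imitate a ->run that may die and a job whose ->error may
be true.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a job object; only ->error is imitated.
package MockJob;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub error { return $_[0]{error} }

# Hypothetical stand-in for an application object; ->run either dies
# or returns a job whose error flag may be set.
package MockProgram;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub run {
    my ($self) = @_;
    die "submission failed\n" if $self->{throws};
    return MockJob->new(error => $self->{job_error});
}

package main;

# Returns ($job, undef) on success and (undef, $message) on failure,
# whether the failure came as an exception or as a true $job->error.
sub run_checked {
    my ($program, @args) = @_;
    my $job = eval { $program->run(@args) };
    if (!defined $job) {
        my $reason = $@ || "run returned no job\n";
        chomp $reason;
        return (undef, "exception: $reason");
    }
    return (undef, 'job reported an error') if $job->error;
    return ($job, undef);
}

for my $case ([ok => {}], [thrown => { throws => 1 }], [flagged => { job_error => 1 }]) {
    my ($label, $opts) = @$case;
    my ($job, $err) = run_checked(MockProgram->new(%$opts));
    print "$label: ", (defined $err ? $err : 'ok'), "\n";
}
```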

I have a problem with multiple selection parameters.
The PiseJob class uses LWP::UserAgent to POST the request, and I can't
see how to post multiple selection fields. There is only one such
parameter in all of Pise (clustalw hgapresidues), so it's not really a problem.
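
One possible direction, offered as a sketch: an HTML form submits a multiple
selection as the same field name repeated once per selected value, so the
request body only needs to carry the name several times. The code below
builds such a body by hand using core Perl only; form_escape and form_body
are hypothetical helpers, and the field values are purely illustrative.
HTTP::Request::Common's POST, which LWP::UserAgent builds on, also accepts
an array reference of name/value pairs in which a name may repeat, so the
same flat list could probably be passed to it directly.

```perl
use strict;
use warnings;

# Percent-encode one component for application/x-www-form-urlencoded.
# Sketch only: assumes plain ASCII input.
sub form_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $text =~ tr/ /+/;   # spaces become '+' in form bodies
    return $text;
}

# Build a request body from a flat list of name/value pairs.
# A name may appear several times, which is how a multiple
# selection field is transmitted.
sub form_body {
    my @pairs = @_;
    my @parts;
    while (@pairs) {
        my ($name, $value) = splice(@pairs, 0, 2);
        push @parts, form_escape($name) . '=' . form_escape($value);
    }
    return join '&', @parts;
}

# Illustrative values only.
my $body = form_body(
    gapopen      => 10,
    hgapresidues => 'G',
    hgapresidues => 'P',
);
print "$body\n";   # gapopen=10&hgapresidues=G&hgapresidues=P
```

A hash cannot hold this list, since the second hgapresidues entry would
overwrite the first; the pairs have to stay in an array.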

I would be very happy to have your feedback and to get bug reports.

I'm sorry that there is no Makefile.PL etc. I have tried to set this
up, but ... :-( So any help would be very much appreciated :-)

Thanks to Leonardo Marino-Ramirez for testing!

-- 
Catherine Letondal -- Pasteur Institute Computing Center