[Bioperl-l] 1.01

Catherine Letondal letondal@pasteur.fr
Tue, 14 May 2002 10:33:22 +0200


Jason Stajich wrote:
> > 2) general execution parameters:
> >
> >  a) local or remote execution
> > 	- default could be local for EMBOSS and remote for Pise?
> > 	- in Pise, the default remote server could be different for different programs (I
> > 	mean, not only at Pasteur...:-) )
> >
> >     - so one should be able to choose between local/remote execution and, if remote, to
> >    choose a non-default server location; these choices could happen either at
> >    factory creation, or at application creation, or at run step:
> > 	# a) at factory creation
> > 	$factory = new Bio::Factory::Pise(-remote => 'http://somewhere/cgi-bin/Pise');
> >
> > 	# b) at application creation - take the default remote server
> > 	$needle = $factory->program('needle', -remote => 1);
> >
> > 	# c) at run time
> > 	$result = $mfold->run(-remote => 'http://bioweb.pasteur.fr/cgi-bin/seqanal/mfold.pl');
> >
> Exactly!
> We may want to fold this into some of the new Bio::Root::HTTPget so one
> can use proxies for  those behind firewalls.

Is this module able to issue a multipart form-data POST?

> > 3) parameters specification
> >
> >    a) when?
> > 	# at factory creation?
> > 	$water = $factory->program('water', sequencea => $seqa,  seqall => $seqb);
> > 	$result1 = $water->run();
> >
> > 	# before running?
> > 	$water->sequencea($seqc);
> > 	$result2 = $water->run();
> >
> > 	# when running?
> > 	$result3 = $water->run(sequencea => $seqd);
> >
> as part of running I guess - but one might want to be able to set some
> parameters in the factory objects like
> $factory->db('est');
> foreach my $seq ( @seqs ) {
> 	$result2->run(-sequencea => $seq);
> }
> So setting it in the factory would account for a default behavior in
> calls? Or is this making it too complicated?

No, this is the idea.
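
A minimal sketch of that layering, assuming parameters set on the factory act as defaults, parameters given at application creation override them, and parameters given at run time override both. The Sketch::Factory and Sketch::Application classes and the effective_params method are hypothetical names used only for illustration; they are not part of bioperl or Pise.

```perl
use strict;
use warnings;

# Hypothetical classes for illustration only; not an existing bioperl API.
package Sketch::Factory {
    sub new {
        my ($class, %defaults) = @_;
        return bless { defaults => {%defaults} }, $class;
    }

    # Create an application that inherits the factory defaults;
    # parameters passed here take precedence over those defaults.
    sub program {
        my ($self, $name, %params) = @_;
        return Sketch::Application->new($name, %{ $self->{defaults} }, %params);
    }
}

package Sketch::Application {
    sub new {
        my ($class, $name, %params) = @_;
        return bless { name => $name, params => {%params} }, $class;
    }

    # Parameters given at run time override everything set earlier.
    sub effective_params {
        my ($self, %run_params) = @_;
        return { %{ $self->{params} }, %run_params };
    }
}

package main;

my $factory = Sketch::Factory->new(-db => 'est');
my $water   = $factory->program('water', -gapopen => 10);
my $params  = $water->effective_params(-sequencea => 'ACGT', -gapopen => 12);
```

With this ordering a loop over sequences only has to pass the parameter that changes on each call; everything else comes from the factory or the application.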

> 
> >    b) how?  -name or name
> 
> I prefer -name and this is how emboss cmdline opts look so that's my vote,
> but happy to be swayed with a good argument against.
> 
> >
> >
> > 4) analysis results: what is it, a string, an object, ...?
> >
> >   $result = $fasta->run();
> >
> > 	- in Pise/bioperl $result is an instance of PiseJob, i.e. a kind of "handle" from
> >         which you can fetch results (image files, treefile, ...)
> > 	print $result->content("treefile");
> > 	print $result->stdout;
> > 	$result->save("blast2.txt");
> > 	etc...
> >
> > 	- in Bio::Tools::Run::EMBOSSApplication, it's a string (the actual result): don't
> > 	you think it's more general to have an object?
> >
> 
> I guess a string is best - initially, we can always try and define an
> appropriate object later?  For EMBOSS apps there are supposed to be a
> finite set of report formats so we could probably code up an
> EMBOSSReportReader, but not sure how useful that will be to people.

Just a string is not possible in Pise. There are many programs that return several
files as results, so we need an intermediary object to get them, say a "job" object, from a
class Bio::Tools::Run::PiseJob (similar to the PiseJob class of Pise/bioperl).

Ideally, this object should also work as a string, but not only as one. For instance:

$result = $neighbor->run();
print $result;				# prints stdout (i.e, main analysis result for
					# EMBOSS programs)
print $result->content("treefile");	# prints the treefile file's content

Is that possible in Perl? (I'm thinking of Python's __str__ method, which enables
such use.)
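
Perl does support this through the overload pragma, whose '""' key defines what an object yields when it is used as a string. Below is a minimal sketch, assuming a job object that keeps its result files in a hash; Sketch::Job and its internal layout are hypothetical, and the real PiseJob interface may differ.

```perl
use strict;
use warnings;

# Hypothetical job class; the real PiseJob interface may differ.
package Sketch::Job {
    use overload
        '""'     => sub { $_[0]->stdout },   # used whenever the object is stringified
        fallback => 1;                       # let eq, ., etc. fall back to the string form

    sub new {
        my ($class, %files) = @_;
        return bless { files => {%files} }, $class;
    }

    # Main analysis output.
    sub stdout { $_[0]->{files}{stdout} }

    # Content of a named result file.
    sub content { $_[0]->{files}{ $_[1] } }
}

package main;

my $result = Sketch::Job->new(
    stdout   => "neighbor finished\n",
    treefile => "(A,(B,C));\n",
);

my $as_string = "$result";                     # stringification calls stdout
my $tree      = $result->content('treefile');  # still usable as a normal object
```

So "print $result" would print the stdout text, while method calls such as content() keep working on the same object.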

> 
> > 5) use of analysis result:
> >
> >    - it's convenient to be able to build a handle from a result, in order
> >    to feed it to bioperl parsers or to other programs
> >
> >     $aln = Bio::AlignIO->newFh (-fh => $needle_result->fh("outfile.align"),
> >                                 -format => "fasta");
> >     $neighbor = $factory->program('neighbor', infile => $protdist_job->fh('outfile'));
> >
> >    - construct an analysis result from an ID:
> >     $neighbor = $factory->result('http://bioweb.pasteur.fr/seqanal/tmp/blast2/A12465102130064/')
> >
> Yes definitely - If we can wrap HTTPget as a filehandle then it can be
> passed directly to the parser.  Otherwise using LWP you have to download
> the data and either write to a tempfile or wrap the data string with
> IO::String to make it behave like a fh (see DB::WebDBSeqI).

This is the way it works in Pise/bioperl - more exactly, it's through a list.
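
As an alternative to a temporary file or IO::String, Perl 5.8 and later can open a filehandle directly on a scalar. A minimal sketch, assuming the result has already been downloaded into a string; the data below is made up for illustration.

```perl
use strict;
use warnings;

# Result data already fetched from the server (made-up content).
my $data = ">seq1\nACGT\n>seq2\nTTGA\n";

# Open a read handle on the string itself; no temporary file is needed.
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

# Read from it like from any other filehandle.
my @ids;
while (my $line = <$fh>) {
    push @ids, $1 if $line =~ /^>(\S+)/;
}
close $fh;
```

A handle like $fh could then be given to anything that expects a readable filehandle.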

> 
> > 6) misc:
> >
> >  - It should be possible to issue an asynchronous run request (to enable parallel
> >    execution for long jobs)
> >
> >
> > How is all that compatible with OpenBSA?
> >
> 
> I think all of this is exactly what I was thinking, we just need to try
> and code up the prototype/port from your PISE code into bioperl (assume it
> is okay if we slurp this in?).  Some generalization may be needed for the
> server communication as the Novella connection will be a CORBA not HTTP?
> Or is this just handled in the client code and based on the URL that is
> passed with the -remote flag.
> 

It could be encoded in the "job" class (Bio::Tools::Run::PiseJob), I guess. 
A Bio::Tools::Run::PiseApplication run method would behave differently from
a Bio::Tools::Run::EMBOSSApplication or a Bio::Tools::Run::NovellaApplication. 
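
A rough sketch of what "behaves differently" could look like, assuming each application class overrides a common run method. All class names below are hypothetical stand-ins, and the return values are placeholders for the real local execution or server communication.

```perl
use strict;
use warnings;

# All class names below are hypothetical stand-ins.
package Sketch::App {
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name }, $class;
    }

    # Subclasses are expected to provide their own run.
    sub run { die ref($_[0]) . " must implement run\n" }
}

package Sketch::App::Local {
    our @ISA = ('Sketch::App');

    # Placeholder for executing the program on the local machine.
    sub run { my $self = shift; return "local:$self->{name}" }
}

package Sketch::App::Remote {
    our @ISA = ('Sketch::App');

    # Placeholder for submitting the job to a remote server.
    sub run { my $self = shift; return "http:$self->{name}" }
}

package main;

# The caller uses the same run call whatever the transport.
my @how = map { $_->run }
    (Sketch::App::Local->new('water'), Sketch::App::Remote->new('mfold'));
```

The calling code stays the same for every transport; only the class chosen when the application is created decides how the job is actually executed.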

--
Catherine Letondal -- Pasteur Institute Computing Center