[Bioperl-pipeline] job dependence

Elia Stupka elia at tll.org.sg
Sun Mar 16 11:56:58 EST 2003


> From your comments, I would like to ask others on the list whether 
> they think that setting up new jobs
> based on the return_ids of completed jobs should continue to be 
> supported. My feeling is that InputCreates ultimately provide more 
> flexibility.

I guess InputCreates are the way to go, and we could just have a 
DefaultInputCreate which emulates the former behaviour.
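For the record, a rough sketch of what such a DefaultInputCreate might 
look like. The package name, constructor arguments, and the 
create_jobs_from_return_ids helper below are assumptions for 
illustration only, not the actual Biopipe API; the adaptor calls 
(fetch_all, check_dependency_by_job, dependency, store) mirror the code 
quoted further down in this thread:

```perl
# Hypothetical sketch only: module name, constructor and helper
# method are assumptions, not the real Biopipe API.
package Bio::Pipeline::InputCreate::Default;

use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    # adaptors would be handed in by the pipeline framework
    my $self = {
        job_adaptor  => $args{-job_adaptor},
        rule_adaptor => $args{-rule_adaptor},
    };
    return bless $self, $class;
}

# Emulate the former behaviour: for each completed job, create the
# jobs for the next analysis from that job's return ids, then set
# the dependency flag exactly as PipelineManager.pl does.
sub run {
    my ($self, @completed_jobs) = @_;
    my @rules = $self->{rule_adaptor}->fetch_all;
    foreach my $job (@completed_jobs) {
        # create_jobs_from_return_ids is a hypothetical helper
        # standing in for the old return_id-based job creation
        foreach my $new_job ($self->create_jobs_from_return_ids($job)) {
            my $depend = $self->{rule_adaptor}
                              ->check_dependency_by_job($new_job, @rules);
            $new_job->dependency($depend);
            $self->{job_adaptor}->store($new_job);
        }
    }
    return;
}

1;
```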

Elia


>
>
> shawn
>
>
>
>> I'm marking my changes with ->.
>>
>> There appear to be two primary places jobs are queued in the
>> PipelineManager.pl script. The first looks like:
>>
>>     foreach my $job(@incomplete_jobs){
>>
>>         #check whether output of job needed for downstream analysis
>>         my $job_depend = 
>> $ruleAdaptor->check_dependency_by_job($job, at rules);
>>         $job->dependency($job_depend);
>>
>> The second loop occurs after this and basically sets up new jobs 
>> based on
>> completed jobs. It looks like:
>>
>>     foreach my $job (@completed_jobs) {
>>       my ($new_jobs) = &create_new_job($job);
>>       if(scalar(@{$new_jobs})){
>>         print STDERR "Creating ".scalar(@{$new_jobs})." jobs\n";
>>       }
>>       foreach my $new_job (@{$new_jobs}){
>> ->      my $job_depend =
>> $ruleAdaptor->check_dependency_by_job($new_job, @rules);
>> ->      $new_job->dependency($job_depend);
>>
>
>> Above, you can see that I simply copied the dependency check from the
>> first section into the appropriate place in the second section. I 
>> checked
>> the Bio::Pipeline::Manager module and these sections are still 
>> basically
>> the same.
>>
>> In runner.pl, I also added a variable for the rule adaptor so that I 
>> could
>> access the check_dependency_by_job function. The changes in runner.pl 
>> look
>> like:
>>
>> sub create_new_job{
>>     my ($job_id) = @_;
>>     my $job = $job_adaptor->fetch_by_dbID($job_id);
>> ->  my $ruleAdaptor = $job_adaptor->db->get_RuleAdaptor;
>> ->  my @rules       = $ruleAdaptor->fetch_all;
>>     my $action = _get_action_by_next_anal($job, @rules);
>>     if ($action eq 'WAITFORALL_AND_UPDATE'){
>>       my @inputs = $job->inputs;
>>       $job->flush_inputs();
>>       $job->add_input(\@inputs);
>>     }
>>
>>     # also need to check if next job depends on output from this job
>> ->  my $job_depend = 
>> $ruleAdaptor->check_dependency_by_job($job, @rules);
>> ->  $job->dependency($job_depend);
>>
>>     return $job;
>> }
>>
>>
>>> The popular way now to populate the job and input tables (among the
>>> few of us using biopipe here) is to use InputCreate, which is a 
>>> little job that resides between
>>> analyses and has an IOHandler that fetches input ids and creates 
>>> jobs
>>> for the next analysis. This is a more flexible scheme in which to 
>>> embed more specific
>>> code that knows how to pair up inputs for a given job.
>>
>> I'll take a look and maybe post some questions to the mailing list. 
>> Thanks.
>>
>> Jeremy
>>
>>
>
> _______________________________________________
> bioperl-pipeline mailing list
> bioperl-pipeline at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-pipeline
