[Bioperl-pipeline] job dependence

Sun Mar 16 10:26:00 EST 2003

> Hi Jeremy,

Following your comments, I would like to ask others on the list whether
they think that creating new jobs based on the return_ids of completed
jobs should continue to be supported. My feeling is that InputCreates
ultimately provide more flexibility.

shawn

> I'm marking my changes with ->.
>
> There appear to be two primary places jobs are queued in the
> PipelineManager.pl script. The first looks like:
>
>     foreach my $job(@incomplete_jobs){
>
>         #check whether output of job needed for downstream analysis
>         my $job_depend = $ruleAdaptor->check_dependency_by_job($job, @rules);
>         $job->dependency($job_depend);
>
> The second loop occurs after this and basically sets up new jobs
> based on completed jobs. It looks like:
>
>     foreach my $job (@completed_jobs) {
>       my ($new_jobs) = &create_new_job($job);
>       if(scalar(@{$new_jobs})){
>         print STDERR "Creating ".scalar(@{$new_jobs})." jobs\n";
>       }
>       foreach my $new_job (@{$new_jobs}){
> ->      my $job_depend = $ruleAdaptor->check_dependency_by_job($new_job, @rules);
> ->      $new_job->dependency($job_depend);
>

> Above, you can see that I simply copied the dependency check from the
> first section into the appropriate place in the second section. I
> checked the Bio::Pipeline::Manager module and these sections are still
> basically the same.
>
> In runner.pl, I also added a variable for the rule adaptor so that I
> could access the check_dependency_by_job function. The changes in
> runner.pl look like:
>
> sub create_new_job{
>     my ($job_id) = @_;
>     my $job = $job_adaptor->fetch_by_dbID($job_id);
> ->  my $ruleAdaptor = $job_adaptor->db->get_RuleAdaptor;
> ->  my @rules       = $ruleAdaptor->fetch_all;
>     my $action = _get_action_by_next_anal($job, @rules);
>     if ($action eq 'WAITFORALL_AND_UPDATE'){
>       my @inputs = $job->inputs;
>       $job->flush_inputs();
>       $job->add_input(\@inputs);
>     }
>
>     # also need to check if next job depends on output from this job
> ->  my $job_depend = $ruleAdaptor->check_dependency_by_job($job, @rules);
> ->  $job->dependency($job_depend);
>
>     return $job;
> }
>
>
>> The popular way now to populate the job and input tables (among the
>> few of us using biopipe here) is to use InputCreate, which is a little
>> job that resides between analyses and has an IOHandler that fetches
>> input ids and creates jobs for the next analysis. This is a more
>> flexible scheme in which you can embed more specific code that knows
>> how to pair up inputs for a given job.
>
> I'll take a look and maybe post some questions to the mailing list. 
> Thanks.
>
> Jeremy
>
>