[Bioperl-pipeline] job dependence
Elia Stupka
elia at tll.org.sg
Sun Mar 16 11:56:58 EST 2003
> From your comments, I would like to ask others on the list whether
> they think that setting up new jobs based on the return_ids of
> completed jobs should continue to be supported. My feeling is that
> InputCreates ultimately provide more flexibility.
I guess InputCreates are the way to go, and we could just have a
DefaultInputCreate which emulates the former.
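As a rough illustration of what such a DefaultInputCreate could look like, here is a minimal sketch that reproduces the old behaviour of creating one downstream job per returned id. The package name, the run() signature and the job-factory callback are all assumptions for illustration, not the actual biopipe API:

```perl
#!/usr/bin/perl -w
# Hypothetical sketch only: a DefaultInputCreate that emulates the former
# return_id scheme by creating one new job per output id of a completed job.
use strict;

package DefaultInputCreate;

sub new {
    my ($class, %args) = @_;
    # -job_factory: code ref that turns an input id into a new job record
    return bless { job_factory => $args{-job_factory} }, $class;
}

# Take the output ids returned by completed jobs and create one new job
# per id for the next analysis.
sub run {
    my ($self, @output_ids) = @_;
    return map { $self->{job_factory}->($_) } @output_ids;
}

package main;

# Toy job factory standing in for the real job adaptor
my $creator = DefaultInputCreate->new(
    -job_factory => sub { { input_id => $_[0], status => 'NEW' } },
);
my @jobs = $creator->run(101, 102, 103);
print scalar(@jobs), " jobs created\n";   # prints "3 jobs created"
```

The real implementation would of course go through the job adaptor rather than a callback; the point is only that the old behaviour fits naturally into the InputCreate shape.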
Elia
>
>
> shawn
>
>
>
>> I'm marking my changes with ->.
>>
>> There appear to be two primary places jobs are queued in the
>> PipelineManager.pl script. The first looks like:
>>
>> foreach my $job (@incomplete_jobs) {
>>
>>     # check whether output of job needed for downstream analysis
>>     my $job_depend = $ruleAdaptor->check_dependency_by_job($job, @rules);
>>     $job->dependency($job_depend);
>>
>> The second loop occurs after this and basically sets up new jobs
>> based on
>> completed jobs. It looks like:
>>
>> foreach my $job (@completed_jobs) {
>>     my ($new_jobs) = &create_new_job($job);
>>     if (scalar(@{$new_jobs})) {
>>         print STDERR "Creating ".scalar(@{$new_jobs})." jobs\n";
>>     }
>>     foreach my $new_job (@{$new_jobs}) {
>> ->      my $job_depend = $ruleAdaptor->check_dependency_by_job($new_job, @rules);
>> ->      $new_job->dependency($job_depend);
>>
>
>> Above, you can see that I simply copied the dependency check from the
>> first section into the appropriate place in the second section. I
>> checked
>> the Bio::Pipeline::Manager module and these sections are still
>> basically
>> the same.
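To make the pattern concrete, the dependency check described above can be sketched in standalone form with a mock rule adaptor. The mock's check_dependency_by_job is a stand-in with the same call shape as in PipelineManager.pl, not the real Bio::Pipeline implementation:

```perl
#!/usr/bin/perl -w
# Illustrative sketch of the dependency-check loop, using a mock rule
# adaptor in place of the real Bio::Pipeline RuleAdaptor.
use strict;

# A toy rule adaptor: a job is "depended on" if any rule lists its
# analysis id as the condition for a later analysis.
package MockRuleAdaptor;
sub new { bless {}, shift }
sub check_dependency_by_job {
    my ($self, $job, @rules) = @_;
    foreach my $rule (@rules) {
        return 1 if $rule->{condition} == $job->{analysis_id};
    }
    return 0;
}

package main;

my $ruleAdaptor = MockRuleAdaptor->new;
my @rules = ({ condition => 1, goal => 2 }, { condition => 2, goal => 3 });

foreach my $job ({ analysis_id => 2 }, { analysis_id => 3 }) {
    # same call shape as in PipelineManager.pl
    my $job_depend = $ruleAdaptor->check_dependency_by_job($job, @rules);
    $job->{dependency} = $job_depend;
    print "analysis $job->{analysis_id} dependency: $job_depend\n";
}
```

Here analysis 2 is flagged as depended-on (a rule conditions on it) while analysis 3 is not, which is exactly the flag the manager uses to decide whether a job's output must be kept.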
>>
>> In runner.pl, I also added a variable for the rule adaptor so that I
>> could
>> access the check_dependency_by_job function. The changes in runner.pl
>> look
>> like:
>>
>> sub create_new_job {
>>     my ($job_id) = @_;
>>     my $job = $job_adaptor->fetch_by_dbID($job_id);
>> ->  my $ruleAdaptor = $job_adaptor->db->get_RuleAdaptor;
>> ->  my @rules = $ruleAdaptor->fetch_all;
>>     my $action = _get_action_by_next_anal($job, @rules);
>>     if ($action eq 'WAITFORALL_AND_UPDATE') {
>>         my @inputs = $job->inputs;
>>         $job->flush_inputs();
>>         $job->add_input(\@inputs);
>>     }
>>
>>     # also need to check if next job depends on output from this job
>> ->  my $job_depend = $ruleAdaptor->check_dependency_by_job($job, @rules);
>> ->  $job->dependency($job_depend);
>>
>>     return $job;
>> }
>>
>>
>>> The popular way now to populate the job and input tables (among the
>>> few of us using biopipe here) is to use an InputCreate, which is a
>>> little job that resides between analyses and has an IOHandler that
>>> fetches input ids and creates jobs for the next analysis. This is a
>>> more flexible scheme in which we can embed more specific code that
>>> knows how to pair up inputs for a given job.
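The "pair up inputs" idea can be sketched as an InputCreate-style step that matches two sets of input ids (say, queries and targets) into one job per pair. The function name, the job-spec hash and the id lists below are assumptions for illustration, not the actual Bio::Pipeline::InputCreate API:

```perl
#!/usr/bin/perl -w
# Hypothetical sketch of input pairing in an InputCreate-style step.
use strict;

# Pair each query id with its matching target id and emit one job spec
# per pair for the next analysis.
sub create_paired_jobs {
    my ($queries, $targets, $next_analysis_id) = @_;
    my @jobs;
    for my $i (0 .. $#{$queries}) {
        push @jobs, {
            analysis_id => $next_analysis_id,
            inputs      => [ $queries->[$i], $targets->[$i] ],
        };
    }
    return @jobs;
}

my @jobs = create_paired_jobs([ 'q1', 'q2' ], [ 't1', 't2' ], 5);
print scalar(@jobs), " paired jobs\n";   # prints "2 paired jobs"
```

This kind of pairing is exactly what the old return_id scheme could not express, since it only ever mapped one completed job's output to one new job.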
>>
>> I'll take a look and maybe post some questions to the mailing list.
>> Thanks.
>>
>> Jeremy
>>
>>
>
> _______________________________________________
> bioperl-pipeline mailing list
> bioperl-pipeline at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-pipeline