[Bioperl-pipeline] Pipeline has been saving my time.

Elia Stupka elia at tll.org.sg
Tue Mar 18 15:28:14 EST 2003


Dear Sang Chul,

great to see you are using BioPipe! comments below...


>       The system would get BAC clone data from a database, repeatmask 
> them, and then blast them. In addition to that, I need an analyzer for 
> locating the clones against the human genome, which comes
> from NCBI's CONTIG and contig map (seq_contig.md) data.

Just out of curiosity, do you really need to use their system for this? 
Can't you use blast for this as well?

>       So I changed the PipelineManager into a UNIX daemon which 
> processes jobs created by the command from the above scientist.

Sounds interesting, we also want to do the same, could you let us know 
more about how you achieved that?


> ===================================================================
>       There is one problem still unsolved. I create two jobs, one for
> each BAC end sequence. I would like to run my locating analysis
> only after both jobs have completed, not after just one.
> How do I do that? To this end, I created a runnable object derived
> from Bio::Pipeline::RunnableI. Hmmm... I'm sorry, this is beyond my
> capacity at the moment. I will keep trying until my work is complete.
> ===================================================================

If I understand you correctly, you would like to run repeatmasker, then 
blast, then the "locating analysis" and would like to make sure that 
the locating analysis happens after both are done. All you need is to 
make sure that blast depends on repeatmasker, and the locating analysis 
happens after blast, and that should take care of it, right?
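As a rough sketch (this is not BioPipe's actual API; the function and status names below are made up for illustration), the condition you want is simply "all upstream jobs completed", applied to both BAC-end blast jobs before the locating analysis starts:

```python
# Hypothetical sketch, not BioPipe internals: gate an analysis on the
# completion of *all* of its upstream jobs, the way the locating
# analysis should wait on both BAC-end blast jobs.

def ready_to_run(analysis, job_status, dependencies):
    """Return True only if every job the analysis depends on has COMPLETED."""
    return all(job_status.get(job) == "COMPLETED" for job in dependencies[analysis])

# Two blast jobs, one per BAC end; the locating analysis depends on both.
dependencies = {"locate": ["blast_end1", "blast_end2"]}

status = {"blast_end1": "COMPLETED", "blast_end2": "RUNNING"}
print(ready_to_run("locate", status, dependencies))  # False: one end still running

status["blast_end2"] = "COMPLETED"
print(ready_to_run("locate", status, dependencies))  # True: both ends done
```

In BioPipe terms, expressing the dependency chain repeatmasker -> blast -> locating analysis in the rule setup gives you exactly this behaviour without writing a custom runnable.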

> Another question: I'm sorry to ask before I fully understand how 
> PBS (the load-sharing system) works, but it's a simple one. Must I 
> install BioPipe on all the machines?

No. BioPipe (the database) only needs to be on one machine, and PBS or 
LSF or SGE are used to make sure jobs are distributed across the 
cluster.

> Is it OK if I install BioPipe on one machine and PBS on the 
> others?

You would have PBS on *all* machines you want to use to run jobs and 
the PBS daemon on your head node. BioPipe would then use the PBS 
interface to send jobs and retrieve results.
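Concretely, from the head node the pipeline just hands each job to PBS for dispatch. A minimal sketch of that handoff (qsub is the real PBS submit command; the queue name, script path, and helper function here are made-up examples, not BioPipe code):

```python
# Hypothetical sketch: how a pipeline on the head node might assemble a
# PBS submission for one job. Only the head node talks to PBS; PBS then
# dispatches the job to whichever compute node is free.

def build_qsub_command(script, queue="workq", nodes=1):
    """Assemble the qsub argument list for a single pipeline job."""
    return ["qsub", "-q", queue, "-l", f"nodes={nodes}", script]

cmd = build_qsub_command("/home/pipeline/run_blast.sh")
print(" ".join(cmd))
# qsub -q workq -l nodes=1 /home/pipeline/run_blast.sh
```

In a real setup you would pass that list to the PBS server (e.g. via subprocess) and record the returned job id so results can be collected later.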

Elia

---
Bioinformatics Program Manager
Temasek Life Sciences Laboratory
1, Research Link
Singapore 117604
Tel. +65 6874 4945
Fax. +65 6872 7007
