[Bioperl-pipeline] pipeline connections

Elia Stupka elia@fugu-sg.org
Mon, 23 Sep 2002 14:58:38 +0800 (SGT)


> It won't be trivial to differentiate them. Unless u say that jobs that
> fail in the running stage are 'biological' and those in
> reading/writing are node errors. Or else u gotta parse the error
> files.

Absolutely, didn't say it was trivial. Actually later on, when we can
breathe, I think doing a first-pass analysis of error files was always on
the to-do list, and we never got round to doing it. Self-diagnosis and
healing, a-la-Eliza-IBM ;)
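
As a rough illustration of the stage-based split suggested in the quoted
text, a minimal sketch might look like the following. The stage names
("reading", "running", "writing") come from the quote; the function name
and the "node"/"biological"/"unknown" labels are made up for illustration
and are not part of the pipeline code.

```python
# Hypothetical mapping from the stage a job failed in to a likely cause.
# Stage names follow the quoted suggestion; the real pipeline may differ.
NODE_STAGES = {"reading", "writing"}
BIOLOGICAL_STAGES = {"running"}


def classify_failure(stage):
    """Guess the cause of a failed job from the stage it failed in."""
    stage = stage.strip().lower()
    if stage in NODE_STAGES:
        return "node"
    if stage in BIOLOGICAL_STAGES:
        return "biological"
    # Anything else would need the error file to be parsed.
    return "unknown"
```

This is only a heuristic: a job can fail while running because of the node
too, which is why parsing the error files would still be the more reliable
route.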

> And it makes no difference whether u send 10 jobs to the node at once
> or 1 at a time if the node is faulty. They will still end up with 10
> failed jobs sooner or later. 

Totally agree. I just meant the batch could go straight to another node,
rather than going back to the database to be picked up again, but no big
deal.

> batchsubmission object get the exit code from LSF and log it down for
> each node. If it consistently displays a certain error after n-times,
> then log it down as disabled and avoid sending jobs there thereafter.

Yep... though LSF is really there to do most of these things...
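
For what it's worth, the bookkeeping described in the quote could be
sketched roughly as below. This is an assumption-laden sketch, not the
batchsubmission object's actual code: the class name NodeHealth, the
max_failures threshold and the convention that exit code 0 means success
are all made up here, and the exit codes are simply passed in by the
caller rather than read from LSF.

```python
from collections import defaultdict


class NodeHealth:
    """Track repeated failures per node (hypothetical helper)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures   # the "n-times" threshold
        self.streak = defaultdict(int)     # consecutive identical failures
        self.last_code = {}                # last non-zero exit code per node
        self.disabled = set()              # nodes to avoid from now on

    def record(self, node, exit_code):
        """Log one job's exit code for the node it ran on."""
        if exit_code == 0:
            # Assumed convention: 0 is success, which resets the streak.
            self.streak[node] = 0
            self.last_code.pop(node, None)
            return
        if self.last_code.get(node) == exit_code:
            self.streak[node] += 1
        else:
            # A different error starts a new streak.
            self.streak[node] = 1
            self.last_code[node] = exit_code
        if self.streak[node] >= self.max_failures:
            self.disabled.add(node)

    def usable(self, node):
        """Return True if jobs may still be sent to this node."""
        return node not in self.disabled
```

The submitter would call record() after each job finishes and check
usable() before picking a node. Requiring the same exit code n times in a
row is one reading of "consistently displays a certain error"; counting
any failure would be a simpler variant.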

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6779 1117        *
********************************