[BioRuby] Improve rake/snakemake/nextflow.io?
Yannick Wurm
y.wurm at qmul.ac.uk
Mon Mar 2 15:43:49 UTC 2015
Hi all,
Dumb question that hasn't been asked or discussed here for a while...
What's the easiest way to make a *simple* pipeline?
Two contenders that come up on Google:
* snakemake
http://metagenomic-methods-for-microbial-ecologists.readthedocs.org/en/latest/day-1/#merge-paired-end-illumina-data
* nextflow
http://www.nextflow.io/example4.html
This one clearly allows grouping of files (e.g. read_pairs)
Any other rake/make-killers?
Criteria I think are important are:
* simple syntax (yaml?)
* easy wild-carding syntax/DSL
- e.g. XXX.bam requires #{basename($_)}.sam
* easy grouping of files (for paired reads; for samples split across multiple files)
* easy error checking & failing
- e.g. checking that output files are not empty
- e.g. checking that files have same length (when appropriate)
- e.g. checking return code or presence/absence of specific text in stdout or stderr
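To make the criteria above concrete, here is a minimal plain-Ruby sketch of the three behaviours: deriving an input name from a target name, grouping paired reads, and failing loudly on bad output. It is not the API of rake, snakemake or nextflow. The names source_for, group_read_pairs, run_checked and StepFailed are made up for illustration, and the _R1/_R2 (or _1/_2) file naming convention is an assumption.

```ruby
require "open3"

# Raised when a pipeline step does not pass its checks (hypothetical name).
class StepFailed < StandardError; end

# Wild-carding: work out which input a target needs.
# source_for("ant1.bam") # => "ant1.sam"
def source_for(target, from: ".bam", to: ".sam")
  raise ArgumentError, "#{target} does not end in #{from}" unless target.end_with?(from)
  target.chomp(from) + to
end

# Assumed naming convention: sample_R1.fastq / sample_R2.fastq
# (also accepts _1/_2, .fq and a trailing .gz).
PAIR_SUFFIX = /_R?[12]\.f(ast)?q(\.gz)?\z/

# Grouping: map each sample name to its sorted list of read files.
def group_read_pairs(files)
  files.group_by { |f| File.basename(f).sub(PAIR_SUFFIX, "") }
       .transform_values(&:sort)
end

# Error checking: run a command, then fail if the return code is non-zero,
# if stdout/stderr match a forbidden pattern, or if the output file is
# missing or empty.
def run_checked(cmd, output:, forbidden: nil)
  stdout, stderr, status = Open3.capture3(cmd)
  raise StepFailed, "#{cmd}: exit code #{status.exitstatus}" unless status.success?
  if forbidden && (stdout + stderr).match?(forbidden)
    raise StepFailed, "#{cmd}: output matched #{forbidden.inspect}"
  end
  # File.size? returns nil for a missing or zero-length file.
  raise StepFailed, "#{output} is missing or empty" unless File.size?(output)
  stdout
end
```

A step would then look something like run_checked("samtools view -b ant1.sam > ant1.bam", output: "ant1.bam", forbidden: /truncated/i), where the command line and the forbidden pattern are only examples. Checking that two files have the same length would be one more comparison added in the same place.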
The additional killer feature would be amazing visual progress output, ideally one that learns how long specific steps are likely to take so that it can provide an ETA.
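The ETA idea could be as simple as remembering how long each step took on previous runs. The sketch below assumes that approach: it averages past durations per step name and sums them over the steps still pending. EtaEstimator and its methods are hypothetical names, not part of any existing tool.

```ruby
# Hypothetical sketch: learn step durations from past runs to estimate time left.
class EtaEstimator
  def initialize
    @history = {} # step name => list of past durations in seconds
  end

  # Store how long one run of a step took.
  def record(step, seconds)
    (@history[step] ||= []) << seconds
  end

  # Mean past duration of a step in seconds, or nil if it has never run.
  def expected(step)
    runs = @history.fetch(step, [])
    runs.empty? ? nil : runs.sum / runs.size.to_f
  end

  # Estimated seconds for all pending steps, or nil if any step is unknown.
  def eta(pending_steps)
    estimates = pending_steps.map { |step| expected(step) }
    estimates.include?(nil) ? nil : estimates.sum
  end
end
```

After recording 10 s and 20 s for an "align" step and 4 s for a "sort" step, eta(["align", "sort"]) gives 19.0. A real tool would probably also scale the estimate by input size, which this sketch ignores.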
Cheers,
Yannick
-------------------------------------------------------
Yannick Wurm - http://wurmlab.github.io
Ants, Genomes & Evolution ⋅ y.wurm at qmul.ac.uk ⋅ skype:yannickwurm ⋅ +44 207 882 3049
5.03A Fogg ⋅ School of Biological & Chemical Sciences ⋅ Queen Mary, University of London ⋅ Mile End Road ⋅ E1 4NS London ⋅ UK