[BioRuby] bio-relevant ruby-one-liners
Yannick Wurm
yannick.wurm at unil.ch
Mon Aug 22 07:16:22 UTC 2011
Hi List,
More and more, I find myself using Ruby one-liners to replace grep, sed, etc. Here's one I use to subset 454 sequences based on the length information in the header line:
ruby -ne 'BEGIN{$/="\n>"}; rec = $_.chomp(">").sub(/\A>/, ""); print ">", rec if rec =~ /length=(\d+)/ && (300..320).include?($1.to_i)' < my.fasta > subset.fasta
I'm sure there are more elegant ways of generalizing & fool-proofing this.
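As one possible direction for fool-proofing, here is a minimal sketch of the same filter written as a small script rather than a one-liner. It assumes plain multi-line FASTA input, falls back to counting residues when a header has no length= field, and re-wraps the sequence at 60 columns (so the original line layout is not preserved). The method names each_fasta_record, record_length and subset_by_length are made up for illustration and are not part of BioRuby.

```ruby
require 'stringio'

# Yield [header, sequence] for each record; header excludes the leading ">".
# Hypothetical helper, not a BioRuby method.
def each_fasta_record(io)
  header = nil
  seq = +""
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      yield header, seq if header
      header = line[1..-1]
      seq = +""
    elsif header
      seq << line.strip
    end
  end
  yield header, seq if header
end

# Prefer the length=N annotation in the header; otherwise count residues.
def record_length(header, seq)
  m = header.match(/\blength=(\d+)/)
  m ? m[1].to_i : seq.length
end

# Return, as one string, the records whose length falls within `range`.
# Sequences are re-wrapped at `width` columns.
def subset_by_length(io, range, width = 60)
  out = +""
  each_fasta_record(io) do |header, seq|
    next unless range.include?(record_length(header, seq))
    out << ">" << header << "\n"
    out << seq.scan(/.{1,#{width}}/).join("\n") << "\n" unless seq.empty?
  end
  out
end

# Tiny made-up sample, only to show the behaviour.
sample = <<~FASTA
  >read1 length=5
  ACGTA
  >read2 length=12
  ACGTACGTACGT
  >read3
  ACGTAC
FASTA

print subset_by_length(StringIO.new(sample), 5..6)
```

To run it on a real file, the last line could instead be something like print subset_by_length(ARGF, 300..320), invoked as ruby subset.rb my.fasta > subset.fasta (subset.rb being whatever you name the script).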
What's your favorite bio-relevant Ruby one-liner?
Cheers,
yannick
FYI, the 454 read input FASTA file looks like this:
>F07XJJT02F00DS length=421 xy=2354_2974 region=2 run=R_2009_08_22_19_24_45_
TCTCTCAGTGGTCAGGACTCTGTTAACTTACTGCCTGACTCGATTGATTTGGAGATCAGG
[...]
G
>F07XJJT02F00E3 length=357 xy=2354_3021 region=2 run=R_2009_08_22_19_24_45_
TTTTTATTTTTTTTTTTTACTTTGTACAGCTTTATTAAGATCTAATAAAAATAGATTACA
[...]
CC
>F07XJJT02F00EH length=490 xy=2354_2999 region=2 run=R_2009_08_22_19_24_45_
TCTAAATGTGTTATTAATTATTTTCAACTATTTATAACTATGTATAGTATTTACAATATT
[...]
TACTCAATTT
>F07XJJT02F00ET length=93 xy=2354_3011 region=2 run=R_2009_08_22_19_24_45_
GATAGGCAGGGTTGTGCCATATATTTTAAGATTACGTCTATACCAGTTTTTACGTAAACA
ATACGTGATGTTTANATGTAATGTAACGAATGT
>F07XJJT02F00EV length=445 xy=2354_3013 region=2 run=R_2009_08_22_19_24_45_
TATTTTAATTGATATTATAATTTGTGTTGTATATATTTTTGCTTGTATCTTATAAATTAA
[...]
AATATAAAACAATTCAAAATAAATAAGCAAATTATTTACTTAAAA
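The header lines above are a read ID followed by key=value fields, so filters on other fields (region, run, ...) could reuse the same idea. A rough sketch of pulling a header apart, assuming whitespace-separated fields as in the sample; parse_454_header is a made-up helper name, not part of BioRuby:

```ruby
# Split a 454 FASTA header into its read ID and a hash of key=value fields.
# Hypothetical helper for illustration; tokens without "=" are ignored.
def parse_454_header(line)
  id, *fields = line.sub(/\A>/, "").split
  attrs = fields.map { |f| f.split("=", 2) }
                .select { |pair| pair.size == 2 }
                .to_h
  { id: id, attrs: attrs }
end

h = parse_454_header(">F07XJJT02F00ET length=93 xy=2354_3011 region=2 run=R_2009_08_22_19_24_45_")
puts h[:id]
puts h[:attrs]["length"].to_i
```

Values stay as strings, so numeric fields such as length need an explicit to_i before comparing.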
-----------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm
-----------------------------
BLAST @ http://antgenomes.org