[BioRuby] bio-relevant ruby-one-liners

Yannick Wurm yannick.wurm at unil.ch
Mon Aug 22 07:16:22 UTC 2011


Hi List,

More and more, I find myself using Ruby one-liners to replace grep, sed, etc. Here's one I use to subset 454 reads based on the length information in the header line:

ruby -ne 'BEGIN{$/="\n>"}; length = $_.match(/length=(\d+)/)[1].to_i;  print $_ if (300..320).include?(length) ' < my.fasta > subset.fasta

I'm sure there are more elegant ways of generalizing & fool-proofing this.
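
As one possible direction for fool-proofing, here is a minimal sketch written as a small script rather than a one-liner. Two things motivate it. With $/ set to "\n>", every record after the first is read without its leading ">" and every record but the last carries the next record's ">", so the output can begin without a ">" or end with a stray one, depending on which records are kept. A header without a length= field also makes match return nil, which raises NoMethodError. The helper names each_fasta_record and subset_by_header_length are hypothetical, made up for this illustration; they are not part of BioRuby.

```ruby
require 'stringio'

# Hypothetical helper: yields each FASTA record (header line plus its
# sequence lines) as one string, splitting on lines that start with ">".
def each_fasta_record(io)
  record = nil
  io.each_line do |line|
    if line.start_with?('>')
      yield record if record
      record = line.dup
    elsif record
      record << line
    end
  end
  yield record if record
end

# Hypothetical helper: returns the records whose header carries a
# length=N field with N inside the given range, concatenated as FASTA text.
# Records without a length= field are skipped instead of raising.
def subset_by_header_length(io, range)
  kept = []
  each_fasta_record(io) do |record|
    header = record.lines.first
    m = header.match(/length=(\d+)/)
    next unless m
    kept << record if range.include?(m[1].to_i)
  end
  kept.join
end

# Usage from a script, reading FASTA on standard input:
#   print subset_by_header_length($stdin, 300..320)
```

Skipping headers that lack length= is a design choice made for this sketch; failing loudly instead may be preferable if every read is expected to have one.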

What's your favorite bio-relevant Ruby one-liner?

Cheers,
yannick




FYI, the 454 read input FASTA file looks like this:

>F07XJJT02F00DS length=421 xy=2354_2974 region=2 run=R_2009_08_22_19_24_45_
TCTCTCAGTGGTCAGGACTCTGTTAACTTACTGCCTGACTCGATTGATTTGGAGATCAGG
[...]
G
>F07XJJT02F00E3 length=357 xy=2354_3021 region=2 run=R_2009_08_22_19_24_45_
TTTTTATTTTTTTTTTTTACTTTGTACAGCTTTATTAAGATCTAATAAAAATAGATTACA
[...]
CC
>F07XJJT02F00EH length=490 xy=2354_2999 region=2 run=R_2009_08_22_19_24_45_
TCTAAATGTGTTATTAATTATTTTCAACTATTTATAACTATGTATAGTATTTACAATATT
[...]
TACTCAATTT
>F07XJJT02F00ET length=93 xy=2354_3011 region=2 run=R_2009_08_22_19_24_45_
GATAGGCAGGGTTGTGCCATATATTTTAAGATTACGTCTATACCAGTTTTTACGTAAACA
ATACGTGATGTTTANATGTAATGTAACGAATGT
>F07XJJT02F00EV length=445 xy=2354_3013 region=2 run=R_2009_08_22_19_24_45_
TATTTTAATTGATATTATAATTTGTGTTGTATATATTTTTGCTTGTATCTTATAAATTAA
[...]
AATATAAAACAATTCAAAATAAATAAGCAAATTATTTACTTAAAA




-----------------------------
   Ant Genomes & Evolution 
  http://yannick.poulet.org
     skype://yannickwurm
-----------------------------
BLAST @ http://antgenomes.org
