[BioRuby] Ruby speed
Yannick Wurm
yannick.wurm at unil.ch
Tue Nov 3 22:49:12 UTC 2009
Hi Mike,
thanks for your response. I'm running:
ruby 1.8.6 (2008-03-03 patchlevel 114) [x86_64-linux]
Starting to age, but on a production machine I'd rather stay with what
works than risk breaking things by upgrading them.
The command sed 's/^>/>MyPrefix/' is indeed 30% faster than perl :)
My reasons for preferring ruby are the same as yours. But a 5 to 10x
speed difference is expensive (I'm calling the one-liner below about
10,000 times from a larger ruby script - YES, it's ugly, but
refactoring the script to avoid calling that type of oneliner would be
a pain since I use 10,000 different prefixes).
I have the feeling that ruby's startup time in particular is to blame.
Running the ruby one-liner on a fasta of 40,000 sequences takes 20 seconds;
running it on a fasta of only 10 lines still takes 13 seconds!!
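
One way to check the startup-time suspicion is to time an interpreter that
runs an empty program. In the timings quoted below, "real" (20.4 s) is far
larger than "user" plus "sys" (under 1 s), which would fit a process that
mostly waits rather than computes. The following is only a minimal sketch:
the helper name startup_seconds is hypothetical, and RbConfig.ruby assumes a
Ruby newer than the 1.8.6 mentioned above.

```ruby
require 'benchmark'
require 'rbconfig'

# Hypothetical helper: wall-clock seconds needed to start the given
# interpreter and run an empty program (no input is processed at all).
def startup_seconds(ruby = RbConfig.ruby)
  Benchmark.realtime { system(ruby, "-e", "") }
end

puts format("interpreter startup: %.3f s", startup_seconds)
```

If this number alone is in the range of seconds, the cost is being paid on
every one of the 10,000 calls regardless of the size of the FASTA file.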
I found some generic benchmarks indicating that ruby is generally only
a bit slower than perl
http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=ruby&lang2=perl
So maybe I can keep using ruby - just avoiding one-liners!
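
Avoiding the one-liner could look roughly like the sketch below, which does
the prefixing inside the calling script, so the interpreter starts only
once. The method name prefix_fasta_ids and the sample data are hypothetical,
not taken from the actual script.

```ruby
require 'stringio'

# Hypothetical helper: copies FASTA text from input to output, inserting
# prefix right after the ">" that starts each identifier line.
# Sequence lines are copied unchanged.
def prefix_fasta_ids(input, output, prefix)
  input.each_line do |line|
    output << (line.start_with?(">") ? ">#{prefix}#{line[1..-1]}" : line)
  end
  output
end

# In-memory usage example; File objects work the same way.
fasta = StringIO.new(">seq1\nACGT\n>seq2\nGGCC\n")
puts prefix_fasta_ids(fasta, StringIO.new, "MyPrefix").string
```

With real files, the same method could be called once per prefix from a
loop, passing File objects opened for reading and writing.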
Best,
yannick
On 3 Nov 2009, at 22:26, Michael Barton wrote:
> What version of Ruby are you using?
> Ruby is an expressive language rather than a "fast" language.
> I use Ruby because it's easier to read and maintain my programs, rather
> than because of how fast it is.
>
> If you are interested purely in speed you could write in C?
> What are the benchmarks for something like this?
>
> time sed 's/^>/>MyPrefix/' clustering/dirsForAssembly/singlets.fasta > abc
>
> Mike
>
> 2009/11/3 Yannick Wurm <yannick.wurm at unil.ch>:
>> Hi,
>>
>> this is a more general ruby question, but since my application is
>> bioinformatics, I'm posting it here.
>>
>> Just wanted to prepend a few characters in front of FASTA
>> identifiers.
>>
>>
>> $ time cat clustering/dirsForAssembly/singlets.fasta | ruby -pe "gsub(/^>/, '>MyPrefix')" > abc
>> real 0m20.379s
>> user 0m0.741s
>> sys 0m0.168s
>>
>>
>> While the perl equivalent is one heck of a lot faster!!!
>>
>>
>> $ time cat clustering/dirsForAssembly/singlets.fasta | perl -p -i -e 's/^>/>MyPrefix/g' > ab
>> real 0m2.165s
>> user 0m0.266s
>> sys 0m0.146s
>>
>>
>> Is there any hope for ruby?
>>
>> Thanks,
>> yannick
>>
>>
>> --------------------------------------------
>> yannick . wurm @ unil . ch
>> Ant Genomics, Ecology & Evolution @ Lausanne
>> http://www.unil.ch/dee/page28685_fr.html
>>
>>
>> _______________________________________________
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>>