[BioRuby] Ruby EMBOSS mapping (using Biolib)
Pjotr Prins
pjotr.public14 at thebird.nl
Thu Nov 26 13:08:30 UTC 2009
Hi all,
Over the last year I have been working on C library mappings to Ruby. A
comparison of BioRuby against Biolib/EMBOSS on a six-frame translation
of a C. elegans dataset shows that Ruby with EMBOSS is about 30x faster
(roughly 28x in wall-clock time). On my (outdated) machine:
Bioruby version:
22929 records 137574 times translated!
real 9m30.952s
user 8m42.877s
sys 0m32.878s
Biolib version:
22929 records 137574 times translated!
real 0m20.306s
user 0m15.997s
sys 0m1.344s
Both timings include IO, which is handled by Ruby.
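The speedup can be checked directly from the two timing blocks above. The sketch below is only an illustration: the helper name `to_seconds` is made up for this example, and the durations are copied from the `time` output in this post. It gives about 28x for wall-clock time and about 32x for CPU time (user + sys), which is where the "about 30x" figure comes from.

```ruby
# Convert a `time`-style duration such as "9m30.952s" to seconds.
# (Hypothetical helper, not part of BioRuby or Biolib.)
def to_seconds(stamp)
  m = stamp.match(/\A(\d+)m([\d.]+)s\z/)
  raise ArgumentError, "bad duration: #{stamp}" unless m
  m[1].to_i * 60 + m[2].to_f
end

# Timings copied from the two runs reported above.
bioruby = { real: "9m30.952s", user: "8m42.877s", sys: "0m32.878s" }
biolib  = { real: "0m20.306s", user: "0m15.997s", sys: "0m1.344s" }

# Wall-clock ratio, and CPU ratio (user + sys).
wall_speedup = to_seconds(bioruby[:real]) / to_seconds(biolib[:real])
cpu_speedup  = (to_seconds(bioruby[:user]) + to_seconds(bioruby[:sys])) /
               (to_seconds(biolib[:user]) + to_seconds(biolib[:sys]))

puts format("wall-clock speedup: %.1fx", wall_speedup)
puts format("CPU-time speedup:   %.1fx", cpu_speedup)
```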
The Bioruby code reads:
nt = FastaReader.new(fn)
nt.each { | rec |
  seq = Bio::Sequence::NA.new(rec.seq)
  [-3,-2,-1,1,2,3].each do | frame |
    print "> ",rec.id," ",frame.to_s,"\n"
    print seq.translate(frame),"\n"
  end
}
$stderr.print nt.size," records ",nt.size*6*iter," times translated!"
The Biolib code reads:
nt = FastaReader.new(fn)
trnTable = Biolib::Emboss.ajTrnNewI(1)
nt.each { | rec |
  ajpseq = Biolib::Emboss.ajSeqNewNameC(rec.seq,"Test sequence")
  [-3,-2,-1,1,2,3].each do | frame |
    ajpseqt = Biolib::Emboss.ajTrnSeqOrig(trnTable,ajpseq,frame)
    aa = Biolib::Emboss.ajSeqGetSeqCopyC(ajpseqt)
    print "> ",rec.id," ",frame.to_s,"\n"
    print aa,"\n"
  end
}
$stderr.print nt.size," records ",nt.size*6*iter," times translated!"
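For readers who want to see what both loops compute per record, here is a minimal library-free sketch of six-frame translation in plain Ruby. It is not the BioRuby or EMBOSS implementation. The names `CODON_TABLE`, `reverse_complement`, `translate_frame` and `six_frames` are made up for this example, and it assumes the standard genetic code (table 1) and one particular convention for negative frames: frame -n reads the reverse complement starting at offset n-1. Libraries may number negative frames differently, so output for frames -1 to -3 need not match either benchmark program exactly.

```ruby
# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG".chars
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons ("TTT", "TTC", ...) to its one-letter amino acid.
CODON_TABLE = {}
BASES.product(BASES, BASES).each_with_index do |codon, i|
  CODON_TABLE[codon.join] = AMINO[i]
end

# Reverse the sequence and swap each base for its complement.
def reverse_complement(dna)
  dna.upcase.reverse.tr("ACGT", "TGCA")
end

# Translate one frame. Positive frames read the given strand, negative
# frames read the reverse complement (an assumed convention, see above).
# Codons containing anything other than A, C, G, T become "X".
def translate_frame(dna, frame)
  strand = frame > 0 ? dna.upcase : reverse_complement(dna)
  tail = strand[(frame.abs - 1)..-1] || ""
  tail.scan(/.../).map { |codon| CODON_TABLE.fetch(codon, "X") }.join
end

# Return a hash of frame number => protein string for all six frames.
def six_frames(dna)
  [-3, -2, -1, 1, 2, 3].map { |f| [f, translate_frame(dna, f)] }.to_h
end

six_frames("ATGGCCTAA").each do |frame, protein|
  puts "> example #{frame}"
  puts protein
end
```

Each record therefore produces six short string scans and table lookups. In the BioRuby run that work happens in interpreted Ruby, while in the Biolib run it is done inside the EMBOSS C library.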
A write up of the mapping effort is at:
http://biolib.open-bio.org/wiki/Mapping_EMBOSS