[BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer

Ben Woodcroft donttrustben at gmail.com
Fri May 18 01:40:28 EDT 2012


Hi guys,

Here's some blatant advertising for some code I've recently written in
biogem form.

bio-gag: "gag error" is the term I've coined to describe an error that
various people have observed on certain sequencing kits with IonTorrent,
though it has not previously been characterised very well that I know of
(we noticed that the errors seemed to occur at GAG positions in the reads
that were supposed to be GAAG). This biogem tries to find and fix these
errors. It isn't benchmarked for accuracy but worked well enough for my
lab's own purposes. Actually, to be honest, we've only used an older version
of the software on real data, and the logic has changed a little since then
given some recent evidence we have, but I thought I'd push it out with the
latest and greatest error model.
https://github.com/wwood/bioruby-gag
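
To make the GAG/GAAG idea concrete, here is a minimal sketch that only lists
where GAG occurs in a read, i.e. the places where an A might be missing. It
is not the bio-gag error model and does not use the gem's API; the method
name gag_positions is made up for illustration.

```ruby
# Return the 0-based start positions of every GAG in a sequence,
# including overlapping ones (GAGAG contains two).
# Hypothetical helper, not part of bio-gag.
def gag_positions(seq)
  seq = seq.upcase
  positions = []
  i = 0
  while (i = seq.index("GAG", i))
    positions << i
    i += 1 # step one base so overlapping matches are found
  end
  positions
end

p gag_positions("TTGAGAGCC") # GAG starts at positions 2 and 4
```

A GAG in a read is only a candidate: deciding whether it should really be
GAAG needs extra evidence, such as the pileup patterns described below.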

bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
files looking for particular patterns e.g. strand bias of insertions. This
gem can be used to iterate through pileup files one position (one line) at
a time, building up the sequence of each read as it goes, recording their
direction etc. Probably not the fastest piece of code in the world, sorry.
I'm not sure whether this should/can be incorporated into bio-samtools? It
adds functionality - there's no duplication (I don't think).
https://github.com/wwood/bioruby-pileup_iterator
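
As a rough illustration of the kind of pattern involved, here is a sketch
that counts insertions per strand on a single pileup line. It assumes the
six-column samtools pileup layout (chromosome, position, reference base,
depth, read bases, qualities) and does not use the bio-pileup_iterator API;
insertion_strand_counts is a made-up name. It also ignores corner cases such
as a "+" appearing as a mapping quality character after "^".

```ruby
# Count insertions on each strand in the read-bases column of one
# pileup line. Insertions are written +<length><bases>; uppercase
# bases come from forward-strand reads, lowercase from reverse.
# Hypothetical helper, not part of bio-pileup_iterator.
def insertion_strand_counts(pileup_line)
  read_bases = pileup_line.chomp.split("\t")[4].to_s
  counts = { forward: 0, reverse: 0 }
  read_bases.scan(/\+(\d+)([ACGTNacgtn]+)/) do |len, bases|
    inserted = bases[0, len.to_i] # keep only the declared length
    strand = inserted == inserted.upcase ? :forward : :reverse
    counts[strand] += 1
  end
  counts
end

# Made-up line: five reads, two forward insertions and one reverse.
line = "contig1\t42\tG\t5\t.+1A.+1A,.,+1a\tIIIII"
p insertion_strand_counts(line)
```

A strongly lopsided forward/reverse count at one position is the sort of
strand bias mentioned above.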

bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
the hmmer v3 manual.
https://github.com/wwood/bioruby-hmmer_model
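
For anyone who hasn't looked inside one of these files, here is a small
sketch that reads just the header tags (NAME, ACC, LENG and so on) from a
HMMER3 profile. It assumes the layout of a format line, then tag/value
lines, then a line starting with "HMM" that opens the model itself. It is
not the bio-hmmer_model API; parse_hmm_header and the sample values are
made up.

```ruby
# Collect header tag/value pairs from HMMER3 profile text, stopping
# where the main model section begins.
# Hypothetical helper, not part of bio-hmmer_model.
def parse_hmm_header(text)
  header = {}
  text.each_line do |line|
    break if line.start_with?("HMM ") # main model section starts here
    next if line.start_with?("HMMER3") || line.strip.empty?
    tag, value = line.strip.split(/\s+/, 2)
    header[tag] = value
  end
  header
end

# Made-up header for illustration only.
sample_hmm = <<~HMM
  HMMER3/f [example]
  NAME  example_domain
  ACC   PF99999.1
  LENG  3
  ALPH  amino
  HMM          A        C
HMM

header = parse_hmm_header(sample_hmm)
puts header["NAME"]
puts header["LENG"].to_i
```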

bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
tabular format files - the guts of this were written by Christian - see
yesterday's thread for details. I'm hoping to add regular (non-tabular)
format parsing in the near future, but no promises.
https://github.com/wwood/bioruby-hmmer3_report
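
To show what the tabular format looks like to a parser, here is a sketch
that reads the first seven columns of per-target tabular output (target
name and accession, query name and accession, full-sequence E-value, score
and bias), skipping "#" comment lines. It assumes whitespace-separated
fields with a free-text description last. It is not the bio-hmmer3_report
API; TbloutHit, parse_tblout and the sample hit are made up.

```ruby
# One row of HMMER3 per-target tabular output (first seven columns).
# Hypothetical names, not part of bio-hmmer3_report.
TbloutHit = Struct.new(:target_name, :target_accession,
                       :query_name, :query_accession,
                       :evalue, :score, :bias)

def parse_tblout(text)
  rows = text.each_line.reject { |l| l.start_with?("#") || l.strip.empty? }
  rows.map do |l|
    # Limit the split so the trailing description keeps its spaces.
    f = l.strip.split(/\s+/, 19)
    TbloutHit.new(f[0], f[1], f[2], f[3], f[4].to_f, f[5].to_f, f[6].to_f)
  end
end

# Made-up hit for illustration only.
sample_tbl = <<~TBL
  # target name  accession  query name  accession  E-value  score  bias ...
  seq1 - example_domain PF99999.1 1.2e-30 100.5 0.1 2.0e-30 99.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
TBL

parse_tblout(sample_tbl).each do |hit|
  puts "#{hit.target_name} vs #{hit.query_name}: E-value #{hit.evalue}"
end
```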

I'm sure there are bugs and deficiencies - apologies in advance.

Enjoy,
ben

