[BioRuby] BlastXML parser outputs RDF/JSON etc.
Pjotr Prins
pjotr.public14 at thebird.nl
Sat Sep 6 12:08:36 UTC 2014
Hi all,
One of my oldest gems gets a new life :)
I revamped the bioblastxml parser so that it can produce RDF, JSON, CSV
or any other text format. All you need to do is write an ERB template
(or use an existing one). JSON example:
https://github.com/pjotrp/blastxmlparser/blob/master/template/blast2json.erb
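To give an idea of how template-driven output works in general, here is a minimal sketch using Ruby's standard ERB library. It is not the template linked above: the Hit struct and its field names (query_id, hit_id, evalue) are made up for illustration, and the real parser exposes its own objects to the template. The point is only that the same record can be rendered as JSON or CSV by swapping the template.

```ruby
require 'erb'
require 'json'

# Hypothetical hit record. The field names are illustrative only and are
# not taken from blastxmlparser.
Hit = Struct.new(:query_id, :hit_id, :evalue)

# One template per output format. Changing the template changes the
# output format without touching the parsing code.
JSON_TEMPLATE = ERB.new(
  '{ "query": <%= hit.query_id.to_json %>, ' \
  '"hit": <%= hit.hit_id.to_json %>, ' \
  '"evalue": <%= hit.evalue.to_json %> }'
)
CSV_TEMPLATE = ERB.new('<%= hit.query_id %>,<%= hit.hit_id %>,<%= hit.evalue %>')

# Render one record with the given template; `hit` becomes a local
# variable inside the template.
def render(template, hit)
  template.result_with_hash(hit: hit)
end

hit = Hit.new('query1', 'subject42', 1.0e-5)
json_line = render(JSON_TEMPLATE, hit)
csv_line  = render(CSV_TEMPLATE, hit)

puts json_line
puts csv_line
```

Using to_json on each value inside the JSON template takes care of quoting and escaping, so the template itself stays a plain text skeleton.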
The bioblastxml parser also makes use of multicore parallelism.
It is probably one of the fastest BLAST XML parsers around.
I consider a command line interface, lazy parsing, parallelism through
the Parallel gem and flexible output with ERB to be core strategies
for bioinformatics gems. The good news is that it is surprisingly easy
to do!
Have a look at the source code:
https://github.com/pjotrp/blastxmlparser/blob/master/bin/blastxmlparser
and the README
https://github.com/pjotrp/blastxmlparser
When the Parallel gem is not found, the parser falls back to a single thread.
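An optional dependency of this kind can be handled roughly as follows. This is a sketch of the general pattern, not the code from bin/blastxmlparser: the HAVE_PARALLEL constant and the map_records helper are hypothetical names. Parallel.map is the Parallel gem's drop-in replacement for Array#map.

```ruby
# Try to load the Parallel gem; remember whether it is available.
begin
  require 'parallel'
  HAVE_PARALLEL = true
rescue LoadError
  HAVE_PARALLEL = false
end

# Hypothetical helper: map over records in parallel when the gem is
# installed, otherwise fall back to a plain single-threaded map.
# Both branches return results in the original order.
def map_records(records, &block)
  if HAVE_PARALLEL
    Parallel.map(records, &block)
  else
    records.map(&block)
  end
end

lengths = map_records(%w[ACGT ACG AC]) { |seq| seq.length }
puts lengths.inspect
```

Because both branches share the same block interface, the calling code does not need to know which one is in use.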
I'll add these features to the bio-vcf and bio-table gems too in the
near future. With releases to match.
Pj.