[BioRuby] Bio::Faster plugin
Raoul Bonnal
bonnal at ingm.org
Wed Jan 4 15:05:00 UTC 2012
Hi Francesco,
It's very cool!
You can also access the seq object/array in this way:
  Bio::Faster.parse(File.join(TEST_DATA, "sample.fastq")) do |id, comments, sequence, quality|
    puts "#{id} #{comments} #{sequence} #{quality}"
  end
Obviously I like it more than using the raw array :-)
I suppose that when there is no quality value you get a nil object.
+1
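
A minimal sketch of why the block form above works, and why a nil quality is plausible. It does not use bio-faster itself: each_record is a hypothetical stand-in that yields one array per record, and the sample records are made up. It only shows plain Ruby behaviour: a block with several parameters destructures a yielded array, and any parameter with no matching element is set to nil. Whether Bio::Faster yields a shorter array or an explicit nil for FastA is an assumption I have not checked.

```ruby
# Hypothetical stand-in for a parser that yields one array per record.
# This is NOT the bio-faster implementation, only an illustration of
# how Ruby blocks handle a yielded array.
def each_record(records)
  records.each { |record| yield record }
end

# Made-up sample data: a FastQ-like record with qualities and a
# FastA-like record without them.
fastq_like = [["seq1", "first read", "ACGT", [30, 31, 32, 33]]]
fasta_like = [["seq2", "no quality", "TTGA"]]

# Raw array access: the block gets the whole array in one parameter.
raw = []
each_record(fastq_like) { |record| raw << record[2] }

# Destructured access: the same array is spread over named parameters.
named = []
each_record(fastq_like) do |id, comments, sequence, quality|
  named << [id, sequence, quality]
end

# With a three-element array, the fourth block parameter is nil.
missing = []
each_record(fasta_like) do |id, comments, sequence, quality|
  missing << quality
end

puts raw.inspect
puts named.inspect
puts missing.inspect
```

So the block signature can stay the same for FastA and FastQ, as long as the code checks quality for nil before using it.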
On 04/01/12 10.50, "Francesco Strozzi" <francesco.strozzi at gmail.com> wrote:
> Hi guys,
>
> I have created a BioRuby plugin called bio-faster, that implements a fast
> and simple parser for FastA and FastQ files. It's based on the C library
> Kseq written by Heng Li (author of Samtools and BWA). Compared to
> Bio::FastQ it is actually 4-5 times faster in parsing large FastQ files.
> The code will not create a Bio object for each sequence but it will return
> a simple array with sequence data and quality values for FastQ (it supports
> Sanger/Phred format only).
> Bio::Faster could be a good choice when you just need to parse huge files,
> for example to extract information or to store sequence data in a database,
> and you don't need to create an object for each sequence but you only want
> to parse the dataset easily and quickly.
>
> Here is the code: https://github.com/fstrozzi/bioruby-faster
> Here is the wiki for more details:
> https://github.com/fstrozzi/bioruby-faster/wiki
> To get the gem: gem install bio-faster
>
> Tested with Ruby 1.9 only.
>
> Any comment or feedback is much appreciated!
>
> Cheers