[BioRuby] Proposal: Bio::FastaFormat#each_entry
MISHIMA, Hiroyuki
missy at be.to
Fri Jan 29 06:46:15 UTC 2010
Hi all,
How about implementing the following methods?
Bio::FastaFormat#each_entry
Bio::FastaNumericFormat#each_entry
The following is sample code that generates a FASTQ string from a FASTA
string and a FASTA.QUAL string. It may require Ruby 1.8.7 or later.
I am afraid that a simpler or easier way may already exist in BioRuby...
Hiro.
-----
#!/usr/local/bin/ruby
require 'rubygems'
require 'bio'
module Bio
  class FastaFormat
    # Yield each entry of a multi-entry FASTA string in turn.
    # Without a block, return an enumerator instead.
    def each_entry
      return to_enum(:each_entry) unless block_given?
      @continue = self.dup
      loop do
        yield @continue
        # entry_overrun holds the unparsed text that follows this entry
        overrun = @continue.entry_overrun
        break unless overrun
        @continue = Bio::FastaFormat.new(overrun)
      end
    end
  end

  class FastaNumericFormat
    # Same as Bio::FastaFormat#each_entry, for FASTA.QUAL data.
    def each_entry
      return to_enum(:each_entry) unless block_given?
      @continue = self.dup
      loop do
        yield @continue
        overrun = @continue.entry_overrun
        break unless overrun
        @continue = Bio::FastaNumericFormat.new(overrun)
      end
    end
  end
end

fasta = <<EOS
>FXQB1I00000001
TATGGAATCTGTAGAATCAGTGGTAGGTGCAGCAGATGGAGGAAGG
>FXQB1I00000002
CTGGAGAATTCTGGATCCTCGACTTATGACTTGGTGGTTCTGGTAACTGTGAGCTTAGGATAGTCAG
EOS
qual = <<EOS
>FXQB1I00000001
30 30 29 42 25 24 5 30 30 30 30 30 28 30 26 9 30 30 30 30 30 42 25 30 30
42 25 29 22 30 29 26 30 30 30 29 30 42 25 30 32 17 40 23 39 24
>FXQB1I00000002
30 30 33 19 28 30 26 9 32 12 30 30 33 20 30 30 32 15 27 27 30 28 28 34
22 27 22 28 28 29 26 9 33 19 22 43 25 33 19 28 27 32 15 30 32 12 28 30
27 30 30 26 27 30 40 23 30 40 23 30 29 29 30 30 30 29 30
EOS
enum_fasta = Bio::FastaFormat.new(fasta).each_entry
enum_qual = Bio::FastaNumericFormat.new(qual).each_entry

loop do
  fastq = Bio::Sequence.adapter(enum_fasta.next,
                                Bio::Sequence::Adapter::Fastq)
  fastq.quality_score_type = :phred
  fastq.quality_scores = enum_qual.next.data
  puts fastq.output(:fastq)
end
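
For readers without BioRuby at hand, here is a minimal plain-Ruby sketch of the same idea: each_entry returns an enumerator when no block is given, and two enumerators are advanced in lockstep with next. The names Entry, MultiRecord and fasta_qual_to_fastq are hypothetical and are not part of BioRuby. The sketch also assumes that both inputs list the same records in the same order and that the quality line uses Phred scores with an ASCII offset of 33.

```ruby
# Hypothetical stand-ins for illustration only; none of these are BioRuby classes.
Entry = Struct.new(:id, :body)

class MultiRecord
  def initialize(text)
    @text = text
  end

  # Yield one Entry per ">"-headed record. Without a block, return an
  # Enumerator so callers can pull entries one at a time with #next.
  def each_entry
    return to_enum(:each_entry) unless block_given?
    @text.split(/^>/).each do |chunk|
      next if chunk.strip.empty?
      header, *lines = chunk.lines.map(&:strip)
      yield Entry.new(header, lines.join(' '))
    end
  end
end

# Build FASTQ text from a FASTA string and a QUAL string.
# Assumption: scores are Phred values, encoded as chr(score + 33).
def fasta_qual_to_fastq(fasta, qual)
  seqs  = MultiRecord.new(fasta).each_entry
  quals = MultiRecord.new(qual).each_entry
  out = []
  loop do # Kernel#loop stops quietly when #next raises StopIteration
    s = seqs.next
    q = quals.next
    raise ArgumentError, "ID mismatch: #{s.id} vs #{q.id}" unless s.id == q.id
    bases  = s.body.delete(' ')
    scores = q.body.split.map { |n| Integer(n) }
    unless bases.length == scores.length
      raise ArgumentError, "length mismatch in #{s.id}"
    end
    encoded = scores.map { |n| (n + 33).chr }.join
    out << "@#{s.id}\n#{bases}\n+\n#{encoded}\n"
  end
  out.join
end
```

Calling fasta_qual_to_fastq with the two strings returns the FASTQ text as a single string. Note that the loop ends as soon as either enumerator is exhausted, so a surplus record in one input is dropped silently; a stricter version would compare the record counts first.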
--
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences