[BioRuby] Fastq.to_s

Naohisa Goto ngoto at gen-info.osaka-u.ac.jp
Mon Aug 22 06:09:49 UTC 2011


Hi,

In this case, Bio::FlatFile#entry_raw, which returns the raw string of
the most recently read entry in the flat-file object, is recommended
for performance reasons (it avoids creating additional objects).

Modified example:
  require 'bio'

  ff1 = Bio::FlatFile.open(nil, ARGV[0])
  ff2 = Bio::FlatFile.open(nil, ARGV[1])

  ff1.each_entry do |fe1|
    # entry_raw returns the original text of the entry just read
    fe1_raw = ff1.entry_raw
    fe2 = ff2.next_entry
    fe2_raw = ff2.entry_raw
    print fe1_raw
    print fe2_raw
  end

Note that the example will not work correctly when the
two files contain different numbers of sequences.
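
One way to guard against such a mismatch is sketched below. It is only
an illustration: interleave_raw is a hypothetical helper name (not part
of BioRuby), and it assumes that next_entry returns nil once a file is
exhausted. It accepts any objects that respond to next_entry and
entry_raw, so Bio::FlatFile objects can be passed in directly.

```ruby
# Hypothetical helper (not part of BioRuby): interleaves the raw text of
# entries from two flat-file-like objects and raises if one runs out first.
# Assumption: next_entry returns nil at end of file.
def interleave_raw(ff1, ff2, out = $stdout)
  pairs = 0
  loop do
    e1 = ff1.next_entry
    e2 = ff2.next_entry
    break if e1.nil? && e2.nil?   # both files ended together

    if e1.nil? || e2.nil?
      raise ArgumentError,
            "files contain different numbers of entries " \
            "(mismatch after #{pairs} pairs)"
    end

    # entry_raw gives the original text of the entry just read
    out.print ff1.entry_raw
    out.print ff2.entry_raw
    pairs += 1
  end
  pairs
end

# Usage with BioRuby (requires the bio gem):
#   require 'bio'
#   ff1 = Bio::FlatFile.open(nil, ARGV[0])
#   ff2 = Bio::FlatFile.open(nil, ARGV[1])
#   interleave_raw(ff1, ff2)
```

Raising an error is just one choice here; stopping quietly at the
shorter file would also be possible, but it would hide truncated input.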

I also agree with adding Fastq#to_s as a convenience method,
regardless of performance.

-- 
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

> Hi,
> 
> For flat files, I think it would be nice if we could output the original text
> entries as they were split. For example
> 
> #!/usr/bin/env ruby
> 
> require 'bio'
> 
> ff1 = Bio::FlatFile.open(nil, ARGV[0])
> ff2 = Bio::FlatFile.open(nil, ARGV[1])
> 
> ff1.each_entry do |fe1|
>   fe2 = ff2.next_entry
>   puts fe1
>   puts fe2
> end
> 
> should be able to merge read1 and read2 from different files into a single file.
> This works with the fasta format but not with the fastq format right now, because
> Bio::Fastq does not have a to_s method.  As Fastq does not really hold the original
> data, reconstructing it as in the following patch is perhaps a good way (it avoids
> using twice the memory just for the to_s function). Or do we need to fold the
> sequence to some (original or fixed) length?
> 
> diff --git a/lib/bio/db/fastq.rb b/lib/bio/db/fastq.rb
> index f913e6d..5ff1a15 100644
> --- a/lib/bio/db/fastq.rb
> +++ b/lib/bio/db/fastq.rb
> @@ -407,6 +407,10 @@ class Fastq
>    # raw sequence data as a String object
>    attr_reader :sequence_string
>  
> +  def to_s
> +    "@#{@definition}\n#{@sequence_string}\n+#{@definition2}\n#{@quality_string}\n"
> +  end
> +
>    # returns Bio::Sequence::NA
>    def naseq
>      unless defined? @naseq then
> 
> Best regards,
> -- 
> Tomoaki NISHIYAMA
> 
> Advanced Science Research Center,
> Kanazawa University,
> 13-1 Takara-machi, 
> Kanazawa, 920-0934, Japan
> 




