[BioRuby] fastq files reading
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp
Sun May 30 14:31:55 UTC 2010
Hi,
The external iterator can be used with Ruby 1.8.7 or later.
(It can't be used with Ruby 1.8.6 or earlier.)
In addition, it takes a lot of resources and is inefficient with the
current Ruby implementation. (In the future, it will be optimized.)
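For reference, here is a minimal sketch of what an external iterator looks
like in plain Ruby (1.8.7 or later). It is not BioRuby code: the arrays stand
in for the two files, and the method name interleave_with_enum is made up
for illustration. Note that it silently drops any extra items when the
second collection is longer than the first.

```ruby
# Hypothetical helper: walks `first` with an internal iterator (each) and
# `second` with an external iterator (Enumerator#next).
def interleave_with_enum(first, second)
  result = []
  enum = second.to_enum            # external iterator over `second`
  first.each do |a|
    result << a
    begin
      result << enum.next          # raises StopIteration when `second` is exhausted
    rescue StopIteration
      # `second` is shorter than `first`; continue with `first` only
    end
  end
  result
end

# Arrays used here in place of real files, for illustration only.
p interleave_with_enum(%w[a1 a2 a3], %w[b1 b2])
```

The begin/rescue around enum.next is the extra case that has to be handled
by hand when the two inputs may differ in length.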
I think Bio::FlatFile#next_entry is a better choice in this case.
The next_entry method returns nil once the end of the file is reached.
In the following example, "entry1" and "entry2" are checked on every
iteration to make sure they are not nil (in "if entry1 then ... end" and
"if entry2 then ... end"). If you are sure the two files always have
the same number of entries, the checks can be skipped.
require 'bio'

ff1 = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
ff2 = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')

loop do
  # Read one entry from each file; next_entry returns nil at end of file.
  entry1 = ff1.next_entry
  entry2 = ff2.next_entry
  # Stop only when both files are exhausted.
  break unless entry1 || entry2

  if entry1 then
    header1 = entry1.entry_id
    seq1 = entry1.seq
    puts seq1.to_fasta(header1 + "qwa")
  end
  if entry2 then
    header2 = entry2.entry_id
    seq2 = entry2.seq
    puts seq2.to_fasta(header2 + "qwa")
  end
end

ff2.close
ff1.close
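
If you prefer to keep the reading loop separate from the output, the same
idea can be wrapped in a small method. This is only a sketch: each_entry_pair
and the ArrayReader stand-in class are names made up for illustration and are
not part of BioRuby. The method only assumes that both objects respond to
next_entry and return nil at the end, as described above for Bio::FlatFile.

```ruby
# Hypothetical helper: yields entry1, entry2 until BOTH readers return nil
# from next_entry. Either value may be nil when one reader is shorter.
def each_entry_pair(reader1, reader2)
  loop do
    entry1 = reader1.next_entry
    entry2 = reader2.next_entry
    break if entry1.nil? && entry2.nil?
    yield entry1, entry2
  end
end

# Stand-in for Bio::FlatFile, for illustration only: hands out the
# elements of an array one at a time, then nil.
class ArrayReader
  def initialize(items)
    @items = items.dup
  end

  def next_entry
    @items.shift
  end
end

pairs = []
each_entry_pair(ArrayReader.new(%w[a1 a2 a3]), ArrayReader.new(%w[b1 b2])) do |e1, e2|
  pairs << [e1, e2]
end
p pairs
```

With real data, the two Bio::FlatFile objects would be passed in place of
the ArrayReader objects, and the block would do the nil checks and the
to_fasta output.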
> Hello xyz,
>
> You should be able to solve this problem by iterating over the two files
> in parallel. An external iterator will be required here. You can call next
> on an external iterator to get the next object. It will raise a
> StopIteration exception when there are no more items to iterate over. You
> will have to add a case to handle that too.
>
> Give something like the following a try:
>
> require 'bio'
>
> #open the two files
> one = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
> two = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')
>
> #get an external iterator for two
> two_iterator = two.to_enum
>
> #now iterate (each yields the entries of the first file directly)
> one.each do |entry1|
>
>   header1 = entry1.entry_id
>   seq1 = entry1.seq
>
>   puts seq1.to_fasta(header1 + "qwa")
>
>   #next raises StopIteration if the second file runs out of entries
>   entry2 = two_iterator.next
>   header2 = entry2.entry_id
>   seq2 = entry2.seq
>   puts seq2.to_fasta(header2 + "qwa")
> end
>
> #close the files
> one.close
> two.close
>
> I did not have any fastq files to test it on, but it should work.
>
> On Sat, May 29, 2010 at 5:44 PM, xyz <mitlox at op.pl> wrote:
>
> > Hello,
> > I would like to read two fastq files at the same time in order to
> > save them to a fasta file.
> >
> > require 'bio'
> > Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq') do |ff1|
> >   ff1.each do |entry1|
> >
> >     header1 = entry1.entry_id
> >     seq1 = entry1.seq
> >
> >     puts seq1.to_fasta(header1 + "qwa")
> >
> >     #header2 = entry2.entry_id
> >     #seq2 = entry2.seq
> >     #puts seq2.to_fasta(header2 + "qwa")
> >   end
> > end
> >
> > I already have the code above, but unfortunately I do not know
> > how to read both files at the same time.
> >
> > How is it possible to read two files at the same time and write them
> > to a fasta file?
> >
> > Thank you in advance.
> >
> > Best regards,
> >
> >
> > _______________________________________________
> > BioRuby Project - http://www.bioruby.org/
> > BioRuby mailing list
> > BioRuby at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioruby
> >
>
>
>
> --
> Anurag Priyam,
> 2nd Year Undergraduate,
> Department of Mechanical Engineering,
> IIT Kharagpur.
> +91-9775550642
--
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org