From k.hayashi.info at gmail.com Sat May 1 09:47:47 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Sat, 1 May 2010 22:47:47 +0900
Subject: [BioRuby] participation in GSoC 2010
In-Reply-To: <20100430081029.GA7221@thebird.nl>
References: <20100428053913.GA17564@thebird.nl> <4BD87D0F.60606@burnham.org> <20100430080825.GA6555@thebird.nl> <20100430081029.GA7221@thebird.nl>
Message-ID:

Hi,

Thank you so much for the explanation.
First of all, I will test the current code of BioRuby with Ruby 1.9.2
in order to detect the classes which don't work.
Then, I will try to find the ways to support 1.8 and 1.9 without
'if-then' statements, referring to other libraries.
If you know the appropriate libraries, please tell me them (rails is
the best example?).

kazuhiro

2010/4/30 :
> Hi Kazuhiro,
>
> Please *reply* to the list.
>
> On Thu, Apr 29, 2010 at 11:41:14PM +0900, Kazuhiro Hayashi wrote:
>> At the moment, I am planning to put the code for 1.8.7 ,1.9.2 and ,if
>> possible, JRuby only in one code base.
>> I don't understand what the 'architecture' file is.
>> Could you tell me it in a little more detail?
>
> All 1.8 stuff goes into one file. All 1.9 in another. So there is
> clear separation. When running Ruby 1.8 only that file gets
> 'required'. In pseudo-code.
>
> if ruby_version<1.9
>   if !isjvm?
>     require 'bio/ruby-1.8'
>   else
>     require 'bio/ruby-jvm'
>   end
> else
>   require 'bio/ruby-1.9'
> end
>
> Implementation specific stuff will go into these files (if possible).
> Say you have a different println implementation, rather than
> sprinkling the code base with:
>
> if ruby_version<1.9
>   if !isjvm?
>     println_1 ...
>   else
>     println_2 ...
>   end
> else
>   println_3 ...
> end
>
> You would 'hide' that in the architecture files. So you just get one
> call in the source tree:
>
>   println_arch ...
>
> with implementation in the different 'architecture' files.
>
> Pj.
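The 'architecture file' idea quoted above could look roughly like the following in plain Ruby. This is only a sketch under assumptions: ArchSketch, its nested modules and name_for are made-up names that are not part of BioRuby, the three nested modules stand in for the separate files bio/ruby-1.8, bio/ruby-jvm and bio/ruby-1.9, and JRuby is assumed to be detectable through RUBY_PLATFORM containing "java".

```ruby
# Hypothetical sketch: ArchSketch and its nested modules are illustrative
# names, not part of BioRuby. Each nested module stands in for one
# "architecture file" such as bio/ruby-1.8.rb.
module ArchSketch
  module Ruby18
    def self.println_arch(text)
      "[1.8] #{text}"
    end
  end

  module RubyJvm
    def self.println_arch(text)
      "[jvm] #{text}"
    end
  end

  module Ruby19
    def self.println_arch(text)
      "[1.9] #{text}"
    end
  end

  IMPLEMENTATIONS = {
    'ruby-1.8' => Ruby18,
    'ruby-jvm' => RubyJvm,
    'ruby-1.9' => Ruby19
  }

  # Mirror the quoted pseudo-code: the JVM is only checked below 1.9.
  def self.name_for(version, jvm)
    older_than_19 = (version.split('.').map { |s| s.to_i } <=> [1, 9]) < 0
    if older_than_19
      jvm ? 'ruby-jvm' : 'ruby-1.8'
    else
      'ruby-1.9'
    end
  end

  # The single place where the interpreter is inspected.
  # Assumption: RUBY_PLATFORM contains "java" when running on JRuby.
  def self.current
    IMPLEMENTATIONS[name_for(RUBY_VERSION, RUBY_PLATFORM.include?('java'))]
  end

  # The one call the rest of the source tree would use.
  def self.println_arch(text)
    current.println_arch(text)
  end
end

puts ArchSketch.println_arch('hello')
```

With real architecture files, current would probably become a single require of the file named by name_for, executed once at load time, so the version test is never repeated in the rest of the code base.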

--
Kazuhiro Hayashi
Department of Computational Biology, The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988

From pjotr.public14 at thebird.nl Sat May 1 12:36:21 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sat, 1 May 2010 18:36:21 +0200
Subject: [BioRuby] participation in GSoC 2010
In-Reply-To:
References: <20100428053913.GA17564@thebird.nl> <4BD87D0F.60606@burnham.org> <20100430080825.GA6555@thebird.nl> <20100430081029.GA7221@thebird.nl>
Message-ID: <20100501163621.GA22397@thebird.nl>

On Sat, May 01, 2010 at 10:47:47PM +0900, Kazuhiro Hayashi wrote:
> If you know the appropriate libraries, please tell me them (rails is
> the best example?).

It probably is. I think it runs on all versions.

Pj.

From bonnalraoul at ingm.it Mon May 3 04:41:26 2010
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 3 May 2010 10:41:26 +0200
Subject: [BioRuby] Plug in system
Message-ID:

How do we manage plugin distribution and installation?
Do we plan to support gems, or plugins the way Rails does?
Going the Rails way, we need to adapt/port the plugin script and its
generators.
I prefer Rails' plugin system; it seems to me more agile than creating
and distributing gems if we consider github as our master
contribution/distribution hub.
I think that at the beginning we can reuse only the Rails init/config
system.
Rails is the main framework, but I'd like to have your opinion on
other available frameworks (ramaze).

Thanks.

--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 02 006 623 26
fax: +39 02 006 623 46
http://www.ingm.it

From k.hayashi.info at gmail.com Thu May 6 02:50:22 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Thu, 6 May 2010 15:50:22 +0900
Subject: [BioRuby] A class for GEO
Message-ID:

Hi,

I'd like to process GEO files with Ruby.
Are there any classes for GEO in BioRuby?

I found the related files in this repository.
http://github.com/search?q=bioruby+geo&type=Everything&repo=&langOverride=&start_value=1
http://github.com/pjotrp/bioruby/tree/cd65f5b94d47990917226a569b431fef822c1585/lib/bio/db/microarray

However, I was not able to find the files in the latest version of BioRuby.
http://github.com/bioruby/bioruby/

best regards.

Kazuhiro Hayashi

--
Kazuhiro Hayashi
Department of Computational Biology, The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988

From ngoto at gen-info.osaka-u.ac.jp Thu May 6 03:48:32 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 6 May 2010 16:48:32 +0900
Subject: [BioRuby] A class for GEO
In-Reply-To:
References:
Message-ID: <20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>

Hi Kazuhiro,

The GEO support was first proposed in September 2008, but it was
pending because of some matters, and no one wanted to work to merge.

This February, we agreed to include it, or to create a plug-in
package. It would be great if you or someone else could work on it.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Thu, 6 May 2010 15:50:22 +0900
Kazuhiro Hayashi wrote:

> Hi,
>
> I'd like to process GEO files with Ruby.
> Are there any classes for GEO in BioRuby?
>
> I found the related files in this repository.
> http://github.com/search?q=bioruby+geo&type=Everything&repo=&langOverride=&start_value=1
> http://github.com/pjotrp/bioruby/tree/cd65f5b94d47990917226a569b431fef822c1585/lib/bio/db/microarray
>
> However, I was not able to find the files in the latest version of BioRuby.
> http://github.com/bioruby/bioruby/
>
> best regards.
>
> Kazuhiro Hayashi
>
> --
> Kazuhiro Hayashi
> Department of Computational Biology, The University of Tokyo
> email: k_hayashi at cb.k.u-tokyo.ac.jp
> tel: 04-7136-3988
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From pjotr.public14 at thebird.nl Thu May 6 13:49:17 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 6 May 2010 19:49:17 +0200
Subject: [BioRuby] A class for GEO
In-Reply-To: <20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>
References: <20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <20100506174917.GA2188@thebird.nl>

On Thu, May 06, 2010 at 04:48:32PM +0900, Naohisa GOTO wrote:
> The GEO support was first proposed in September 2008, but it was
> pending because of some matters, and no one wanted to work to merge.

In my repository is a functional GEO XML file format parser, for GEO
experiments, array batches and microarray data. I used it for a meta
analysis.

The main issue was that my implementation caches downloaded XML files.
For 'security' reasons the implementation was rejected. And there were
some other minor syntax things. I never got round to making it BioRuby
compliant.

I have lost interest in microarrays, but you can simply pull my branch
and have a look at the implementation. It is not my greatest work,
though the code is succinct enough. It makes sense to make this a
plugin. I may pick it up again when there is enough deep sequencing
data on GEO.

Pj.

PS. Don't use the GEO SOFT format - it is not consistent.

From mitlox at op.pl Sat May 29 08:14:04 2010
From: mitlox at op.pl (xyz)
Date: Sat, 29 May 2010 22:14:04 +1000
Subject: [BioRuby] fastq files reading
Message-ID: <20100529221404.0175ee75@wp01>

Hello,
I would like to read at the same time two fastq files in order to
save them to fasta file.

require 'bio'
Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq') do |ff1|
  ff1.each do |entry1|

    header1 = entry1.entry_id
    seq1 = entry1.seq

    puts seq1.to_fasta(header1 + "qwa")

    #header2 = entry2.entry_id
    #seq2 = entry2.seq
    #puts seq2.to_fasta(header2 + "qwa")
  end
end

I have already the following code, but unfortunately I do not know
how to read both files at the same time.

How is it possible to read two files at the same time and write them
to fasta file?

Thank you in advance.

Best regards,

From anurag08priyam at gmail.com Sun May 30 07:27:30 2010
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Sun, 30 May 2010 16:57:30 +0530
Subject: [BioRuby] fastq files reading
In-Reply-To: <20100529221404.0175ee75@wp01>
References: <20100529221404.0175ee75@wp01>
Message-ID:

Hello xyz,

You should be able to solve this problem by parallel iteration over the two
files. An external iterator will be required here. You can call next on an
external iterator to get the next object. It will raise a StopIteration
exception when there is no more item to iterate over. You will have to add a
case to handle that too.

Give something like the following a try:

require 'bio'

#open the two files
one = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
two = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')

#get an external iterator for two
two_iterator = two.to_enum

#now iterate; "one" is already the flat file, so each yields its entries
one.each do |entry1|

  header1 = entry1.entry_id
  seq1 = entry1.seq

  puts seq1.to_fasta(header1 + "qwa")

  entry2 = two_iterator.next
  header2 = entry2.entry_id
  seq2 = entry2.seq
  puts seq2.to_fasta(header2 + "qwa")
end

#close the files
one.close
two.close

I did not have any fasta file to test it on, but it should work.

On Sat, May 29, 2010 at 5:44 PM, xyz wrote:

> Hello,
> I would like to read at the same time two fastq files in order to
> save them to fasta file.
>
> require 'bio'
> Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq') do |ff1|
>   ff1.each do |entry1|
>
>     header1 = entry1.entry_id
>     seq1 = entry1.seq
>
>     puts seq1.to_fasta(header1 + "qwa")
>
>     #header2 = entry2.entry_id
>     #seq2 = entry2.seq
>     #puts seq2.to_fasta(header2 + "qwa")
>   end
> end
>
> I have already the following code, but unfortunately I do not know
> how to read both files at the same time.
>
> How is it possible to read two files at the same time and write them
> to fasta file?
>
> Thank you in advance.
>
> Best regards,
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>

--
Anurag Priyam,
2nd Year Undergraduate,
Department of Mechanical Engineering,
IIT Kharagpur.
+91-9775550642

From ngoto at gen-info.osaka-u.ac.jp Sun May 30 10:31:55 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 30 May 2010 23:31:55 +0900
Subject: [BioRuby] fastq files reading
In-Reply-To:
References: <20100529221404.0175ee75@wp01>
Message-ID: <20100530233154.1B8A.EEF6E030@gen-info.osaka-u.ac.jp>

Hi,

The external iterator can be used with Ruby 1.8.7 or later.
(It can't be used with Ruby 1.8.6 or earlier.)
In addition, it takes many resources and is inefficient with
the current Ruby implementation. (In the future, it will be optimized.)

I think using Bio::FlatFile#next_entry is good in this case.
The next_entry method returns nil after the end of file.

In the following example, "entry1" and "entry2" are checked
every time if they are not nil (in "if entry1 then ... end"
and "if entry2 then ... end"). If you believe the two files
always have the same number of entries, the checks can be
skipped.

require 'bio'

ff1 = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
ff2 = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')

loop do
  # read one entry from each file; next_entry returns nil at end of file
  entry1 = ff1.next_entry
  entry2 = ff2.next_entry
  # stop only when both files are exhausted
  break unless entry1 || entry2

  if entry1 then
    header1 = entry1.entry_id
    seq1 = entry1.seq
    puts seq1.to_fasta(header1 + "qwa")
  end
  if entry2 then
    header2 = entry2.entry_id
    seq2 = entry2.seq
    puts seq2.to_fasta(header2 + "qwa")
  end
end

ff2.close
ff1.close

> Hello xyz,
>
> You should be able to solve this problem by parallel iteration over the two
> files. An external iterator will be required here. You can call next on an
> external iterator to get the next object. It will raise a StopIteration
> exception when there is no more item to iterate over. You will have to add a
> case to handle that too.
>
> Give something like the following a try:
>
> require 'bio'
>
> #open the two files
> one = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
> two = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')
>
> #get an external iterator for two
> two_iterator = two.to_enum
>
> #now iterate; "one" is already the flat file, so each yields its entries
> one.each do |entry1|
>
>   header1 = entry1.entry_id
>   seq1 = entry1.seq
>
>   puts seq1.to_fasta(header1 + "qwa")
>
>   entry2 = two_iterator.next
>   header2 = entry2.entry_id
>   seq2 = entry2.seq
>   puts seq2.to_fasta(header2 + "qwa")
> end
>
> #close the files
> one.close
> two.close
>
> I did not have any fasta file to test it on, but it should work.
>
> On Sat, May 29, 2010 at 5:44 PM, xyz wrote:
>
> > Hello,
> > I would like to read at the same time two fastq files in order to
> > save them to fasta file.
> >
> > require 'bio'
> > Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq') do |ff1|
> >   ff1.each do |entry1|
> >
> >     header1 = entry1.entry_id
> >     seq1 = entry1.seq
> >
> >     puts seq1.to_fasta(header1 + "qwa")
> >
> >     #header2 = entry2.entry_id
> >     #seq2 = entry2.seq
> >     #puts seq2.to_fasta(header2 + "qwa")
> >   end
> > end
> >
> > I have already the following code, but unfortunately I do not know
> > how to read both files at the same time.
> >
> > How is it possible to read two files at the same time and write them
> > to fasta file?
> >
> > Thank you in advance.
> >
> > Best regards,
> >
> >
> > _______________________________________________
> > BioRuby Project - http://www.bioruby.org/
> > BioRuby mailing list
> > BioRuby at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioruby
> >
>
>
>
> --
> Anurag Priyam,
> 2nd Year Undergraduate,
> Department of Mechanical Engineering,
> IIT Kharagpur.
> +91-9775550642
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
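
For readers who want to try the "read one entry from each file per step, stop when both return nil" pattern from this thread without installing the bio gem, here is a small plain-Ruby sketch. It does not use Bio::FlatFile or Bio::Fastq: next_fastq_record and interleave_as_fasta are hypothetical helper names, the parser assumes the simple four-line FASTQ layout (no wrapped sequences), and the whole header line after "@" is used as the identifier, which may differ from what entry_id returns in BioRuby.

```ruby
require 'stringio'

# Read one four-line FASTQ record from io.
# Returns [id, sequence], or nil at end of input (like next_entry in the thread).
# Assumption: records are exactly four lines, with no wrapped sequences.
def next_fastq_record(io)
  header = io.gets
  return nil if header.nil?
  sequence = io.gets
  io.gets # "+" separator line, ignored
  io.gets # quality line, ignored
  [header.chomp.sub(/\A@/, ''), sequence.to_s.chomp]
end

# Write the records of two FASTQ streams to out as interleaved FASTA.
# The loop ends only when both inputs are exhausted, so a longer file
# is still written out completely.
def interleave_as_fasta(io_a, io_b, out, suffix = 'qwa')
  loop do
    rec_a = next_fastq_record(io_a)
    rec_b = next_fastq_record(io_b)
    break unless rec_a || rec_b

    [rec_a, rec_b].each do |rec|
      next if rec.nil?
      out.puts ">#{rec[0]}#{suffix}"
      out.puts rec[1]
    end
  end
  out
end

# In-memory stand-ins for readsA.fastq and readsB.fastq.
reads_a = StringIO.new("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
reads_b = StringIO.new("@m1\nTTAA\n+\nIIII\n")

result = interleave_as_fasta(reads_a, reads_b, StringIO.new).string
puts result
```

With real files, the two StringIO objects would be replaced by File.open('readsA.fastq') and File.open('readsB.fastq'), and out by $stdout or an output file.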