From k.hayashi.info at gmail.com  Sat May  1 09:47:47 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Sat, 1 May 2010 22:47:47 +0900
Subject: [BioRuby] participation in GSoC 2010
In-Reply-To: <20100430081029.GA7221@thebird.nl>
References: <x2tb51ee1fd1004272112j3242390eq6512f1c8e92dab53@mail.gmail.com> 
	<20100428053913.GA17564@thebird.nl> <4BD87D0F.60606@burnham.org> 
	<o2wb51ee1fd1004290741x23936d9ew6505a2f6e8da347@mail.gmail.com> 
	<20100430080825.GA6555@thebird.nl> <20100430081029.GA7221@thebird.nl>
Message-ID: <z2xb51ee1fd1005010647zc6c87e5bm5a9013c2608e18ec@mail.gmail.com>

Hi,

Thank you so much for the explanation.

First of all, I will test the current BioRuby code with Ruby 1.9.2
in order to detect the classes which don't work.
Then, I will try to find ways to support 1.8 and 1.9 without
'if-then' statements, referring to other libraries.
If you know of appropriate libraries, please tell me about them (is Rails
the best example?).

kazuhiro

2010/4/30  <pjotr.public14 at thebird.nl>:
> Hi Kazuhiro,
>
> Please *reply* to the list.
>
> On Thu, Apr 29, 2010 at 11:41:14PM +0900, Kazuhiro Hayashi wrote:
>> At the moment, I am planning to put the code for 1.8.7, 1.9.2 and, if
>> possible, JRuby in a single code base.
>> I don't understand what the 'architecture' file is.
>> Could you explain it in a little more detail?
>
> All 1.8 stuff goes into one file. All 1.9 in another. So there is
> clear separation. When running Ruby 1.8 only that file gets
> 'required'. In pseudo-code.
>
> if ruby_version<1.9
>   if !isjvm?
>     require 'bio/ruby-1.8'
>   else
>     require 'bio/ruby-jvm'
>   end
> else
>   require 'bio/ruby-1.9'
> end
>
> Implementation specific stuff will go into these files (if possible).
> Say you have a different println implementation, rather than
> sprinkling the code base with:
>
> if ruby_version<1.9
>   if !isjvm?
>     println_1 ...
>   else
>     println_2 ...
>   end
> else
>   println_3 ...
> end
>
> You would 'hide' that in the architecture files. So you just get one
> call in the source tree:
>
>   println_arch ...
>
> with implementation in the different 'architecture' files.
>
> Pj.
>
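
The selection logic in the quoted pseudo-code could be sketched in Ruby
roughly as follows. This is only an illustration: the helper name
arch_file_for is hypothetical, the file names are taken from the
pseudo-code (they are not existing BioRuby files), and detecting the JVM
through RUBY_PLATFORM is an assumption about how "isjvm?" might be done.
The sketch only prints the file it would load instead of requiring it.

```ruby
# Hypothetical helper: pick the 'architecture' file for a Ruby version.
# File names come from the pseudo-code in this thread, not from BioRuby.
def arch_file_for(ruby_version, jvm)
  # Compare numerically, so that "1.10" is not treated as older than "1.9".
  parts = ruby_version.split('.').map { |part| part.to_i }
  if (parts <=> [1, 9]) < 0
    jvm ? 'bio/ruby-jvm' : 'bio/ruby-1.8'
  else
    'bio/ruby-1.9'
  end
end

# Assumption: JRuby reports "java" in RUBY_PLATFORM.
on_jvm = (RUBY_PLATFORM =~ /java/) ? true : false

arch_file = arch_file_for(RUBY_VERSION, on_jvm)
puts "would require '#{arch_file}'"
```

Each architecture file would then define the same method name (for
example println_arch), so the rest of the source tree never needs to
test the Ruby version itself.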


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From pjotr.public14 at thebird.nl  Sat May  1 12:36:21 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sat, 1 May 2010 18:36:21 +0200
Subject: [BioRuby] participation in GSoC 2010
In-Reply-To: <z2xb51ee1fd1005010647zc6c87e5bm5a9013c2608e18ec@mail.gmail.com>
References: <x2tb51ee1fd1004272112j3242390eq6512f1c8e92dab53@mail.gmail.com>
	<20100428053913.GA17564@thebird.nl> <4BD87D0F.60606@burnham.org>
	<o2wb51ee1fd1004290741x23936d9ew6505a2f6e8da347@mail.gmail.com>
	<20100430080825.GA6555@thebird.nl>
	<20100430081029.GA7221@thebird.nl>
	<z2xb51ee1fd1005010647zc6c87e5bm5a9013c2608e18ec@mail.gmail.com>
Message-ID: <20100501163621.GA22397@thebird.nl>

On Sat, May 01, 2010 at 10:47:47PM +0900, Kazuhiro Hayashi wrote:
> If you know the appropriate libraries, please tell me them (rails is
> the best example?).

It probably is. I think it runs on all versions.

Pj.

From bonnalraoul at ingm.it  Mon May  3 04:41:26 2010
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 3 May 2010 10:41:26 +0200
Subject: [BioRuby] Plug in system
Message-ID: <ac1809f6-5a66-4da3-a5e7-5bdc4d4349df@ingm.it>

How do we manage plugin distribution and installation?
Do we plan to use gems, or a plugin system like the one Rails has?
Going the Rails way, we need to adapt/port the plugin script and its generators.
Actually I prefer the Rails plugin system; it seems more agile to me than creating and distributing gems, if we consider GitHub as our master contribution/distribution hub.
I think that at the beginning we can reuse only the Rails init/config system.

Rails is the main framework, but I'd like to have your opinion on other available frameworks (e.g. Ramaze).

Thanks. 
--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 02 006 623  26
fax: +39 02 006 623 46
http://www.ingm.it


From k.hayashi.info at gmail.com  Thu May  6 02:50:22 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Thu, 6 May 2010 15:50:22 +0900
Subject: [BioRuby] A class for GEO
Message-ID: <n2hb51ee1fd1005052350ya34b52a1g47257d8847bbd367@mail.gmail.com>

Hi,

I'd like to process GEO files with Ruby.
Are there any classes for GEO in BioRuby?

I found the related files in this repository.
http://github.com/search?q=bioruby+geo&type=Everything&repo=&langOverride=&start_value=1
http://github.com/pjotrp/bioruby/tree/cd65f5b94d47990917226a569b431fef822c1585/lib/bio/db/microarray

However, I was not able to find the files in the latest version of BioRuby.
http://github.com/bioruby/bioruby/

best regards.

Kazuhiro Hayashi

-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988

From ngoto at gen-info.osaka-u.ac.jp  Thu May  6 03:48:32 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 6 May 2010 16:48:32 +0900
Subject: [BioRuby] A class for GEO
In-Reply-To: <n2hb51ee1fd1005052350ya34b52a1g47257d8847bbd367@mail.gmail.com>
References: <n2hb51ee1fd1005052350ya34b52a1g47257d8847bbd367@mail.gmail.com>
Message-ID: <20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>

Hi Kazuhiro,

GEO support was first proposed in September 2008, but it has been
pending because of some open issues, and no one has volunteered to do the merge.

This February, we agreed to include it, or to create a plug-in
package.

It would be great if you or someone else could work on it.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


On Thu, 6 May 2010 15:50:22 +0900
Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:

> Hi,
> 
> I'd like to process GEO files with Ruby.
> Are there any classes for GEO in BioRuby?
> 
> I found the related files in this repository.
> http://github.com/search?q=bioruby+geo&type=Everything&repo=&langOverride=&start_value=1
> http://github.com/pjotrp/bioruby/tree/cd65f5b94d47990917226a569b431fef822c1585/lib/bio/db/microarray
> 
> However, I was not able to find the files in the latest version of BioRuby.
> http://github.com/bioruby/bioruby/
> 
> best regards.
> 
> Kazuhiro Hayashi
> 
> -- 
> Kazuhiro Hayashi
> Department of Computational Biology,  The University of Tokyo
> email: k_hayashi at cb.k.u-tokyo.ac.jp
> tel: 04-7136-3988
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From pjotr.public14 at thebird.nl  Thu May  6 13:49:17 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 6 May 2010 19:49:17 +0200
Subject: [BioRuby] A class for GEO
In-Reply-To: <20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>
References: <n2hb51ee1fd1005052350ya34b52a1g47257d8847bbd367@mail.gmail.com>
	<20100506074832.7959C1CBC46D@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <20100506174917.GA2188@thebird.nl>

On Thu, May 06, 2010 at 04:48:32PM +0900, Naohisa GOTO wrote:
> The GEO support was first proposed in September 2008, but it was
> pending because of some matters, and no one wanted to work to merge.

In my repository is a functional GEO XML file format parser, for GEO
experiments, array batches and microarray data. I used it for a meta
analysis.

The main issue was that my implementation caches downloaded XML files.
For 'security' reasons the implementation was rejected. And there were
some other minor syntax things. I never got round to making it BioRuby
compliant.

I have lost interest in microarrays, but you can simply pull my
branch and have a look at the implementation. It is not my greatest
work, though the code is succinct enough.

It makes sense to make this a plugin.

I may pick it up again when there is enough deep sequencing data on
GEO.

Pj.

PS. Don't use the GEO SOFT format - it is not consistent.


From mitlox at op.pl  Sat May 29 08:14:04 2010
From: mitlox at op.pl (xyz)
Date: Sat, 29 May 2010 22:14:04 +1000
Subject: [BioRuby] fastq files reading
Message-ID: <20100529221404.0175ee75@wp01>

Hello,
I would like to read two fastq files at the same time in order to
save them to a fasta file.

require 'bio'
Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq') do |ff1|
  ff1.each do |entry1|

    header1 = entry1.entry_id
    seq1 = entry1.seq 
    
    puts seq1.to_fasta(header1 + "qwa")
    
    #header2 = entry2.entry_id
    #seq2 = entry2.seq 
    #puts seq2.to_fasta(header2 + "qwa")
  end
end

I already have the code above, but unfortunately I do not know
how to read both files at the same time.

How is it possible to read two files at the same time and write them
to a fasta file?

Thank you in advance.

Best regards,


From anurag08priyam at gmail.com  Sun May 30 07:27:30 2010
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Sun, 30 May 2010 16:57:30 +0530
Subject: [BioRuby] fastq files reading
In-Reply-To: <20100529221404.0175ee75@wp01>
References: <20100529221404.0175ee75@wp01>
Message-ID: <AANLkTil1Nrbd5ULu3T-esvbg8VXoCYojW53eCL72oihi@mail.gmail.com>

Hello xyz,

You should be able to solve this problem by iterating over the two
files in parallel. An external iterator is required here. You can call next on an
external iterator to get the next object. It raises a StopIteration
exception when there are no more items to iterate over. You will have to add a
case to handle that too.

Give something like the following a try:

require 'bio'

# open the two files
one = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
two = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')

# get an external iterator for two
two_iterator = two.to_enum

# now iterate; each yields one entry of readsA.fastq at a time
one.each do |entry1|
  header1 = entry1.entry_id
  seq1 = entry1.seq

  puts seq1.to_fasta(header1 + "qwa")

  # raises StopIteration if readsB.fastq has fewer entries
  entry2 = two_iterator.next
  header2 = entry2.entry_id
  seq2 = entry2.seq
  puts seq2.to_fasta(header2 + "qwa")
end

# close the files
one.close
two.close

I did not have any fastq file to test it on, but it should work.
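
The StopIteration case mentioned above can be handled with a rescue
around the call to next. A minimal sketch, assuming plain arrays as
stand-ins for the two opened files (the method name interleave is only
illustrative), and requiring Ruby 1.8.7 or later:

```ruby
# Arrays stand in for the two FlatFile objects; any object with an
# each method (and therefore to_enum) would work the same way.
def interleave(first, second)
  second_iterator = second.to_enum
  result = []
  first.each do |item1|
    result << item1
    begin
      result << second_iterator.next
    rescue StopIteration
      # second has no more items; keep going with first only
    end
  end
  # Note: items left over in second (when first is shorter) are ignored.
  result
end

p interleave(%w[a1 a2 a3], %w[b1 b2])
```

If silently continuing is not wanted, the rescue branch is the place to
warn that the two files have different numbers of entries.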



-- 
Anurag Priyam,
2nd Year Undergraduate,
Department of Mechanical Engineering,
IIT Kharagpur.
+91-9775550642

From ngoto at gen-info.osaka-u.ac.jp  Sun May 30 10:31:55 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 30 May 2010 23:31:55 +0900
Subject: [BioRuby] fastq files reading
In-Reply-To: <AANLkTil1Nrbd5ULu3T-esvbg8VXoCYojW53eCL72oihi@mail.gmail.com>
References: <20100529221404.0175ee75@wp01>
	<AANLkTil1Nrbd5ULu3T-esvbg8VXoCYojW53eCL72oihi@mail.gmail.com>
Message-ID: <20100530233154.1B8A.EEF6E030@gen-info.osaka-u.ac.jp>

Hi,

The external iterator can be used with Ruby 1.8.7 or later.
(It can't be used with Ruby 1.8.6 or earlier.)
In addition, it takes many resources and is inefficient with the current
Ruby implementation. (In the future, it will be optimized.)

I think using Bio::FlatFile#next_entry is good in this case.
The next_entry method returns nil after the end of the file.
In the following example, "entry1" and "entry2" are checked
every time to see that they are not nil (in "if entry1 then ... end" and
"if entry2 then ... end"). If you believe the two files always have
the same number of entries, the checks can be skipped.

  require 'bio'
  ff1 = Bio::FlatFile.open(Bio::Fastq, 'readsA.fastq')
  ff2 = Bio::FlatFile.open(Bio::Fastq, 'readsB.fastq')
  loop do
    # Read one entry from each file on every pass; next_entry
    # returns nil once a file is exhausted.
    entry1 = ff1.next_entry
    entry2 = ff2.next_entry
    break unless entry1 or entry2
    if entry1 then
      header1 = entry1.entry_id
      seq1 = entry1.seq
      puts seq1.to_fasta(header1 + "qwa")
    end
    if entry2 then
      header2 = entry2.entry_id
      seq2 = entry2.seq
      puts seq2.to_fasta(header2 + "qwa")
    end
  end
  ff2.close
  ff1.close
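
The nil-checking loop can be tried without any fastq files. A minimal
sketch, assuming a hypothetical ListReader class that only mimics the
behaviour described above (next_entry returns the next item, or nil
after the end); it is not part of BioRuby:

```ruby
# Hypothetical stand-in for an opened Bio::FlatFile.
class ListReader
  def initialize(items)
    @items = items.dup
  end

  # Returns the next item, or nil when nothing is left.
  def next_entry
    @items.shift
  end
end

# Read both readers in lock step until both are exhausted.
def interleave_entries(reader1, reader2)
  result = []
  loop do
    entry1 = reader1.next_entry
    entry2 = reader2.next_entry
    break unless entry1 || entry2
    result << entry1 if entry1
    result << entry2 if entry2
  end
  result
end

p interleave_entries(ListReader.new(%w[a1 a2 a3]), ListReader.new(%w[b1 b2]))
```

Both readers are advanced on every pass, so the output alternates
between the two sources, and the longer one is simply drained once the
shorter one returns nil.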



-- 
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

