From ktym at hgc.jp  Thu Sep  1 03:34:59 2005
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu Sep  1 03:52:27 2005
Subject: [BioRuby] BioRuby 0.6.4 released
Message-ID: <D7D8B130-1453-4B1A-8A66-DB8281B5F2D2@hgc.jp>

Hello,

We have released BioRuby 0.6.4 at

  http://bioruby.org/archive/bioruby-0.6.4.tar.gz

# Note that 0.6.3 was not announced, as some features
# were made backward incompatible.

This release contains the following improvements and a lot of bug fixes.

* siRNA designer class is contributed by Itoshi Nikaido.
  (lib/bio/util/sirna.rb)

* fastacmd wrapper is contributed by Shinji Shigenobu.
  (lib/bio/io/fastacmd.rb)

* bl2seq parser is contributed by Tomoaki Nishiyama.
  (lib/bio/appl/bl2seq/report.rb)

* new application execution factory is provided.
  (lib/bio/command.rb)

* FlatFile class can accept Blast results, Spidey, Blat, Sim4 and
  some KEGG formats (KO, GLYCAN, REACTION).

* some methods proposed by Luca Pireddu are added to the SPTr class.
  (lib/bio/db/embl/sptr.rb)

* external2go parser is added. (lib/bio/db/go.rb)

* improved amino/nucleic data classes to have some handy methods.
  (lib/bio/data/)

* fixed hmmer parser (by Masashi Fujita) and remote execution of
  blast and fasta using GenomeNet.

* some English documentation is added. (doc/)

See the ChangeLog file for more details.


On 2005/08/31, at 22:52, GOTO Naohisa wrote:
> I put ./doc/Tutorial.rd into BioRuby CVS repository.

This document and the KEGG API manual are included in the 0.6.4 release.

Regards,
Toshiaki Katayama

From ktym at hgc.jp  Thu Sep  1 05:30:20 2005
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu Sep  1 05:20:05 2005
Subject: [BioRuby] BioRuby Documentation effort
In-Reply-To: <4314C59B.6070306@corevx.com>
References: <20050613211634.GA28151@tm2.nmi-agro.nl>	<20050822175542.GA25133@tm2.nmi-agro.nl>	<200508240359.j7O3xCTv032390@portal.open-bio.org>	<20050824145606.GA6031@tm2.nmi-agro.nl>
	<546CB664-E9A4-4012-85D2-F787E464C0CE@hgc.jp>
	<4314C59B.6070306@corevx.com>
Message-ID: <0F62A9F5-1FF4-468B-87E2-A3184CB90C3E@hgc.jp>

Thanks Trevor,

We will update all documents to the RDoc format in the next few months.
English corrections are also appreciated.

Regards,
Toshiaki Katayama

On 2005/08/31, at 5:46, Trevor Wennblom wrote:

> I think now would be a good time to start documenting the source code with RDoc formatted comments.  For ease of use, I've included generated RDoc webpages here:
>
> http://corevx.com/bioruby-doc/
> http://corevx.com/bioruby-dev-doc/
>
> The only difference between the two is that 'bioruby-dev-doc' includes private and protected objects.
>
> I plan on keeping these up-to-date every week or two.  I hope this helps.
>
> Trevor
> _______________________________________________
> BioRuby mailing list
> BioRuby@open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioruby
>

From trevor at corevx.com  Thu Sep  8 18:05:13 2005
From: trevor at corevx.com (Trevor Wennblom)
Date: Thu Sep  8 17:54:09 2005
Subject: [BioRuby] More Bl2seq questions
In-Reply-To: <0F62A9F5-1FF4-468B-87E2-A3184CB90C3E@hgc.jp>
References: <20050613211634.GA28151@tm2.nmi-agro.nl>	<20050822175542.GA25133@tm2.nmi-agro.nl>	<200508240359.j7O3xCTv032390@portal.open-bio.org>	<20050824145606.GA6031@tm2.nmi-agro.nl>	<546CB664-E9A4-4012-85D2-F787E464C0CE@hgc.jp>	<4314C59B.6070306@corevx.com>
	<0F62A9F5-1FF4-468B-87E2-A3184CB90C3E@hgc.jp>
Message-ID: <4320B599.8060507@corevx.com>

Another question regarding Bl2seq.

This works:
  Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
    ff.each do |rep|
      rep.iterations.each do |itr|
        itr.hits.each do |hit|
          a = hit.identity
        end
      end
    end
  end

But this doesn't:
  Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
    ff.each do |rep|
      rep.iterations.each do |itr|
        itr.hits.each do |hit|
          a = hit.percent_identity
        end
      end
    end
  end

To make it work I have to go down to the HSP level: 
  Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
    ff.each do |rep|
      rep.iterations.each do |itr|
        itr.hits.each do |hit|
          hit.hsps.each do |hsp|
            a = hsp.percent_identity
          end
        end
      end
    end
  end


Shouldn't we be able to see percent_identity, percent_positive, 
positive, etc. on the hit level?

Trevor
From trevor at corevx.com  Thu Sep  8 18:10:54 2005
From: trevor at corevx.com (Trevor Wennblom)
Date: Thu Sep  8 17:59:50 2005
Subject: [BioRuby] Problems with Bl2seq parser
Message-ID: <4320B6EE.9030502@corevx.com>

Hi, I seem to be running into some problems with the new Bl2seq parser.

Ideally I'd like to be able to read the report the same way as blast 
reports:
 Bio::Blast::Bl2seq::Report.xmlparser(filename_bl2seq_output).each do |hit|
   a = hit.identity
 end

xmlparser is undefined however, so we can try this:
 Bio::Blast::Bl2seq::Report.new(filename_bl2seq_output).each do |hit|
   a = hit.identity
 end

This doesn't give an error, but it won't read the report either.  This 
can be double checked with:
Bio::Blast::Bl2seq::Report.new(filename_bl2seq_output).to_yaml

What I have to do is the following:
 Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
   ff.each do |rep|
     rep.iterations.each do |itr|
       itr.hits.each do |hit|
         a = hit.identity
       end
     end
   end
 end

That's an awful lot of work for what should be a one-liner.  Am I doing 
something wrong here?

Thanks,
Trevor

From ngoto at gen-info.osaka-u.ac.jp  Sun Sep 11 09:20:46 2005
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Sun Sep 11 09:23:42 2005
Subject: [BioRuby] Problems with Bl2seq parser
In-Reply-To: <4320B6EE.9030502@corevx.com>
References: <4320B6EE.9030502@corevx.com>
Message-ID: <200509111320.j8BDKjmm005900@idns103.gen-info.osaka-u.ac.jp>

Hi,

On Thu, 08 Sep 2005 17:10:54 -0500
Trevor Wennblom <trevor@corevx.com> wrote:

> Hi, I seem to be running into some problems with the new Bl2seq parser.
> 
> Ideally I'd like to be able to read the report the same way as blast 
> reports:
>  Bio::Blast::Bl2seq::Report.xmlparser(filename_bl2seq_output).each do |hit|
>    a = hit.identity
>  end
> 
> xmlparser is undefined however, so we can try this:
>  Bio::Blast::Bl2seq::Report.new(filename_bl2seq_output).each do |hit|
>    a = hit.identity
>  end
> 
> This doesn't give an error, but it won't read the report either.  This 
> can be double checked with:
> Bio::Blast::Bl2seq::Report.new(filename_bl2seq_output).to_yaml


Report.new takes the report text as a string (not a filename), so on
Ruby 1.8.0 or later you can write:

  Bio::Blast::Bl2seq::Report.new(IO.read(filename_bl2seq_output)).each do |hit|
    a = hit.identity
  end

> What I have to do is the following:
>  Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
>    ff.each do |rep|
>      rep.iterations.each do |itr|
>        itr.hits.each do |hit|
>          a = hit.identity
>        end
>      end
>    end
>  end

In Bio::Blast::Bl2seq::Report, Iteration is provided only for compatibility,
so you can skip iterations. In addition, rep.each works nearly the same as
rep.hits.each.

  Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
    ff.each do |rep|
      rep.each do |hit|
        a = hit.identity
      end
    end
  end

> That's an awful lot of work for what should be a one-liner.  Am I doing 
> something wrong here?
> 
> Thanks,
> Trevor

We're sorry for the lack of documentation.

-- 
Naohisa GOTO
ngoto@gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan
From ngoto at gen-info.osaka-u.ac.jp  Sun Sep 11 09:41:08 2005
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Sun Sep 11 09:32:17 2005
Subject: [BioRuby] More Bl2seq questions
In-Reply-To: <4320B599.8060507@corevx.com>
References: <20050613211634.GA28151@tm2.nmi-agro.nl>
	<20050822175542.GA25133@tm2.nmi-agro.nl>
	<200508240359.j7O3xCTv032390@portal.open-bio.org>
	<20050824145606.GA6031@tm2.nmi-agro.nl>
	<546CB664-E9A4-4012-85D2-F787E464C0CE@hgc.jp>
	<4314C59B.6070306@corevx.com>
	<0F62A9F5-1FF4-468B-87E2-A3184CB90C3E@hgc.jp>
	<4320B599.8060507@corevx.com>
Message-ID: <200509111330.j8BDU6AH025775@portal.open-bio.org>

Hi,

On Thu, 08 Sep 2005 17:05:13 -0500
Trevor Wennblom <trevor@corevx.com> wrote:

> Another question regarding Bl2seq.
> 
> This works:
>   Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
>     ff.each do |rep|
>       rep.iterations.each do |itr|
>         itr.hits.each do |hit|
>           a = hit.identity
>         end
>       end
>     end
>   end
> 
> But this doesn't:
>   Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
>     ff.each do |rep|
>       rep.iterations.each do |itr|
>         itr.hits.each do |hit|
>           a = hit.percent_identity
>         end
>       end
>     end
>   end
>
> To make it work I have to go down to the HSP level: 
>   Bio::FlatFile.open(Bio::Blast::Bl2seq::Report, filename_bl2seq_output) do |ff|
>     ff.each do |rep|
>       rep.iterations.each do |itr|
>         itr.hits.each do |hit|
>           hit.hsps.each do |hsp|
>             a = hsp.percent_identity
>           end
>         end
>       end
>     end
>   end
> 
> 
> Shouldn't we be able to see percent_identity, percent_positive, 
> positive, etc. on the hit level?

This is because bl2seq (and BLAST) reports percent_identity for every
Hsp and does not report percent_identity for a Hit. Hit#identity is only
a shortcut for hsps[0].identity. It is provided to keep compatibility
(polymorphism) with the Fasta::Hit class, which doesn't have Hsps.
percent_identity is not provided for Hit objects because Fasta::Hit does
not have it. Please use hsps[0].percent_identity instead. Note that the
percent identity of each Hsp means identity / alignment length of that
Hsp, so it cannot always be applied to the hit as a whole.
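
If a single figure per hit is really wanted, one option is to pool the
Hsps yourself. The following is only a minimal sketch: DemoHsp, DemoHit
and pooled_percent_identity are hypothetical stand-ins written for this
example (they are not BioRuby classes or methods), and it assumes each
Hsp exposes its identity count and alignment length. Whether pooling is
meaningful (for example with overlapping Hsps) depends on your data.

```ruby
# Stand-ins for parsed report objects (hypothetical, not BioRuby classes).
DemoHsp = Struct.new(:identity, :align_len) do
  # Percent identity of one Hsp: identical positions / alignment length.
  def percent_identity
    100.0 * identity / align_len
  end
end

DemoHit = Struct.new(:hsps)

# Pools identical positions and alignment lengths over all Hsps of a hit.
# Returns nil when the hit has no aligned positions.
def pooled_percent_identity(hit)
  total_len = hit.hsps.inject(0) { |sum, hsp| sum + hsp.align_len }
  return nil if total_len.zero?
  total_id = hit.hsps.inject(0) { |sum, hsp| sum + hsp.identity }
  100.0 * total_id / total_len
end

hit = DemoHit.new([DemoHsp.new(90, 100), DemoHsp.new(30, 60)])
first_hsp_value = hit.hsps[0].percent_identity   # what hsps[0] alone reports
pooled_value    = pooled_percent_identity(hit)   # pooled over both Hsps

p first_hsp_value   # ==> 90.0
p pooled_value      # ==> 75.0
```

The two numbers differ, which is the reason a percent identity taken
from one Hsp should not be read as a property of the whole hit.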

-- 
Naohisa GOTO
ngoto@gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan

From pjotr at pckassa.com  Fri Sep 16 03:04:53 2005
From: pjotr at pckassa.com (Pjotr Prins)
Date: Fri Sep 16 03:26:52 2005
Subject: [BioRuby] BioRuby Documentation effort
In-Reply-To: <200508311342.j7VDg8AH028193@portal.open-bio.org>
References: <20050613211634.GA28151@tm2.nmi-agro.nl>
	<20050822175542.GA25133@tm2.nmi-agro.nl>
	<200508311250.j7VCoPiQ016228@idns103.gen-info.osaka-u.ac.jp>
	<20050831131130.GA14210@tm2.nmi-agro.nl>
	<200508311342.j7VDg8AH028193@portal.open-bio.org>
Message-ID: <20050916070453.GA20133@tm2.nmi-agro.nl>

I edited the tutorial - and it is in CVS. I think it is a good start,
but what we also need is more on how a biologist should approach Ruby
and BioRuby. I.e. most of the examples are centered around access and 
some minor sequence alteration. What other good examples can we think
of that would set BioRuby apart from BioConductor and/or BioPerl, for
example?

Secondly, I would like to use some system where examples double as unit 
tests - like BioConductor and Python have. That should make certain
that examples are always correct. Any ideas on this?
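
To make the idea concrete, here is a rough sketch of what such a checker
could look like in plain Ruby. It assumes documentation examples mark
expected output with a "# ==>" comment; DOC_SNIPPET, fresh_scope and
check_examples are names made up for this example and are not part of
BioRuby or any existing tool. It evaluates the documentation text with
eval, so it should only be pointed at trusted files.

```ruby
# Documentation snippet whose "# ==>" comments state the expected output.
DOC_SNIPPET = <<'EOS'
  seqs = [ 'atgca', 'aagca' ]
  p seqs.size                      # ==> 2
  p seqs.first.upcase              # ==> "ATGCA"
  p seqs.collect { |s| s[0, 1] }   # ==> ["a", "a"]
EOS

# Returns an empty binding so snippet variables cannot clash with ours.
def fresh_scope
  binding
end

# Runs the snippet line by line in one shared binding and returns an array
# of [expression, expected, actual] for every check that failed.
def check_examples(text)
  scope = fresh_scope
  failures = []
  text.each_line do |line|
    code, expected = line.split('# ==>', 2)
    code = code.strip
    next if code.empty?
    if expected
      expr = code.sub(/\Ap\s+/, '')      # drop the leading "p "
      actual = eval(expr, scope).inspect
      failures << [expr, expected.strip, actual] unless actual == expected.strip
    else
      eval(code, scope)                  # set-up line, nothing to check
    end
  end
  failures
end

failures = check_examples(DOC_SNIPPET)
puts(failures.empty? ? 'all examples pass' : failures.inspect)
```

A unit test could then simply assert that check_examples returns an
empty array for each example in the tutorial.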

Pj.

On Wed, Aug 31, 2005 at 10:52:59PM +0900, GOTO Naohisa wrote:
> Hi,
> 
> > I translated Tutorial.rd.ja into English. 
> > It will soon be put into the ./doc directory in CVS.
> 
> I put ./doc/Tutorial.rd into BioRuby CVS repository.
> 
> I changed name of Wiki page from Tutorial.rd.en to Tutorial.rd.
> http://bioruby.org/wiki/English/?Tutorial.rd
> 
> In the document, I wrote many "TRANSLATOR'S NOTE". They are
> not only memos for readers but also reminders to change original
> Japanese document.
> 
> On Wed, 31 Aug 2005 15:11:30 +0200
> pjotr@pckassa.com (Pjotr Prins) wrote:
> 
> > Great! I will work on it.
> 
> Thanks!!!
> 
> -- 
> Naohisa GOTO
> ngoto@gen-info.osaka-u.ac.jp
> Department of Genome Informatics, Genome Information Research Center,
> Research Institute for Microbial Diseases, Osaka University, Japan
From email at woodsc.ca  Wed Sep 21 00:05:02 2005
From: email at woodsc.ca (Conan K Woods)
Date: Wed Sep 21 00:35:01 2005
Subject: [BioRuby] ABI Chromatograms
Message-ID: <20050921040502.GE22598@f00f.net>

Hi, I was wondering if it would be useful to have an object representing
ABI chromatograms in the bioruby package.  I coded up a Chromatogram
object in Perl a little while ago for a project I work on, and was
thinking it might be useful in Ruby?

Unfortunately my knowledge of the bioruby package is kind of limited (it's
hard to find good documentation!) so I'm not sure where it would fit in
with the rest of the code.  If anyone is interested in this, I'll get
coding sometime later this week.

Conan K Woods

From email at woodsc.ca  Wed Sep 21 16:13:02 2005
From: email at woodsc.ca (Conan K Woods)
Date: Wed Sep 21 16:13:05 2005
Subject: [BioRuby] alignment.rb
Message-ID: <20050921201302.GG22598@f00f.net>

I noticed that there was a file bio/alignment.rb.  This sounded 
interesting (and might be of use for me on another project) and I was 
wondering what it's for and how it's used?  It looked like it could be used 
to interface with an alignment program, but I'm not sure how it is used 
that way.


Conan K Woods


From ngoto at gen-info.osaka-u.ac.jp  Thu Sep 22 12:19:14 2005
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Thu Sep 22 12:35:31 2005
Subject: [BioRuby] alignment.rb
In-Reply-To: <20050921201302.GG22598@f00f.net>
References: <20050921201302.GG22598@f00f.net>
Message-ID: <200509221619.j8MGJf99020874@idns103.gen-info.osaka-u.ac.jp>

Hi,
I'm one of the authors of alignment.rb.

On Wed, 21 Sep 2005 13:13:02 -0700
Conan K Woods <email@woodsc.ca> wrote:

> I noticed that there was a file bio/alignment.rb.  This sounded 
> interesting (and might be of use for me on another project) and I was 
> wondering what it's for and how it's used?  It looked like it could be used 
> to interface with an alignment program, but I'm not sure how it is used 
> that way.

Bio::Alignment class in bio/alignment.rb is a container class
like Ruby's Hash, Array and BioPerl's Bio::SimpleAlign.
A very simple example is:

  require 'bio'

  seqs = [ 'atgca', 'aagca', 'acgca', 'acgcg' ]
  seqs = seqs.collect{ |x| Bio::Sequence::NA.new(x) }

  # creates alignment object
  a = Bio::Alignment.new(seqs)

  # shows consensus sequence
  p a.consensus             # ==> "a?gc?"

  # shows IUPAC consensus
  p a.consensus_iupac       # ==> "ahgcr"

  # iterates over each seq
  a.each { |x| p x }
    # ==>
    #    "atgca"
    #    "aagca"
    #    "acgca"
    #    "acgcg"

  # iterates over each site
  a.each_site { |x| p x }
    # ==>
    #    ["a", "a", "a", "a"]
    #    ["t", "a", "c", "c"]
    #    ["g", "g", "g", "g"]
    #    ["c", "c", "c", "c"]
    #    ["a", "a", "a", "g"]

  # doing alignment by using CLUSTAL W.
  # clustalw command must be installed.
  factory = Bio::ClustalW.new
  a2 = a.do_align(factory)

Note that Bio::Alignment has more methods.
Because it has too many methods and is very complicated,
I'm planning to refactor it and split its functions into
several modules. So, the specs and usage of its methods might
change in the near future.
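
For readers who want to see what the consensus lines above compute, here
is a small sketch in plain Ruby that does not use Bio::Alignment at all.
It is not the library's implementation: it assumes equal-length lowercase
sequences without gaps, a strict rule where a site gets '?' unless all
sequences agree, and IUPAC_DEMO is a lookup table written just for this
example.

```ruby
seqs = %w[atgca aagca acgca acgcg]

# Sites (columns) of the alignment: one array of bases per position.
sites = seqs.map { |s| s.split(//) }.transpose

# Strict consensus: the base if every sequence agrees, otherwise '?'.
strict = sites.map { |col| col.uniq.size == 1 ? col[0] : '?' }.join

# IUPAC ambiguity codes, keyed by the sorted set of bases seen at a site.
IUPAC_DEMO = {
  'a' => 'a', 'c' => 'c', 'g' => 'g', 't' => 't',
  'ag' => 'r', 'ct' => 'y', 'ac' => 'm', 'gt' => 'k', 'cg' => 's', 'at' => 'w',
  'act' => 'h', 'cgt' => 'b', 'acg' => 'v', 'agt' => 'd', 'acgt' => 'n'
}

iupac = sites.map { |col| IUPAC_DEMO.fetch(col.uniq.sort.join, '?') }.join

p strict   # ==> "a?gc?"
p iupac    # ==> "ahgcr"
```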

-- 
Naohisa GOTO
ngoto@gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan
From email at woodsc.ca  Thu Sep 22 14:27:13 2005
From: email at woodsc.ca (Conan K Woods)
Date: Thu Sep 22 14:28:23 2005
Subject: [BioRuby] alignment.rb
In-Reply-To: <20050922162008.BEF9B7F486@heap.f00f.net>
References: <20050921201302.GG22598@f00f.net>
	<20050922162008.BEF9B7F486@heap.f00f.net>
Message-ID: <20050922182713.GA4840@f00f.net>

Ah, I see.  That helps quite a bit.

From email at woodsc.ca  Thu Sep 22 23:22:01 2005
From: email at woodsc.ca (Conan K Woods)
Date: Thu Sep 22 23:22:38 2005
Subject: [BioRuby] ABI Chromatograms
In-Reply-To: <20050921040502.GE22598@f00f.net>
References: <20050921040502.GE22598@f00f.net>
Message-ID: <20050923032201.GD4840@f00f.net>

Ok, I've got a base implementation.  It loads up the edit sequence,
title, peak locations and the traces from a given ABI file.  It doesn't
do much else except provide a complement method.  If anyone has any ideas
of what else it needs I can try to add it in.

Not sure what the procedure is for donating code, so I'm just gonna put
it at http://www.woodsc.ca/code/Abi.ruby.  Feel free to use/not
use/modify as you like.

Conan K Woods

From pjotr at pckassa.com  Fri Sep 23 04:40:21 2005
From: pjotr at pckassa.com (Pjotr Prins)
Date: Fri Sep 23 05:08:22 2005
Subject: [BioRuby] alignment.rb
In-Reply-To: <200509221619.j8MGJf99020874@idns103.gen-info.osaka-u.ac.jp>
References: <20050921201302.GG22598@f00f.net>
	<200509221619.j8MGJf99020874@idns103.gen-info.osaka-u.ac.jp>
Message-ID: <20050923084021.GA32558@tm2.nmi-agro.nl>

I added this to the Tutorial.

Again, can I ask everyone - how do we plan to maintain unit tests and
make sure the code in the documentation is *not* broken? The code
should be only *once* in the source repository for a dual purpose -
and I would like to be sure it is correct.

If I were to write something I would like to have these code snippets
in the unit tests and parse them into the rd documentation when
generating documents with rdoc. Is there already something for this in
Ruby space? I know you can do an include in rd, but would need to do
some preparsing to lift only the interesting code out of the unit
test.
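
As a rough idea of that preparsing step, assuming a convention invented
here (marker comments "# doc-begin" and "# doc-end" around the
interesting lines of a test), a sketch could look like this. The marker
names, TEST_SOURCE, extract_doc_snippets and unindent are all
hypothetical; this is not an existing rd or rdoc feature.

```ruby
# Example unit test source with the documentation-worthy lines marked.
TEST_SOURCE = <<'EOS'
def test_first_sequence
  # doc-begin
  seqs = [ 'atgca', 'aagca' ]
  first = seqs.first
  # doc-end
  assert_equal('atgca', first)
end
EOS

# Strips the smallest leading indentation shared by all non-blank lines.
def unindent(lines)
  widths = lines.reject { |l| l.strip.empty? }.map { |l| l[/\A */].size }
  width = widths.min || 0
  lines.map { |l| l.strip.empty? ? "\n" : l[width..-1] }.join
end

# Returns an array of snippets; each snippet is the text between a
# "# doc-begin" marker and the following "# doc-end" marker.
def extract_doc_snippets(source)
  snippets = []
  current = nil
  source.each_line do |line|
    case line
    when /^\s*# doc-begin\s*$/
      current = []
    when /^\s*# doc-end\s*$/
      snippets << unindent(current) if current
      current = nil
    else
      current << line if current
    end
  end
  snippets
end

# Print each snippet indented by two spaces, as RD verbatim text.
extract_doc_snippets(TEST_SOURCE).each do |snippet|
  snippet.each_line { |l| print '  ', l }
end
```

The assertion stays in the test file only, while the marked lines are the
single copy of the example that ends up in the generated document.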


Anyway if I were to experiment with this would you welcome it - or
reject it?

Pj.

From pjotr at pckassa.com  Sat Sep 24 12:07:45 2005
From: pjotr at pckassa.com (Pjotr Prins)
Date: Sat Sep 24 12:08:01 2005
Subject: [BioRuby] alignment.rb
In-Reply-To: <20050925002000.E87F.NGOTO@gen-info.osaka-u.ac.jp>
References: <200509221619.j8MGJf99020874@idns103.gen-info.osaka-u.ac.jp>
	<20050923084021.GA32558@tm2.nmi-agro.nl>
	<20050925002000.E87F.NGOTO@gen-info.osaka-u.ac.jp>
Message-ID: <20050924160745.GA24247@tm2.nmi-agro.nl>

On Sun, Sep 25, 2005 at 12:51:38AM +0900, Naohisa Goto wrote:
> I think this may be very good for algorithmic or mathematical methods.
> I fear it might not be so good for database parsers, especially
> for complex structured data, because they depend on the structure of
> the original data, and I think we can hardly describe, in every
> method's unit test, each location in the original data that the
> method's return value is based on.

Most of the examples in the tutorial require a simple input string or
file - which can be tested against. For online/db examples I agree
that we should not depend on external tools. Nevertheless it usually
is quite possible to 'fake' something.

Pj.
From ngoto at gen-info.osaka-u.ac.jp  Sat Sep 24 11:51:38 2005
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sat Sep 24 12:08:10 2005
Subject: [BioRuby] alignment.rb
In-Reply-To: <20050923084021.GA32558@tm2.nmi-agro.nl>
References: <200509221619.j8MGJf99020874@idns103.gen-info.osaka-u.ac.jp>
	<20050923084021.GA32558@tm2.nmi-agro.nl>
Message-ID: <20050925002000.E87F.NGOTO@gen-info.osaka-u.ac.jp>

Hi,

> I added this to the Tutorial.

Thanks.
I backported it into the Japanese Tutorial.rd.ja.
I also made a few small changes to Tutorial.rd. Please refer to the
cvs commit log. (I'm sorry, I have not updated the Wiki yet.)

> Again, can I ask everyone - how do we plan to maintain unit tests and
> make sure the code in the documentation is *not* broken? The code
> should be only *once* in the source repository for a dual purpose -
> and I would like to be sure it is correct.
> 
> If I were to write something I would like to have these code snippets
> in the unit tests and parse them into the rd documentation when
> generating documents with rdoc. Is there already something for this in
> Ruby space? I know you can do an include in rd, but would need to do
> some preparsing to lift only the interesting code out of the unit
> test.
> 
> 
> Anyway if I were to experiment with this would you welcome it - or
> reject it?

I think this may be very good for algorithmic or mathematical methods.
I fear it might not be so good for database parsers, especially
for complex structured data, because they depend on the structure of
the original data, and I think we can hardly describe, in every
method's unit test, each location in the original data that the
method's return value is based on.

-- 
Naohisa GOTO
ngoto@gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan

From ktym at hgc.jp  Sun Sep 25 23:08:01 2005
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sun Sep 25 23:32:38 2005
Subject: [BioRuby] ABI Chromatograms
In-Reply-To: <20050923032201.GD4840@f00f.net>
References: <20050921040502.GE22598@f00f.net> <20050923032201.GD4840@f00f.net>
Message-ID: <E380D3E1-95A7-4570-9916-5DA4711F7D98@hgc.jp>

Hi Conan,

Thank you for your contributions!

On 2005/09/23, at 12:22, Conan K Woods wrote:
> Not sure what the procedure is for donating code, so I'm just gonna put
> it at http://www.woodsc.ca/code/Abi.ruby.  Feel free to use/not
> use/modify as you like.

I wrote a guideline for contributing code to the BioRuby project
and committed it to CVS as bioruby/README.DEV -- it is also attached
to this mail.

We would like to include your module; however, could you read this
document and apply the guideline to your code?

P.S.
Comments on the guideline are also welcome.

Regards,
Toshiaki Katayama


=begin

  $Id: README.DEV,v 1.2 2005/09/26 02:07:02 k Exp $

  Copyright (C) 2005 Toshiaki Katayama <k@bioruby.org>

= How to contribute to the BioRuby project?

There are many possible ways to contribute to the BioRuby project,
such as

* Join the discussion on the BioRuby mailing list
* Send a bug report, write a bug fix patch
* Add and correct documentation
* Develop code for new features etc.

and all of these are welcome!

Here, however, we describe the last option -- how to contribute your
code and have it included in the BioRuby distribution.

We would like to include your contribution as long as the scope of
your module meets the field of bioinformatics.

== Licence

If you wish your module to be included in the BioRuby distribution,
you need to agree to license your module under the GNU LGPL,
as BioRuby has chosen the LGPL for its license.

== Coding style

You need to follow the typical coding styles of the BioRuby modules:

=== Use the following naming conventions

* CamelCase for module and class names,
* '_'-separated lowercase names for method names,
* '_'-separated lowercase names for variable names, and
* all uppercase names for constants.

=== Indentation must not include tabs

* Use 2 spaces for indentation.
* Don't replace leading spaces with tabs.

=== Comments

Don't use =begin and =end blocks for comments.  If you need to add
comments, include them in the RDoc documentation.

=== Each file must start with the following text

  #
  # bio/foo/bar.rb - foo bar class
  #
  #   Copyright (C) 2005 Ruby B. Hacker <rbh@example.org>
  #
  #  This library is free software; you can redistribute it and/or
  #  modify it under the terms of the GNU Lesser General Public
  #  License as published by the Free Software Foundation; either
  #  version 2 of the License, or (at your option) any later version.
  #
  #  This library is distributed in the hope that it will be useful,
  #  but WITHOUT ANY WARRANTY; without even the implied warranty of
  #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  #  Lesser General Public License for more details.
  #
  #  You should have received a copy of the GNU Lesser General Public
  #  License along with this library; if not, write to the Free Software
  #  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307  USA
  #
  #  $Id:$
  #

=== Documentation should be written in the RDoc format in the source code

The RDoc format is becoming the de facto standard for Ruby documentation.

So, we are now in transition from the RD format to the RDoc format;
until now we have written the API documentation in the RD format and
put it at the end of each file.

Additional tutorial documents and some working examples are also
welcome when you contribute your code.

=== Testing code should use 'test/unit'

Unit tests should come with your modules, so that you can assure each
method does what you meant it to do.  The test code also helps to make
your maintenance procedure easy and stable.

=== Using autoload

To make start up fast, we have replaced most 'require' calls with
'autoload' since BioRuby version 0.7.  During this change, we have
found some tips:

You should not split the same namespace across several files.

* For example, if you have separated definitions of the Bio::Foo
  class in two files (e.g. 'bio/foo.rb' and 'bio/bar.rb'), you
  need to resolve the dependency of them (including the order of
  loading these two files) by yourself.

* This is not the case when you have a definition of Bio::Foo in
  'bio/foo.rb' and a definition of Bio::Foo::Bar in 'bio/foo/bar.rb'.
  In this case, you just need to add the following line to the
  'bio/foo.rb' file.

    autoload :Bar, 'bio/foo/bar'
   
You should not put several top level namespaces in one file.

* For example, if you have Bio::A, Bio::B and Bio::C in the file
  'bio/foo.rb', you need

    autoload :A, 'bio/foo'
    autoload :B, 'bio/foo'
    autoload :C, 'bio/foo'

  to load the module automatically (instead of require 'bio/foo').
  In this case, you should put them under a new namespace like
  Bio::Foo::A, Bio::Foo::B and Bio::Foo::C in the file 'bio/foo.rb',
  then use

    autoload :Foo, 'bio/foo'

  so that the autoload can be written in one line.
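
As a small illustration of the last tip, the following sketch can be run
on its own. It is only a demonstration under stated assumptions: DemoBio
and the demo_bio/foo.rb file are made-up stand-ins for Bio and
'bio/foo.rb', and the file is written to a temporary directory so that
nothing from BioRuby is needed.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical top-level namespace standing in for Bio.
module DemoBio
end

loaded_before = nil
greeting = nil

Dir.mktmpdir do |dir|
  # Write demo_bio/foo.rb, which defines the whole DemoBio::Foo namespace.
  FileUtils.mkdir_p(File.join(dir, 'demo_bio'))
  File.open(File.join(dir, 'demo_bio', 'foo.rb'), 'w') do |f|
    f.puts 'module DemoBio'
    f.puts '  module Foo'
    f.puts '    class A; def hello; "A from foo.rb"; end; end'
    f.puts '    class B; end'
    f.puts '  end'
    f.puts 'end'
  end
  $LOAD_PATH.unshift(dir)

  # One autoload line covers everything under DemoBio::Foo.
  DemoBio.autoload(:Foo, 'demo_bio/foo')

  # Nothing has been read yet; the file is loaded on first reference.
  loaded_before = $LOADED_FEATURES.any? { |path| path.end_with?('demo_bio/foo.rb') }
  greeting = DemoBio::Foo::A.new.hello
end

p loaded_before   # ==> false
p greeting        # ==> "A from foo.rb"
```

Inside the library itself the same registration would be written as a
plain autoload line in the enclosing module, as in the examples above.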

== Name space

Your module should be located under the module Bio and put under
the 'bioruby/lib/bio' directory.  The class/module names and the
file names should be short and descriptive.

There are already several sub directories in 'bioruby/lib':

  bio/*.rb   -- general and widely used basic classes
  bio/appl/  -- wrapper and parser for the external applications
  bio/data/  -- basic biological data
  bio/db/    -- flatfile database entry parsers
  bio/io/    -- I/O interfaces for files, RDB, web services etc.
  bio/util/  -- utilities and algorithms for bioinformatics

If your module doesn't fit any of the above, please propose
an appropriate directory name when you contribute.

== Maintenance

Finally, please keep maintaining the code you contributed.  The BioRuby
staff is willing to give you CVS privileges if needed.

=end