From jan.aerts at bbsrc.ac.uk  Mon Mar  6 09:21:54 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:21:54 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>

Hi all,

Given the posts about bioruby documentation in the last few months, my
own experiences with bioruby and a bit of encouragement from Toshiaki,
I'd like to commence documenting bioruby classes (in CVS) that are not
documented yet, and to standardize the documentation format for those
that already have documentation.

Documentation would take the form of rdoc, so that it would be browsable
via the www.bioruby.org/rdoc website.

Some guidelines that I would like to use in the documentation:
(1) Each class should have a description and synopsis. If there is a
unit test at the bottom, this can easily be tweaked into a synopsis. If
such a unit test is available, 'documenting' would mean (at least in
the first round) 'tweaking and copying the unit test in a comment in
front of the class'. Alternatively, unit tests and documentation could
be combined into one (as Ara and Pjotr discussed), but I'm not
experienced enough in ruby yet to do this in a simple, transparent way.
(2) Given the effort developers have put into writing the classes, it
would be nice if bioruby could reach as wide an audience as possible.
What I believe would help tremendously is a standardized format for
documentation. By this I mean that the following information is given
for each method (sort of like in bioperl documentation):
    * synopsis
    * description
    * function
    * what it returns
    * any arguments
(3) It should be made clear to the user if a class should be used
directly, or if it just supports other classes (e.g.
Bio::Sequence::Format). Additional important info would be interaction
with other classes (e.g. "how does the sequence class interact with the
embl class?"). Original module writers have an important role in
describing this context.
(4) Enclose the copyright information between '#--' and '#++', as it
distracts the user from what he/she wants to know. (It _is_ important,
but not for the average user...)
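
A minimal sketch of how guidelines (2) and (4) could look in a source file. rdoc stops processing comment text at a '#--' line and resumes at '#++', so the copyright block stays out of the generated pages. The names DocDemo, Greeter and greet, and the copyright line itself, are hypothetical placeholders and are not taken from bioruby.

```ruby
#--
# Copyright (C) 2006 A. N. Author (hypothetical placeholder)
# Licence text would go here; rdoc skips everything between '#--' and '#++'.
#++

module DocDemo
  # = DESCRIPTION
  # DocDemo::Greeter is a made-up class that only illustrates the layout.
  #
  # = SYNOPSIS
  #   g = DocDemo::Greeter.new('bioruby')
  #   puts g.greet
  class Greeter
    # Usage::      g = DocDemo::Greeter.new('bioruby')
    # Function::   Stores the name to greet
    # Returns::    a DocDemo::Greeter object
    # Arguments::  name (String), defaults to 'world'
    def initialize(name = 'world')
      @name = name
    end

    # Usage::      g.greet
    # Function::   Builds a greeting for the stored name
    # Returns::    a String
    # Arguments::  none
    def greet
      "Hello, #{@name}!"
    end
  end
end

puts DocDemo::Greeter.new('bioruby').greet
```

The labels are the ones proposed in guideline (2); which of them end up mandatory is still open for discussion.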


Example of class documentation (from sequence.rb):
# = DESCRIPTION
# The Bio::Sequence class generically describes a nucleic or amino acid
# sequence and is a superclass of Bio::Sequence::NA and Bio::Sequence::AA.
# Most methods that can be used on Bio::Sequence objects are described
# in Bio::Sequence::Common, Bio::Sequence::NA and Bio::Sequence::AA.
#
# If possible, create sequence objects using the Bio::Sequence::NA or
# Bio::Sequence::AA classes instead, as the Bio::Sequence class will
# have to guess the type of sequence you're talking about.
# 
# = SYNOPSIS
#   # Create a nucleic or amino acid sequence
#   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
#   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
#   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
# 
#   # Print it out
#   puts dna.to_s
#   puts aa.to_s
# 
#   # Get a subsequence, bioinformatics style (first nucleotide is '1')
#   puts dna.subseq(2,6)
# 
#   #...more examples from the unit test

Example of method documentation (from sequence.rb):
  # Usage:
  #    my_seq = Bio::Sequence.new('AGGCACGAT')
  #    my_na = my_seq.na
  # Function::   Converts the Bio::Sequence object into a
  #              Bio::Sequence::NA object
  # Returns::    a Bio::Sequence::NA object
  # Arguments::  none
  def na
    @seq = NA.new(@seq)
    @moltype = NA
  end
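
One thing worth checking against a "Returns" line like the one above: a Ruby method returns the value of its last expression, so a method whose last line assigns a class constant hands back that class rather than an instance. The sketch below uses stand-in classes to show the difference. MiniNA, MiniSeq, na_as_written and na_returning_object are hypothetical names, not the bioruby classes or methods.

```ruby
# Stand-in for a nucleic acid sequence class (hypothetical, not Bio::Sequence::NA)
class MiniNA < String
end

class MiniSeq
  attr_reader :seq, :moltype

  def initialize(str)
    @seq = str
    @moltype = nil
  end

  # Same shape as the method quoted above: the last expression is the
  # assignment to @moltype, so the return value is the class MiniNA.
  def na_as_written
    @seq = MiniNA.new(@seq)
    @moltype = MiniNA
  end

  # Variant whose last expression is the converted sequence, which
  # matches a "Returns:: an NA object" description.
  def na_returning_object
    @moltype = MiniNA
    @seq = MiniNA.new(@seq)
  end
end

p MiniSeq.new('AGGCACGAT').na_as_written        # the class itself
p MiniSeq.new('AGGCACGAT').na_returning_object  # a MiniNA instance
```

Whichever behaviour is intended, the "Returns" entry and the last line of the method should agree.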

As the time I can work on this is limited, expect to see gradual
additions to the CVS repository. Anyone else wishing to help out
is very welcome!

Of course, I promise not to touch other people's code, unless they
explicitly tell me to.

Any thoughts/suggestions on this?

Kind regards,

Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+33 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only.   The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute. 


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 09:55:08 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:55:08 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Ryan,

(First of all: I think you sent this message to me alone, instead of the
bioruby mailing list....)

Glad to get the documentation discussion started again... The "as a way
of thoroughly understanding the use and structure of the classes" sounds
familiar...

What do you think of using a standardized or (sounds ugly:) formal
format? Does your documentation include some of the
synopsis/description/function/what it returns/arguments things? Do you
think it is useful/feasible to put them in that format?

Thanks,
jan.

> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu] 
> Sent: 06 March 2006 14:41
> To: jan aerts (RI)
> Subject: Re: [BioRuby] bioruby documentation
> 
> Good Morning All,
> 
> I've had similar thoughts to Jan, and am a couple methods away 
> from completely documenting Bio::Sequence::* .  I was hoping 
> to send that in to Toshiaki later today.  I haven't yet 
> written a synopsis or description for them, mainly because I 
> was using the process of documenting all the methods as a way 
> of thoroughly understanding the use and structure of the 
> classes.  If the documentation I've currently written is seen 
> as reasonable and accepted, I would then add the overview 
> documentation for those classes and files.
> 
> Is there somewhere we can note which parts different people 
> are working on documenting, so as to avoid any duplication of effort?
> 
> Best!
> 
> -Ryan
> 


From rlr215 at nyu.edu  Mon Mar  6 10:14:06 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 10:14:06 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>


Hello again everyone!

>
> What do you think of using a standardized or (sounds ugly:) formal
> format? Does your documentation include some of the
> synopsis/description/function/what it returns/arguments things? Do you
> think it is useful/feasible to put them in that format?

I think a reasonable standardization is a good thing, especially at the 
overview level of the class or module or whatever.  Here's an example 
of what I've been writing for method documentation:

(This is for subseq in Bio::Sequence::Common)

   # Returns a new sequence containing the subsequence identified by the
   # start and end numbers given as parameters.  *Important:* Biological
   # sequence numbering conventions (one-based) rather than ruby's
   # (zero-based) numbering conventions are used.
   #
   #   s = Bio::Sequence::Generic.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string,
   # so subseq called without any parameters simply returns a new sequence
   # identical to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   #
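
The one-based convention described in that comment can be sketched on a plain Ruby String. This is an illustration only, not the Bio::Sequence::Common implementation; the function name one_based_subseq and its parameter names are hypothetical.

```ruby
# Returns the characters from position first to position last (both
# one-based and inclusive), or nil when the positions are not valid.
def one_based_subseq(seq, first = 1, last = seq.length)
  return nil if first < 1 || last < first
  # Ruby strings are zero-based: shift the start by one and turn the
  # inclusive end position into a length.
  seq[first - 1, last - first + 1]
end

s = 'atggaatga'
puts one_based_subseq(s, 1, 3)   # one-based positions 1 to 3
puts s[0, 3]                     # the same slice in Ruby's zero-based terms
puts one_based_subseq(s)         # defaults cover the whole string
```

The first two calls print the same three letters, which is exactly the off-by-one that the *Important:* note warns about.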

So, I haven't been writing enormously formal specs - which seem like a 
bit of overkill for most of the methods, and rdoc takes care of the 
basics of argument lists.  Otherwise I note what to expect in return, 
or if the method does or does not modify the current object.  Also if 
there are any things that are dangerous or tricky...  I also give an 
example for all methods.

It seems to me, and this is surely open to discussion, that formalizing 
the individual method descriptions too much makes them enormously 
tedious to write - so much so that very few will ever get written.  
BUT, on the class or module level, I think a certain amount of 
formalization is good, so that the overviews are reasonably consistent.

Best,

-Ryan



From jan.aerts at bbsrc.ac.uk  Mon Mar  6 13:21:53 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:21:53 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>

Ryan,

Nice piece of doc. I completely agree that the level of formalization is open to discussion, and I understand your concerns. On the other hand, a formalized list of things to describe can, in my opinion, _help_ developers document their code rather than keep them from doing so. You can see it as a checklist of things to document. In your piece of code you describe several aspects of the subseq method, but for every new method you describe, you'd need to keep this list in the back of your head ("did I mention that it returns itself?", "did I mention what the defaults for the arguments are?", ...). If this list were accessible on the wiki, any developer could copy it into their code and fill it in like a checklist. I suspect that would make things much easier on the developer (but that's my own view, of course).

You're right that rdoc already takes care of argument lists, but it only lists them rather than describing them. In many instances a bioruby user needs to know what the arguments actually are (including their defaults) without reading the code. Ergo: arguments should be documented.
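
As a sketch of what such a copy-and-fill checklist might look like, here is a subseq description recast into the Usage/Function/Returns/Arguments labels from the first message of this thread. The class DemoSeq is a hypothetical stand-in built on String, not the bioruby implementation; the point is only that each argument and its default gets its own line.

```ruby
# Hypothetical stand-in for a sequence class; not part of bioruby.
class DemoSeq < String
  # Usage::      s = DemoSeq.new('atggaatga')
  #              s.subseq(1, 3)
  # Function::   Extracts a subsequence using one-based, inclusive positions
  # Returns::    a new DemoSeq (the receiver is not modified), or nil if
  #              the positions are not valid
  # Arguments::  first - start position, one-based (default: 1)
  #              last  - end position, one-based and inclusive
  #                      (default: length of the sequence)
  def subseq(first = 1, last = length)
    return nil if first < 1 || last < first
    piece = self[first - 1, last - first + 1]
    # Wrap the slice explicitly so the result is a DemoSeq regardless of
    # what String#[] returns for subclasses.
    piece && DemoSeq.new(piece)
  end
end

s = DemoSeq.new('atggaatga')
puts s.subseq(1, 3)
puts s.subseq
```

Filling in the four labels answers the "did I mention..." questions by construction, at the cost of a few more comment lines per method.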

What do you think?
jan


-----Original Message-----
From: Ryan Raaum [mailto:rlr215 at nyu.edu]
Sent: Mon 3/6/2006 3:14 PM
To: jan aerts (RI)
Cc: bioruby at open-bio.org
Subject: Re: [BioRuby] bioruby documentation
 

Hello again everyone!

>
> What do you think of using a standardized or (sound ugly:) formal
> format? Does your documentation include some of the
> synopsis/description/function/what it returns/arguments things? Do you
> think it is useful/feasible to put them in that format?

I think a reasonable standardization is a good thing, especially at the 
overview level of the class or module or whatever.  Here's an example 
of what I've been writing for method documentation:

(This is for subseq in Bio::Sequence::Common)

   # Returns a new sequence containing the subsequence identified by the 
start
   # and end numbers given as parameters.  *Important:* Biological 
sequence
   # numbering conventions (one-based) rather than ruby's (zero-based) 
numbering
   # conventions are used.
   #
   #   s = Bio::Sequence::Generic.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string, 
so
   # subseq called without any parameters simply returns a new sequence 
identical
   # to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   #

So, I haven't been writing enormously formal specs - which seem like a 
bit of overkill for most of the methods, and rdoc takes care of the 
basics of argument lists.  Otherwise I note what to expect in return, 
or if the method does or does not modify the current object.  Also if 
there are any things that are dangerous or tricky...  I also give an 
example for all methods.

It seems to me, and this is surely open to discussion, that formalizing 
the individual method descriptions too much makes them enormously 
tedious to write - so much so that very few will ever get written.  
BUT, on the class or module level, I think a certain amount of 
formalization is good, so that the overviews are reasonably consistent.

Best,

-Ryan

>
> Thanks,
> jan.
>
>> -----Original Message-----
>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>> Sent: 06 March 2006 14:41
>> To: jan aerts (RI)
>> Subject: Re: [BioRuby] bioruby documentation
>>
>> Good Morning All,
>>
>> I've had similar toughts to Jan, and am a couple methods away
>> from completely documenting Bio::Sequence::* .  I was hoping
>> to send that in to Toshiaki later today.  I haven't yet
>> written a synopsis or description for them, mainly because I
>> was using the process of documenting all the methods as a way
>> of thoroughly understanding the use and structure of the
>> classes.  If the documentation I've currently written is seen
>> as reasonable and accepted, I would then add the overview
>> documentation for those classes and files.
>>
>> Is there somewhere we can note which parts different people
>> are working on documenting, so as to avoid any duplication of effort?
>>
>> Best!
>>
>> -Ryan
>>
>> On Mar 6, 2006, at 9:21 AM, jan aerts (RI) wrote:
>>
>>> Hi all,
>>>
>>> Given the posts about bioruby documentation in the last few
>> months, my
>>> own experiences with bioruby and a bit of encouragement
>> from Toshiaki,
>>> I'd like to commence documenting bioruby classes (in CVS)
>> that are not
>>> documented yet, and to standardize the documentation format
>> for those
>>> that already have documentation.
>>>
>>> Documentation would take the form of rdoc, so that it would be
>>> browsable via the www.bioruby.org/rdoc website.
>>>
>>> Some guidelines that I would like to use in the documentation:
>>> (1) Each class should have a description and synopsis. If
>> there is a
>>> unit test at the bottom, this can easily be tweaked into a
>> synopsis.
>>> If such a unit test is available, 'documentating' would
>> mean (at least
>>> in the first round) 'tweaking and copying the unit test in
>> a comment
>>> in front of the class'. Alternatively, unit tests and documentation
>>> could be combined into one (as Ara and Pjotr discussed),
>> but I'm not
>>> experienced enough in ruby yet to do this in a simple,
>> transparent way.
>>> (2) Given the effort developers have put into writing the
>> classes, it
>>> would be nice if bioruby could reach as wide an audience as
>> possible.
>>> What I believe would help tremendously, is a standardized
>> format for
>>> documentation. By this I mean that the following
>> information is given
>>> for each method (sort of like in bioperl documentation):
>>>     * synopsis
>>>     * description
>>>     * function
>>>     * what it returns
>>>     * any arguments
>>> (3) It should be made clear to the user if a class should be used
>>> directly, or if it just supports other classes (e.g.
>>> Bio::Sequence::Format). Additional important info would be
>> interaction
>>> with other classes (e.g. "how does the sequence class interact with
>>> the embl class?"). Original module writers have an
>> important role in
>>> describing this context.
>>> (4) Encapsule the copyright information between '#--' and
>> '#++', as it
>>> distracts the user from what he/she wants to know. (It _is_
>> important,
>>> but not for the average user...)
>>>
>>>
>>> Example of class documentation (from sequence.rb):
>>> # = DESCRIPTION
>>> # The Bio::Sequence class generically describes a nucleic or amino
>>> acid sequence and is a superclass of # Bio::Sequence::NA and
>>> Bio::Sequence::AA. Most methods that can be used on Bio::Sequence
>>> objects are described # in Bio::Sequence::Common, Bio::Sequence::NA
>>> and Bio::Sequence::AA # # If possible, create sequence
>> objects using
>>> the Bio::Sequence::NA or Bio::Sequence::AA classes instead, as the
>>> Bio::Sequence # class will have to guess the type of
>> sequence you're
>>> talking about.
>>> #
>>> # = SYNOPSIS
>>> #   # Create a nucleic or amino acid sequence
>>> #   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
>>> #   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
>>> #   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
>>> #
>>> #   # Print it out
>>> #   puts dna.to_s
>>> #   puts aa.to_s
>>> #
>>> #   # Get a subsequence, bioinformatics style (first
>> nucleotide is '1')
>>> #   puts dna.subseq(2,6)
>>> #
>>> #   #...more examples from the unit test
>>>
>>> Example of method documentation (from sequence.rb):
>>>   # Usage:
>>>   #    my_seq = Bio::Sequence('AGGCACGAT')
>>>   #    my_na = my_seq.na
>>>   # Function::   Converts the Bio::Sequence object into a
>>> Bio::Sequence::NA object
>>>   # Returns::    a Bio::Sequence::NA object
>>>   # Arguments::  none
>>>   def na
>>>     @seq = NA.new(@seq)
>>>     @moltype = NA
>>>   end
>>>
>>> As the time I can work on this is limited, expect to see gradual
>>> additions to the cvs repository. Anyone else wishing to help out
>>> is very welcome!!
>>>
>>> Of course, I promise not to touch other people's code, unless they
>>> explicitly tell me to.
>>>
>>> Any thoughts/suggestions on this?
>>>
>>> Kind regards,
>>>
>>> Jan Aerts, PhD
>>> Bioinformatics Group
>>> Roslin Institute
>>> Roslin, Scotland, UK
>>> +33 131 527 4200
>>
>>


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 13:41:52 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:41:52 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
	<440C7A13.5070307@corevx.com>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6615@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Trevor,

I agree that, if at all, only public methods should be documented. And, as you say, one- or two-line methods become commonplace. However, we have to keep in mind that the end user is the target for the docs. Having used BioPerl a lot myself, I found the included method docs almost always sufficient for using the methods. The fact that some BioPerl developers did not supply adequate information (even within that formal format) probably means that they would not have provided any documentation at all if not for that standard.

If the consensus is _not_ to document the methods, I'll of course go along with that. What do the heavy-weights think?
jan.




From rlr215 at nyu.edu  Mon Mar  6 13:46:12 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 13:46:12 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <4db8cd9e5949c71fdba3ec26ffa389dd@nyu.edu>

Hi all (again!),

Putting the formalization into a more concrete perspective, compare:

an example from the bioperl docs:
http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/Tools/SeqStats.html

and an example from the Ruby on Rails docs:
http://api.rubyonrails.org/classes/ActionController/Base.html

The bioperl example is very formalized, so it is true that nothing is 
left out.  However, it doesn't read very well and most of the method 
documentation ends up being highly repetitive:  (To caricature... :)

Title   : do_something
Usage   : Object.do_something
Function: does something
Returns : something
Args    : precursor to something

Whereas (in my mind) the Rails documentation reads very well: simple 
methods are simply documented, complex methods are documented in 
detail.  If the arguments are absent or obvious, don't talk about them; 
if the arguments are tricky, do talk about them. And so on.  No one 
really *wants* to document, and if documenting is annoying (= overly 
formalized), no one will.

I think a consistent, relatively formalized overview is good, but that 
overly formalized method and attribute documentation guidelines 
ultimately mean that little to no documentation will get done because 
it's too annoying (in most real-world open source projects).
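
To make the contrast concrete, here is a minimal sketch of the "simple
methods simply documented, complex methods documented in detail" style.
ToySequence, residues and slice_bio are hypothetical names invented for
this illustration; they are not BioRuby classes or methods.

```ruby
# ToySequence is a hypothetical stand-in for illustration only; it is
# not part of BioRuby.
class ToySequence < String
  # Returns the sequence length in residues.
  def residues
    length
  end

  # Returns a new ToySequence holding the residues from +first+ to
  # +last+, counted from 1 as biologists do (not from 0 as Ruby does).
  #
  #   s = ToySequence.new('atggaatga')
  #   s.slice_bio(1, 3)   #=> "atg"
  #
  # +last+ defaults to the end of the sequence. Raises ArgumentError if
  # +first+ is smaller than 1, because position 0 has no meaning here.
  def slice_bio(first = 1, last = length)
    raise ArgumentError, 'first must be >= 1' if first < 1
    self.class.new(self[(first - 1)..(last - 1)])
  end
end
```

The one-line method gets a one-line comment; only the method with a
numbering pitfall and a default argument gets the longer treatment.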

Best,

Ryan

On Mar 6, 2006, at 1:21 PM, jan aerts (RI) wrote:

> Ryan,
>
> Nice piece of doc. I completely agree that the level of formalization 
> is entirely open to discussion. And I completely understand your 
> concerns. But on the other hand, a formalized list of things to be 
> described can, in my opinion, _help_ developers document their code, 
> rather than it would keep them from doing that. You can see it as a 
> checklist of things to document. In your piece of code, you describe 
> several aspects of the subseq method, but for every new method you'd 
> describe, you'd need to have this list of things in the back of your 
> head that you have to mention ("did I mention that it returns itself?" 
> "did I mention what the defaults for the arguments are", ...). If we 
> would have this list accessible on the wiki for any developer, he/she 
> could copy it into their code and fill it in like a checklist. I 
> suspect that would make things much easier on the developer (but 
> that's my own view, of course).
>
> You're right that rdoc already takes care of argument lists, but it 
> only lists them, instead of describing them. And in many instances, a 
> bioruby user would have to know what the arguments actually are 
> (including their defaults) without going into the code. Ergo: 
> arguments should be documented.
>
> What do you think?
> jan
>
>
> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> Sent: Mon 3/6/2006 3:14 PM
> To: jan aerts (RI)
> Cc: bioruby at open-bio.org
> Subject: Re: [BioRuby] bioruby documentation
>
>
> Hello again everyone!
>
>>
>> What do you think of using a standardized or (sound ugly:) formal
>> format? Does your documentation include some of the
>> synopsis/description/function/what it returns/arguments things? Do you
>> think it is useful/feasible to put them in that format?
>
> I think a reasonable standardization is a good thing, especially at the
> overview level of the class or module or whatever.  Here's an example
> of what I've been writing for method documentation:
>
> (This is for subseq in Bio::Sequence::Common)
>
>    # Returns a new sequence containing the subsequence identified by the
>    # start and end numbers given as parameters.  *Important:* Biological
>    # sequence numbering conventions (one-based) rather than ruby's
>    # (zero-based) numbering conventions are used.
>    #
>    #   s = Bio::Sequence::Generic.new('atggaatga')
>    #   puts s.subseq(1,3)                      #=> "atg"
>    #
>    # Start defaults to 1 and end defaults to the entire existing string,
>    # so subseq called without any parameters simply returns a new
>    # sequence identical to the existing sequence.
>    #
>    #   puts s.subseq                           #=> "atggaatga"
>    #
>
> So, I haven't been writing enormously formal specs - which seem like a
> bit of overkill for most of the methods, and rdoc takes care of the
> basics of argument lists.  Otherwise I note what to expect in return,
> or if the method does or does not modify the current object.  Also if
> there are any things that are dangerous or tricky...  I also give an
> example for all methods.
>
> It seems to me, and this is surely open to discussion, that formalizing
> the individual method descriptions too much makes them enormously
> tedious to write - so much so that very few will ever get written.
> BUT, on the class or module level, I think a certain amount of
> formalization is good, so that the overviews are reasonably consistent.
>
> Best,
>
> -Ryan
>
>>
>> Thanks,
>> jan.
>>
>>> -----Original Message-----
>>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>>> Sent: 06 March 2006 14:41
>>> To: jan aerts (RI)
>>> Subject: Re: [BioRuby] bioruby documentation
>>>
>>> Good Morning All,
>>>
>>> I've had similar thoughts to Jan, and am a couple methods away
>>> from completely documenting Bio::Sequence::* .  I was hoping
>>> to send that in to Toshiaki later today.  I haven't yet
>>> written a synopsis or description for them, mainly because I
>>> was using the process of documenting all the methods as a way
>>> of thoroughly understanding the use and structure of the
>>> classes.  If the documentation I've currently written is seen
>>> as reasonable and accepted, I would then add the overview
>>> documentation for those classes and files.
>>>
>>> Is there somewhere we can note which parts different people
>>> are working on documenting, so as to avoid any duplication of effort?
>>>
>>> Best!
>>>
>>> -Ryan
>>>
>
>


From trevor at corevx.com  Mon Mar  6 13:06:11 2006
From: trevor at corevx.com (Trevor Wennblom)
Date: Mon, 06 Mar 2006 12:06:11 -0600
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <440C7A13.5070307@corevx.com>

jan aerts (RI) wrote:
> (2) Given the effort developers have put into writing the classes, it
> would be nice if bioruby could reach as wide an audience as possible.
> What I believe would help tremendously, is a standardized format for
> documentation. By this I mean that the following information is given
> for each method (sort of like in bioperl documentation):
>     * synopsis
>     * description
>     * function
>     * what it returns
>     * any arguments
>   

Hi Jan,

Thanks for taking the initiative on this important subject!  Coming up 
with a standard for documenting the major classes and modules would be a 
great idea; I've tried my best on the components that I've written so far.

I'm going to agree with Ryan that documenting every method is likely 
overkill.  One of the beauties of Ruby is that one- or two-line methods 
become commonplace.  When reading BioPerl code (where they do generally 
have every method formally documented) I often strip out the comments, 
since they dominate the code to such a degree as to be distracting, and 
the comments are often there just to meet the spec rather than to provide 
useful information.  If we were to require documentation of methods, I 
would say that it should only be required for public methods.


> (4) Enclose the copyright information between '#--' and '#++', as it
> distracts the user from what he/she wants to know. (It _is_ important,
> but not for the average user...)
>   

We're switching to the Ruby license, correct?  Do we even need anything 
beyond "License:: Ruby"?

Thanks again,
Trevor


From ktym at hgc.jp  Tue Mar  7 00:57:44 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 14:57:44 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <440C7A13.5070307@corevx.com>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
	<440C7A13.5070307@corevx.com>
Message-ID: <EA0FD850-546B-400A-8D77-957DB9DCEED5@hgc.jp>

Hi,

Thanks for all the discussion.

* format of RDoc and level of detail - still needs discussion

For readability in a terminal, please wrap docs within 79 columns
(example code in docs may break this rule).
Please use the "#"-prefixed style and don't use =begin rdoc/=end pairs,
as they make it impossible to read the code without syntax coloring.
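
As a rough sketch of that layout (ToyCounter is a hypothetical class made
up for this example, not BioRuby code): every doc line carries a "#"
prefix and stays under 79 columns, and the license note sits between
'#--' and '#++' so that RDoc leaves it out of the generated pages.

```ruby
#--
# Copyright and license details sit between '#--' and '#++', so RDoc
# leaves them out of the generated pages.
# License:: Ruby
#++
#
# = DESCRIPTION
# ToyCounter is a hypothetical class used only to show comment layout.
#
# = SYNOPSIS
#   counter = ToyCounter.new('atgc')
#   counter.count('a')
class ToyCounter
  # Stores +seq+ as a lower-case string.
  def initialize(seq)
    @seq = seq.to_s.downcase
  end

  # Returns how many times +base+ occurs in the sequence.
  def count(base)
    @seq.count(base.to_s.downcase)
  end
end
```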

I basically agree with having a standardized format, as Jan suggested.
It will make clear what should be documented at least,
especially for the non-native developers.

I also agree with Ryan's comparison:

  - standardized format can be repetitive
  - simple methods are simply documented, complex methods
    are documented in detail

The ideal is an adequate dose of documentation, which
also requires some writing skill.  I would be really happy
if some of you could take the lead in filling BioRuby with a good
level of documentation.

* license - please change to Ruby's

The core Japanese developers have agreed to change the license from LGPL
to Ruby's so that everyone who uses Ruby can use BioRuby (the rewriting of
headers is not yet complete in some modules, though).

We need to ask other contributors to follow this change, that is, to ask
their permission for us to change the license whenever the BioRuby staff
need to.

* where and how to include data (enzyme.yaml for REBASE)

This is under discussion with Trevor, but I think all discussions should
take place on this list so that they have an audience (to tell the truth,
reading and writing English mails takes time for me, so I want to share
them without having to post a summary as well :).

Toshiaki


From jan.aerts at bbsrc.ac.uk  Tue Mar  7 06:01:26 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 7 Mar 2006 11:01:26 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD4@rie2ksrv1.ri.bbsrc.ac.uk>

Good morning again.

If I understand correctly, the general feeling is that the class-level
docs should contain a few standardized parts (a description, an
example) and that method-level docs should be no more elaborate than
necessary. How about (given Toshiaki's comment of "[making]
clear what should be documented at least, especially for the non-native
developers") giving a description, an example, and the type of thing
returned for _non-trivial_ methods? (These may or may not be
put under separate headers; con: can look bloated, pro: speeds up
browsing if you want to know what a method returns.) I think
Bio::Sequence::Common#window_search is a nice example: it tells you what
it's meant to do, gives an example, and says what it returns.

So what do you think of the following:
* standardized parts for class-level docs: description, example, and if
necessary: relationship to other classes
* for complex methods: use Bio::Sequence::Common#window_search as an
example (with or without little title thingies)
* for simple methods: use simple methods of the Rails
ActionController::Base as an example: just a one-line description
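
A sketch of how the three levels in the list above could look side by
side. ToyWindows and each_window are hypothetical and written for this
illustration only; the method is loosely modelled on the idea of a window
search and is not the BioRuby implementation.

```ruby
# = DESCRIPTION
# ToyWindows is a hypothetical helper, not a BioRuby class. It wraps a
# plain String and walks over it in fixed-size windows.
#
# = EXAMPLE
#   w = ToyWindows.new('atgcatgc')
#   w.each_window(3) { |sub| puts sub }
#
# = RELATIONSHIP TO OTHER CLASSES
# None; it only depends on String.
class ToyWindows
  # Wraps +seq+.
  def initialize(seq)
    @seq = seq.to_s
  end

  # Yields every full window of +size+ residues, moving +step+ residues
  # at a time, and returns the tail that follows the last window.
  #
  #   w = ToyWindows.new('atgcatgc')
  #   w.each_window(3, 3) { |sub| puts sub }   # prints atg, then cat
  #
  # Returns:: a String (empty when the last window reaches the end)
  def each_window(size, step = 1)
    # No window fits, so the whole sequence is left over.
    return @seq.dup if size > @seq.length

    last_start = 0
    0.step(@seq.length - size, step) do |i|
      yield @seq[i, size]
      last_start = i
    end
    @seq[(last_start + size)..-1]
  end
end
```

The class comment follows the standardized parts, the one-line
initializer gets a one-line comment, and only the non-trivial method
carries a description, an example and a statement of what it returns.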

As rdoc takes care of listing the arguments to a method: is there a way to
let it show automatically whether an argument is mandatory or not?
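
Independent of what rdoc itself displays, Ruby can report which arguments
are mandatory through reflection. The sketch below assumes a Ruby version
that provides Method#parameters (1.9.2 or later; older versions only
offer Method#arity). ToySeq, its method signatures and
describe_arguments are hypothetical names used for illustration.

```ruby
# ToySeq and its method signatures are hypothetical, for illustration.
class ToySeq < String
  def subseq(first = 1, last = length)
    self[(first - 1)..(last - 1)]
  end

  def window_search(window_size, step_size = 1)
    # body omitted in this sketch
  end
end

# Method#parameters tags each argument with its kind, e.g. :req for a
# mandatory argument and :opt for one with a default. This sketch labels
# every kind other than :req as optional.
def describe_arguments(klass, name)
  klass.instance_method(name).parameters.map do |kind, arg|
    "#{arg} (#{kind == :req ? 'mandatory' : 'optional'})"
  end
end

puts describe_arguments(ToySeq, :subseq).join(', ')
puts describe_arguments(ToySeq, :window_search).join(', ')
```

A list like this could be pasted into a method comment as a starting
point, but it only says whether an argument is required, not what it
means, so a written description is still needed for tricky arguments.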

jan.




From ktym at hgc.jp  Tue Mar  7 08:38:42 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 22:38:42 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>

Resend: this message does not seem to have been delivered, according to
http://open-bio.org/pipermail/bioruby/2006-March/date.html

-k


Ryan,

Thank you for your very nice doc.

On 2006/03/07, at 0:14, Ryan Raaum wrote:
>    #   s = Bio::Sequence::Generic.new('atggaatga')

If you will utilize this documentation, please change
this example to use Bio::Sequence::NA.

Bio::Sequence::Generic is just for developers - to hold
gaps, spaces etc. intact - mainly for multiple alignment.


In that sense, the following mail you sent me personally
should be fixed accordingly:


Begin forwarded message:
> From: Ryan Raaum <rlr215 at nyu.edu>
> Date: 2006-03-07 0:47:32 JST
> To: Toshiaki Katayama <ktym at hgc.jp>
> Cc: jan aerts (RI) <jan.aerts at bbsrc.ac.uk>
> Subject: BioRuby Sequence Documentation Patch
>
> Hello,
>
> I have begun work on some documentation (as you may have seen from the messages on the mailing list just this morning).  Here is what I've done.  All methods and attributes in the Bio::Sequence hierarchy should be documented.  A summary for each file is yet to be written, but if this documentation is acceptable, I will write those after these are applied.  I made two small code changes as well:
>
> 1. Added a whitespace stripping initialize method to Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and Bio::Sequence::NA in that respect.


Bio::Sequence::Generic is not intended to strip whitespace.

In addition, why did you mark the Bio::Sequence#guess method as :nodoc:?
Sequence type guessing is not perfect, so there needs to be an interface
to change the threshold etc.

Most of the other parts seem acceptable - you really understood
the ideas behind the code!

On March 1, Jan also sent me a documented version of sequence.rb.
I'm sorry, I should have posted it to the list sooner.

Anyway, could you contact him to merge your documentation?
If you both agree, I'll commit your patch.


> 2. Modified the instance randomize method to start at length 0 IF a composition hash is given.  Otherwise, if there was an actual sequence AND a hash was given, odd things would happen.  (Of course, it was never meant to be called that way, but... as it CAN be called that way, I thought the behavior should be consistent.)


Thanks!


> I am happy to make modifications to this documentation,
>
> Best wishes,
>
> Ryan Raaum
>


From ktym at hgc.jp  Tue Mar  7 12:34:41 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Wed, 8 Mar 2006 02:34:41 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>
	<5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
Message-ID: <AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>

Ryan,

Thank you for your quick fix.
I quote your mail as it was not posted to the list.

> Also, in the last round of editing, I made a small change to the Bio::Sequence#guess function.

Thanks. This bug was introduced when I added length and index arguments...
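
For readers following the thread, here is a minimal plain-Ruby sketch of why the base length matters once length and index arguments exist. The helper name `guess_type`, the `:na`/`:aa` return values and the exact set of counted characters are assumptions made for illustration; this is not the code in sequence.rb.

```ruby
# Hypothetical stand-alone sketch of threshold-based sequence type guessing.
# Only the slice seq[index, length] is examined.
def guess_type(seq, threshold = 0.9, length = 10_000, index = 0)
  str = seq[index, length].to_s

  # Count characters that look like nucleotides (assumed set, RNA 'u' included).
  bases = str.count('ACGTUacgtu')

  # The ratio is taken against the examined slice, not the whole sequence.
  # Dividing by seq.length instead would dilute the ratio whenever the
  # slice is shorter than the sequence.
  total = str.length

  bases.to_f / total > threshold ? :na : :aa
end

# 20 nucleotide characters followed by 100 characters that are not counted.
mixed = ('acgu' * 5) + ('MKVLW' * 20)
puts guess_type(mixed, 0.9, 20, 0)    # examines only the first 20 characters
puts guess_type(mixed, 0.9, 20, 20)   # examines the next 20 characters
```

With the whole-sequence length as the denominator, the first call would see 20 of 120 characters as bases and fall below the threshold, even though the examined slice is entirely nucleotide.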

Toshiaki


On 2006/03/08, at 0:56, Ryan Raaum wrote:

> Hi All,
>
>
>>
>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>    #   s = Bio::Sequence::Generic.new('atggaatga')
>>
>> If you will utilize this documentation, please change
>> this example to use Bio::Sequence::NA.
>>
>
> Done. (For this and all other examples using Bio::Sequence::Generic)
>
>> Bio::Sequence::Generic is just for developers - to hold
>> gaps, spaces etc. intact - mainly for multiple alignment.
>>
>
> Made Bio::Sequence::Generic :nodoc:
>
>>>
>>> 1. Added a whitespace stripping initialize method to Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and Bio::Sequence::NA in that respect.
>>
>>
>> The Bio::Sequence::Generic is not intended to strip.
>
> Removed the added method.
>
>>
>> In addition, why did you mark the Bio::Sequence#guess method as :nodoc:?
>> Sequence type guessing is not perfect, so there needs to be an interface
>> to change the threshold etc.
>
> Documented the guess methods.
>
>> On March 1, Jan also sent me a documented version of sequence.rb.
>> I'm sorry, I should have posted it to the list sooner.
>>
>> Anyway, could you contact him to merge your documentation?
>> If you both agree, I'll commit your patch.
>
> If Jan will send me his sequence.rb, I can merge it with mine and send the merged file back to Jan.  After he's edited the merge to his liking, we can put it all together and send it in as a unified patch.
>
>
> Also, in the last round of editing, I made a small change to the Bio::Sequence#guess function.  In the line where the "total" is calculated, the original version used the length of the @seq as the starting length, but for the length and index parameters to work properly with the threshold value, the length of the guess string (`str` is the local method variable) is what should be the base length.
>
> Best,
>
> -Ryan
>


From rlr215 at nyu.edu  Wed Mar  8 10:04:18 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Wed, 8 Mar 2006 10:04:18 -0500
Subject: [BioRuby] Sequence Documentation Patch
In-Reply-To: <AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>
	<5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
	<AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>
Message-ID: <d26de0101ebe43ea27a1177b3d18b49a@nyu.edu>

Good Morning,

Jan and I were able to reconcile our respective documentation attempts 
into a single documentation patch.  Here's an example of the final 
format:

(documentation for the Bio::Sequence::Common#subseq method)

   # Returns a new sequence containing the subsequence identified by the
   # start and end numbers given as parameters.  *Important:* Biological
   # sequence numbering conventions (one-based) rather than ruby's
   # (zero-based) numbering conventions are used.
   #
   #   s = Bio::Sequence::NA.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string,
   # so subseq called without any parameters simply returns a new sequence
   # identical to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   # ---
   # *Arguments*:
   # * (optional) _s_(start): Integer (default 1)
   # * (optional) _e_(end): Integer (default current sequence length)
   # *Returns*:: new Bio::Sequence::NA/AA object

Hopefully this will be useful for new users.
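
For anyone who wants to try the numbering convention without BioRuby installed, here is a minimal plain-Ruby sketch. The stand-alone `subseq` helper below works on a String and is hypothetical; it is not the Bio::Sequence::Common implementation, and returning nil for out-of-range arguments is an assumption of this sketch.

```ruby
# Hypothetical stand-alone helper: one-based, inclusive subsequence.
# s defaults to 1 and e defaults to the full length, as in the documentation.
def subseq(seq, s = 1, e = seq.length)
  # Assumed policy for this sketch: reject positions outside 1..length.
  return nil if s < 1 || e < s || e > seq.length

  # Shift the one-based positions to Ruby's zero-based indices.
  seq[(s - 1)..(e - 1)]
end

s = 'atggaatga'
puts subseq(s, 1, 3)   #=> "atg"
puts subseq(s)         #=> "atggaatga"
```

The only real work is the shift by one on both ends; everything else is argument checking.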

Those changes from the first version of this patch that Toshiaki noted 
as being wrong or against the API were removed.

I also made two more bug fixes (in addition to those already described).

1. Added 'U' and 'u' to the bases counted towards the nucleic acid 
total in Bio::Sequence#guess.  (Without this, RNA sequences were 
"guessed" to be Amino Acid sequences).

2. Changed the arguments for method_missing in Bio::Sequence from 
(*arg) to (sym, *args, &block).  With this argument set, blocks will be 
properly passed through to the encapsulated object.
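
To illustrate the second fix, here is a small sketch of the delegation pattern with the (sym, *args, &block) signature. `SequenceWrapper` is a hypothetical class wrapping a plain String; it only shows the general technique and is not the Bio::Sequence code.

```ruby
# Hypothetical wrapper that forwards unknown methods to an encapsulated object.
class SequenceWrapper
  def initialize(obj)
    @obj = obj
  end

  # Capturing &block explicitly lets the block travel on to the wrapped object.
  # With a bare (*arg) signature the block is not part of the arguments and
  # would not be forwarded.
  def method_missing(sym, *args, &block)
    if @obj.respond_to?(sym)
      @obj.__send__(sym, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(sym, include_private = false)
    @obj.respond_to?(sym, include_private) || super
  end
end

w = SequenceWrapper.new('atgc')
w.each_char { |c| print c.upcase }   # the block reaches String#each_char
puts
puts w.length
```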

Cheers!

-Ryan



On Mar 7, 2006, at 12:34 PM, Toshiaki Katayama wrote:

> Ryan,
>
> Thank you for your quick fix.
> I quote your mail as it was not posted to the list.
>
>> Also, in the last round of editing, I made a small change to the 
>> Bio::Sequence#guess function.
>
> Thanks. This bug was introduced when I added length and index 
> arguments...
>
> Toshiaki
>
>
> On 2006/03/08, at 0:56, Ryan Raaum wrote:
>
>> Hi All,
>>
>>
>>>
>>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>>    #   s = Bio::Sequence::Generic.new('atggaatga')
>>>
>>> If you will utilize this documentation, please change
>>> this example to use Bio::Sequence::NA.
>>>
>>
>> Done. (For this and all other examples using Bio::Sequence::Generic)
>>
>>> Bio::Sequence::Generic is just for developers - to hold
>>> gaps, spaces etc. intact - mainly for multiple alignment.
>>>
>>
>> Made Bio::Sequence::Generic :nodoc:
>>
>>>>
>>>> 1. Added a whitespace stripping initialize method to 
>>>> Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA 
>>>> and Bio::Sequence::NA in that respect.
>>>
>>>
>>> The Bio::Sequence::Generic is not intended to strip.
>>
>> Removed the added method.
>>
>>>
>>> In addition, why did you mark the Bio::Sequence#guess method as
>>> :nodoc:?
>>> Sequence type guessing is not perfect, so there needs to be an interface
>>> to change the threshold etc.
>>
>> Documented the guess methods.
>>
>>> On March 1, Jan also sent me a documented version of sequence.rb.
>>> I'm sorry, I should have posted it to the list sooner.
>>>
>>> Anyway, could you contact him to merge your documentation?
>>> If you both agree, I'll commit your patch.
>>
>> If Jan will send me his sequence.rb, I can merge it with mine and 
>> send the merged file back to Jan.  After he's edited the merge to his 
>> liking, we can put it all together and send it in as a unified patch.
>>
>>
>> Also, in the last round of editing, I made a small change to the 
>> Bio::Sequence#guess function.  In the line where the "total" is 
>> calculated, the original version used the length of the @seq as the 
>> starting length, but for the length and index parameters to work 
>> properly with the threshold value, the length of the guess string 
>> (`str` is the local method variable) is what should be the base 
>> length.
>>
>> Best,
>>
>> -Ryan
>>
>
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From jan.aerts at bbsrc.ac.uk  Tue Mar 21 07:38:30 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 21 Mar 2006 12:38:30 -0000
Subject: [BioRuby] fastacmd.rb: iteration
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>

Hi,

Could someone please have a look at the each_entry method of
io/fastacmd.rb (in cvs)? The code below gives the sequences of
'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
output. Any ideas?

  fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
  seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
  seqs.each do |seq|
    puts seq                        # => works fine
  end

  fastacmd.each_entry do |fasta|
    puts 'hi'                       # => it never seems to get here...
  end

Thanks,
Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+44 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only.   The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute. 


From ngoto at gen-info.osaka-u.ac.jp  Wed Mar 22 05:29:27 2006
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Wed, 22 Mar 2006 19:29:27 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>

Hi jan,

I found a bug in Bio::FlatFile. Because io/fastacmd.rb
internally uses FlatFile, the bug may be related to the problem.

The bug is that IO#pos raises an error when the IO object isn't
a regular file (e.g. a pipe), but FlatFile always tried to get pos.
It is fixed in CVS now.
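
A minimal sketch of the failure mode and one possible guard, for illustration only. The helper name `safe_pos` is hypothetical and this is not the change committed to flatfile.rb; it simply treats a stream that cannot report a position as having none.

```ruby
require 'tempfile'

# Hypothetical helper: return the stream position, or nil when the
# stream cannot report one (e.g. a pipe, or ARGF with no current stream).
def safe_pos(io)
  io.pos
rescue SystemCallError, ArgumentError
  nil
end

# A pipe is not seekable, so asking for its position raises an error.
reader, writer = IO.pipe
writer.write(">b0002\natgcgagtgtt\n")
writer.close
p safe_pos(reader)
reader.close

# A regular file reports its position normally.
Tempfile.create('test_fa') do |f|
  f.write(">b0002\n")
  p safe_pos(f)
end
```

Rescuing ArgumentError as well as SystemCallError is a choice of this sketch, made so that a "no stream to tell" error from ARGF is handled the same way as a pipe.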

On Tue, 21 Mar 2006 12:38:30 -0000
"jan aerts \(RI\)" <jan.aerts at bbsrc.ac.uk> wrote:

> Hi,
> 
> Could someone please have a look at the each_entry method of
> io/fastacmd.rb (in cvs)? The code below gives the sequences of
> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
> output. Any ideas?
> 
>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>   seqs.each do |seq|
>     puts seq                        # => works fine
>   end
> 
>   fastacmd.each_entry do |fasta|
>     puts 'hi'                       # => it never seems to get here...
>   end
> 
> Thanks,
> Jan Aerts, PhD
> Bioinformatics Group
> Roslin Institute
> Roslin, Scotland, UK
> +44 131 527 4200
> 

-- 
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan

From k at bioruby.org  Sat Mar 25 03:01:05 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sat, 25 Mar 2006 17:01:05 +0900
Subject: [BioRuby] Important news for bioruby developers
In-Reply-To: <FB2D7CB3-51AB-4AB7-B2F9-B807AF637674@sonsorol.org>
References: <FB2D7CB3-51AB-4AB7-B2F9-B807AF637674@sonsorol.org>
Message-ID: <0EA907FB-27BB-4A78-B173-D7F4F0AB4A85@bioruby.org>

Hi Chris,

Thank you for taking care of the server migration.

On 2006/03/22, at 1:32, Chris Dagdigian wrote:

> Hello,
>
> Sorry for the interruption but I've got some important site and server news. People will also see multiple copies of this note as I slowly transition sites over.
>
> We are in the midst of moving all of our websites, mailing lists, developers and sourcecode repositories onto more modern hardware located in a 2nd Boston area datacenter facility.
>
> This may not be a big deal for bioruby since your website, wiki and news site are not hosted by Open Bio.  Keep reading though as there are some questions/favors  I need to ask of the Ruby developers down below ...
>
> The transition is important for a couple of reasons - the most urgent being that we are going to lose internet connectivity in our current hosting facility on March 27th 2006.  That datacenter belongs to Wyeth Research in Cambridge, Massachusetts.  Wyeth Research & Genetics Institute have been long time significant supporters & hosting providers for OBF servers and projects -- we owe them a great deal of gratitude and public acknowledgment for hosting our servers over many years. Speaking as a hardware geek I can tell you that the many years of high-bandwidth, trouble free hosting have been invaluable for our efforts and projects.   Sadly, it is no longer possible for them to host our servers as they need to begin making some network and WAN circuit changes that will no longer support direct internet facing servers (such as ours) in Cambridge.
>
> The other major reason for the transition is our need to relocate onto hardware that can better be remotely managed (as our volunteer administrators are scattered all over the globe).
>
> My employer, BioTeam Inc. has donated new server hardware and is also providing the hosting facilities in a Tier 1 Boston area colocation facility. Infrastructure geeks can see pictures of the colocation  cage and the new OBF servers online at this URL:
> http://bioteam.net/gallery/bioteamBDC  -- those servers also host EMBOSS FTP/CVS and mailing lists.
>
> Current status of the migration:
>
>  - All 57 mailing lists have been moved over to the new hardware (you may have noticed "lists.open-bio.org" showing up in your list messages)
>
>  - The new anonymous sourcecode server is running at http://code.open-bio.org. "cvs.biodas.org" is already pointing at it.
>
>  - Developers with CVS accounts have *NOT* been migrated yet
>
> Basically we are trying to relocate everything but the developers over the next few days so we can spend the weekend on the developer and CVS transition.
>
>
> Attention BioRuby Developers
> -----------------------------------------
>
> I need assistance with the following:
>
> (1) Please confirm to me or support at open-bio.org that you have NO websites running on Open Bio servers. It appears you host your own wiki/news/web sites


Yes, we have NO websites on Open Bio servers for now.


> (2) Please change your website front page to reflect the new URLs for your mailing lists:
>
>    http://lists.open-bio.org/mailman/listinfo/bioruby
>    http://lists.open-bio.org/mailman/listinfo/bioruby-cvs
>    http://lists.open-bio.org/mailman/listinfo/bioruby-ja


Done.


> (3) Please CNAME alias or web forward "cvs.bioruby.org"  to code.open-bio.org to use t


Does this mean "cvs.open-bio.org" is no longer available or not synced with "code.open-bio.org"?
Currently, we forward "http://cvs.bioruby.org" to "http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/?cvsroot=bioruby"


> (4) However you do your mail forwarding, please make sure that mail for bioruby.org mailing lists gets redirected to "lists.open-bio.org"


Done.


> For people with CVS commit/write access
> ---------------------------------------------------------
> Also note that when we finally do transition over to the new developer machine (where the real sourcecode lives), ALL developers will need to email support at open-bio.org to request a password reset. Although we can transition usernames, settings and home directories over from the old to the new machine we can not transition over existing passwords as they are stored in incompatible hashed formats. All developers are going to need new passwords for the new developer machine.  We will likely make the developer machine swap this weekend.
>
>
> Reporting Problems / Help & Assistance
> ------------------------------------------------------
> The transition will be complicated, we need your help to spot problems and glitches! The OBF has a new helpdesk ticketing system set up at "support at open-bio.org" so that all OBF admins can read and respond to issues and problems. Most troubles should be reported to that address. For urgent problems, especially during this transition period,  feel free to contact me directly (dag at sonsorol.org) (ichat/aol/aim screen name:  bioteamdag).
>
>
> Regards,
> Chris Dagdigian
> open-bio.org
>
>


From k at bioruby.org  Sat Mar 25 19:46:18 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 09:46:18 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
	<200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
Message-ID: <D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>

Goto-san,

# post testing for new open-bio.org server :)

I've suggested releasing 1.0.1 (or creating a stable branch?).
What do you think?

This bug causes Bio::FlatFile with ARGF to fail at the last iteration,
and it may be a fairly serious problem for many users.

By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?

Toshiaki

--------------------------------------------------
% cat test_ff.rb
require 'bio'

ff = Bio::FlatFile.new(Bio::FastaFormat, ARGF)
ff.each do |e|
  p e.definition
end
% cat test.fa
>b0002
atgcgagtgtt
>b0003
atggttaaagt
>b0004
atgaaactcta
% ruby test_ff.rb test.fa
"b0002"
"b0003"
"b0004"
/usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos': no stream to tell (ArgumentError)
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:342:in `get_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:573:in `next_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:609:in `each'
        from test_ff.rb:4
--------------------------------------------------

On 2006/03/22, at 19:29, GOTO Naohisa wrote:

> Hi jan,
>
> I found a bug in the Bio::FlatFile. Because io/fastacmd.rb
> internally uses FlatFile, the bug may be related to the problem.
>
> The bug is that IO#pos raises error when the IO object isn't
> a regular file (e.g. pipe) but FlatFile always tried to get pos.
> It is fixed in the CVS now.
>
> On Tue, 21 Mar 2006 12:38:30 -0000
> "jan aerts \(RI\)" <jan.aerts at bbsrc.ac.uk> wrote:
>
>> Hi,
>>
>> Could someone please have a look at the each_entry method of
>> io/fastacmd.rb (in cvs)? The code below gives the sequences of
>> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
>> output. Any ideas?
>>
>>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>>   seqs.each do |seq|
>>     puts seq                        # => works fine
>>   end
>>
>>   fastacmd.each_entry do |fasta|
>>     puts 'hi'                       # => it never seems to get here...
>>   end
>>
>> Thanks,
>> Jan Aerts, PhD
>> Bioinformatics Group
>> Roslin Institute
>> Roslin, Scotland, UK
>> +44 131 527 4200
>>
>
> -- 
> Naohisa GOTO
> ngoto at gen-info.osaka-u.ac.jp
> Department of Genome Informatics, Genome Information Research Center,
> Research Institute for Microbial Diseases, Osaka University, Japan
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From ktym at hgc.jp  Sat Mar 25 20:28:14 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 10:28:14 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio command.rb,
	1.3, 1.4)
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
Message-ID: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>

Goto-san,

> +   # Same as OpenURI.open_uri(*arg).
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.

Your code seems to contain a lot of hacks: finding open-uri.rb in Ruby's load path,
searching for a particular method in it, etc.

I don't understand what the complicated part of your Sandbox module actually does
(or intends), but if your purpose is just to avoid redefining Kernel.open, wouldn't
something like

> require 'open-uri'
>
> module Kernel
>   private
>   alias open_uri open
>   alias open open_uri_original_open
> end

be enough?


Regards,
Toshiaki Katayama


-----test_open_uri.rb
#!/usr/bin/env ruby

require 'open-uri'

module Kernel
  private
  alias open_uri open
  alias open open_uri_original_open
end

url = "http://bioruby.org"

p "########## open_uri"
open_uri(url) do |f|
  puts f.read
end

p "########## open"
open(url) do |f|
  puts f.read
end


Begin forwarded message:

> From: Naohisa Goto <ngoto at pub.open-bio.org>
> Date: 2006-03-20 19:34:59 JST
> To: bioruby-cvs at portal.open-bio.org
> Subject: [BioRuby-cvs] bioruby/lib/bio command.rb,1.3,1.4
>
> Update of /home/repository/bioruby/bioruby/lib/bio
> In directory pub.open-bio.org:/tmp/cvs-serv30042/lib/bio
>
> Modified Files:
> 	command.rb 
> Log Message:
> * New module Bio::Command::NetTools for miscellaneous network methods.
>   Currently, this module is intended to be used only inside
>   BioRuby library. Please do not use it in user's programs now.
> * New methods: Bio::Command::NetTools.open_uri(uri, *arg) and
>   Bio::Command::NetTools.read_uri(uri).
> * Changed license to Ruby's.
>
>
> Index: command.rb
> ===================================================================
> RCS file: /home/repository/bioruby/bioruby/lib/bio/command.rb,v
> retrieving revision 1.3
> retrieving revision 1.4
> diff -C2 -d -r1.3 -r1.4
> *** command.rb	4 Nov 2005 17:36:00 -0000	1.3
> --- command.rb	20 Mar 2006 10:34:57 -0000	1.4
> ***************
> *** 2,32 ****
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright::	Copyright (C) 2003-2005
>   # 		Naohisa Goto <ng at bioruby.org>,
>   #		Toshiaki Katayama <k at bioruby.org>
> ! # License::	LGPL
>   #
>   #  $Id$
>   #
> - #--
> - #
> - #  This library is free software; you can redistribute it and/or
> - #  modify it under the terms of the GNU Lesser General Public
> - #  License as published by the Free Software Foundation; either
> - #  version 2 of the License, or (at your option) any later version.
> - #
> - #  This library is distributed in the hope that it will be useful,
> - #  but WITHOUT ANY WARRANTY; without even the implied warranty of
> - #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> - #  Lesser General Public License for more details.
> - #
> - #  You should have received a copy of the GNU Lesser General Public
> - #  License along with this library; if not, write to the Free Software
> - #  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307  USA
> - #
> - #++
> - #
>
>   require 'open3'
>
>   module Bio
> --- 2,15 ----
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright::	Copyright (C) 2003-2006
>   # 		Naohisa Goto <ng at bioruby.org>,
>   #		Toshiaki Katayama <k at bioruby.org>
> ! # License::	Ruby's
>   #
>   #  $Id$
>   #
>
>   require 'open3'
> + require 'uri'
>
>   module Bio
> ***************
> *** 162,165 ****
> --- 145,291 ----
>
>   end # module Tools
> + 
> + 
> + # = Bio::Command::NetTools
> + #
> + # Bio::Command::NetTools is a collection of miscellaneous methods
> + # for data transport through network.
> + #
> + # Library internal use only. Users should not directly use it.
> + #
> + # Note that it is under construction.
> + module NetTools
> + 
> +   # Same as OpenURI.open_uri(*arg).
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.
> +   def self.open_uri(uri, *arg)
> +     if defined? ::OpenURI
> +       ::OpenURI.open_uri(uri, *arg)
> +     else
> +       SandBox.load_openuri_in_sandbox
> +       uri = uri.to_s if ::URI::Generic === uri
> +       SandBox::OpenURI.open_uri(uri, *arg)
> +     end
> +   end
> + 
> +   # Same as OpenURI.open_uri(uri).read.
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.
> +   def self.read_uri(uri)
> +     self.open_uri(uri).read
> +   end
> + 
> +   # Sandbox to load open-uri.rb.
> +   # Internal use only.
> +   module SandBox #:nodoc:
> + 
> +     # Dummy module definition.
> +     module Kernel #:nodoc:
> +       # dummy method
> +       def open(*arg); end #:nodoc:
> +     end #module Kernel
> +     
> +     # a method to find proxy. dummy definition
> +     module FindProxy; end #:nodoc:
> +     
> +     # dummy module definition
> +     module OpenURI #:nodoc:
> +       module OpenRead; end #:nodoc:
> +     end #module OpenURI
> +     
> +     # Dummy module definition.
> +     module URI #:nodoc:
> +       class Generic < ::URI::Generic #:nodoc:
> +         include SandBox::FindProxy
> +       end
> +       
> +       class HTTPS < ::URI::HTTPS #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       class HTTP  < ::URI::HTTP  #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       class FTP  < ::URI::FTP    #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       # parse and new. internal use only.
> +       def self.__parse_and_new__(klass, uri) #:nodoc:
> +         scheme, userinfo, host, port,
> +         registry, path, opaque, query, fragment = ::URI.split(uri)
> +         klass.new(scheme, userinfo, host, port,
> +                   registry, path, opaque, query,
> +                   fragment)
> +       end
> +       private_class_method :__parse_and_new__
> +       
> +       # same as ::URI.parse. internal use only.
> +       def self.parse(uri) #:nodoc:
> +         r = ::URI.parse(uri)
> +         case r
> +         when ::URI::HTTPS
> +           __parse_and_new__(HTTPS, uri)
> +         when ::URI::HTTP
> +           __parse_and_new__(HTTP, uri)
> +         when ::URI::FTP
> +           __parse_and_new__(FTP, uri)
> +         else
> +           r
> +         end
> +       end
> +     end #module URI
> +     
> +     @load_openuri = nil
> +     # load open-uri.rb in SandBox module.
> +     def self.load_openuri_in_sandbox #:nodoc:
> +       return if @load_openuri
> +       fn = nil
> +       unless $:.find do |x|
> +           fn = File.join(x, 'open-uri.rb')
> +           FileTest.exist?(fn)
> +         end then
> +         warn('Warning: cannot find open-uri.rb in $LOAD_PATH')
> +       else
> +         # reading open-uri.rb
> +         str = File.read(fn)
> +         # eval open-uri.rb contents in SandBox module
> +         module_eval(str)
> +         
> +         # finds 'find_proxy' method
> +         find_proxy_lines = nil
> +         flag = nil
> +         endstr = nil
> +         str.each do |line|
> +           if flag then
> +             find_proxy_lines << line
> +             if endstr == line[0, endstr.length] and
> +                 /^\s+end(\s+.*)?$/ =~ line then
> +               break
> +             end
> +           elsif /^(\s+)def\s+find_proxy(\s+.*)?$/ =~ line then
> +             flag = true
> +             endstr = "#{$1}end"
> +             find_proxy_lines = line 
> +           end
> +         end
> +         if find_proxy_lines
> +           module_eval("module FindProxy;\n#{find_proxy_lines}\n;end\n")
> +         else
> +           warn('Warning: cannot find find_proxy method in open-uri.rb.')
> +         end
> +         @load_openuri = true
> +       end
> +     end
> +   end #module SandBox
> + end #module NetTools
> + 
>   end # module Command
>   end # module Bio
>
> _______________________________________________
> bioruby-cvs mailing list
> bioruby-cvs at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby-cvs


From ngoto at gen-info.osaka-u.ac.jp  Sun Mar 26 01:13:28 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:13:28 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
References: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
	<D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
Message-ID: <20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>

Hi,

> I've suggested to release 1.0.1 (or create a stable branch?)
> How do you think?

I agree.

> By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
> Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?

I followed the specifications of Ruby's File, IO and Dir classes.
File.open, IO.open, and Dir.open can accept a block but
File.new, IO.new, and Dir.new don't.
Because Ruby's experts decided on these specifications,
I suppose there is some merit in not accepting a block,
or some problem with accepting one,
but I don't know much about it.

[ruby-list:24986] said that in Ruby 1.6.0, IO.new and Dir.new
were changed not to take a block, but I can't find the reason.
( http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/24986 )
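
For illustration, here is a minimal sketch of the File convention
mentioned above. It uses only Ruby's standard library; the temp-file
prefix and the variable names are arbitrary choices for this example,
and nothing here is taken from Bio::FlatFile itself.

```ruby
require 'tempfile'

# Create a scratch file to read from (the name prefix is arbitrary).
tmp = Tempfile.new('flatfile_demo')
tmp.write("ATGC\n")
tmp.close

# File.open with a block: the file is yielded, closed automatically
# when the block ends, and the block's value is returned.
leaked = nil
content = File.open(tmp.path) do |f|
  leaked = f
  f.read
end
closed_after_block = leaked.closed?

# File.new: no block handling; the caller is responsible for closing.
f = File.new(tmp.path)
open_after_new = !f.closed?
f.close

tmp.unlink
```

So the open/new split is mostly about who owns the cleanup: with a
block, open does it; with new, the caller does.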

-- 
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp


From ngoto at gen-info.osaka-u.ac.jp  Sun Mar 26 01:56:07 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:56:07 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio
	command.rb, 1.3, 1.4)
In-Reply-To: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
	<29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
Message-ID: <20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>

> > +   # Same as OpenURI.open_uri(*arg).
> > +   # If open-uri.rb is already loaded, ::OpenURI is used.
> > +   # Otherwise, internal OpenURI in sandbox is used because
> > +   # open-uri.rb redefines Kernel.open.
> 
> Your code seems to contain a lot of hacks, finding open-uri.rb from Ruby's load path,
> searching a particular method from it etc...

Yes, it's very complicated. It would be easier to copy and paste part of
open-uri.rb, but this may cause a copyright problem.

The easiest way is to write "Please be careful: BioRuby now
requires open-uri, and the behavior of open() is changed."
in the BioRuby documentation and use OpenURI.open_uri.

In very rare cases, require "open-uri" changes behavior.
For example,

% mkdir -p http://www.google.com
% echo "hello world" > http://www.google.com/index.html
% ls -R
.:
http:/

./http::
www.google.com/

./http:/www.google.com:
index.html
% irb
irb(main):001:0> open("http://www.google.com/index.html"){|f| f.read }
=> "hello world\n"
irb(main):002:0> require "open-uri"
=> true
irb(main):003:0> open("http://www.google.com/index.html"){|f| f.read }
=> "<html><head><meta http-equiv=\"content-type\" content=\"text/html; charset=S
hift_JIS\"><title>Google</title><style><!--\nbody,td,a,p,.h{font-family:;}\n.h{f
ont-size: 20px;}\n.q{color:#0000cc;}\n//-->\n</style>\n<script>\n<!--\nfunction 
(snip)

However, I think this is a very rare case. In addition, to open a local
file, using File.open is recommended, because Kernel#open treats
a string starting with "|" as a shell command, such as "|rm -rf *".
(For the same reason, using OpenURI.open_uri is recommended
to open a URI.)
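
A minimal sketch of why File.open is the safer choice for local files,
assuming a POSIX file system (the ":" in the directory name would not
work on Windows). The URL and the "|echo injected" string are made up
for this example.

```ruby
require 'tmpdir'
require 'fileutils'

url_like = 'http://www.example.com/index.html'
local_content = nil
pipe_error = nil

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    # Recreate the directory layout from the shell session above.
    FileUtils.mkdir_p(File.dirname(url_like))
    File.write(url_like, "hello world\n")

    # File.open never goes to the network: this reads the local file,
    # whether or not open-uri has been required.
    local_content = File.open(url_like) { |f| f.read }

    # File.open never starts a subprocess either: a leading "|" is just
    # part of a (nonexistent) file name.
    begin
      File.open('|echo injected') { |f| f.read }
    rescue Errno::ENOENT => e
      pipe_error = e.class
    end
  end
end
```

With File.open the argument is always a path, so neither a URL-like
name nor a "|" prefix changes what the call does.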

> I don't understand what the complicated part of your Sandbox module actually does
> (or intends), but if your purpose is just to avoid redefining Kernel.open, to put
> something like
> 
> > require 'open-uri'
> >
> > module Kernel
> >   private
> >   alias open_uri open
> >   alias open open_uri_original_open
> > end
> 
> isn't enough?

No. The above code kills open-uri's extension to Kernel#open,
and it breaks things for users who want to use the extended Kernel#open.
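
The effect can be sketched without open-uri at all. Opener below is a
hypothetical stand-in for Kernel, and original_open plays the role of
open_uri_original_open; none of these names are real library code.

```ruby
# Hypothetical stand-in for Kernel#open.
class Opener
  def open(name)
    "local:#{name}"
  end
end

# What an extension library does: keep the original under another
# name, then redefine the method with extra behaviour.
class Opener
  alias original_open open
  def open(name)
    name.start_with?('http://') ? "remote:#{name}" : original_open(name)
  end
end

o = Opener.new
before = o.open('http://example.org/')   # extended behaviour

# The proposed "undo": expose the extension under a new name and
# put the original back under the old name.
class Opener
  alias open_uri open
  alias open original_open
end

after = o.open('http://example.org/')       # extension is gone for everyone
via_new_name = o.open_uri('http://example.org/')
```

The swap is global, so any other code in the same process that relied
on the extended open silently gets the plain one back.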

--
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan


From k at bioruby.org  Mon Mar 27 02:17:19 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Mon, 27 Mar 2006 16:17:19 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>
References: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
	<D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
	<20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>
Message-ID: <538B4544-9F77-4D2F-8A80-A64EB3728439@bioruby.org>

On 2006/03/26, at 15:13, Naohisa Goto wrote:
>> I've suggested to release 1.0.1 (or create a stable branch?)
>> How do you think?
>
> I agree.

OK, let's prepare for the next release within a week (hopefully).
We need to decide whether to create a branch or archive the current HEAD.
Are there any developers whose code in CVS HEAD is not ready for release?

>> By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
>> Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?
>
> I referred specifications of Ruby's File, IO and Dir classes.
> File.open, IO.open, and Dir.open can accept a block but
> File.new, IO.new, and Dir.new don't.

I understand. Thank you!

Toshiaki


From ktym at hgc.jp  Mon Mar 27 02:09:11 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Mon, 27 Mar 2006 16:09:11 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio
	command.rb, 1.3, 1.4)
In-Reply-To: <20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
	<29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
	<20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>
Message-ID: <79459264-E988-45F4-8809-53B52044D81D@hgc.jp>

Hmm, so my understanding is

* open-uri.rb sucks - it can't be required w/o overriding Kernel#open
* but Kernel#open also sucks - use File.open instead

Thus, by KISS principle, I want to take the easiest way :)

* require 'open-uri' (in lib/bio.rb?) and add documentation about that.
* always use OpenURI.open_uri instead of Kernel#open or Net::HTTP#get

This is mainly for easier setup of the HTTP proxy (than 'net/http').
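
A rough sketch of what "always use OpenURI.open_uri" could look like.
fetch_url is a hypothetical helper, not a BioRuby method, and the URLs
are placeholders. It assumes open-uri's :proxy option; without it,
open-uri falls back to the http_proxy environment variable.

```ruby
require 'open-uri'

# Hypothetical helper (not part of BioRuby): fetch a URL via
# OpenURI.open_uri, optionally through an explicit HTTP proxy.
def fetch_url(url, proxy = nil)
  options = {}
  options[:proxy] = proxy if proxy   # :proxy is an open-uri option
  OpenURI.open_uri(url, options) { |io| io.read }
end

# Usage (needs network access, so it is left commented out):
#   html = fetch_url('http://www.example.org/')
#   html = fetch_url('http://www.example.org/',
#                    'http://proxy.example.org:8080/')
```

Calling OpenURI.open_uri directly means the helper does not depend on
whether Kernel#open has been extended.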

Toshiaki




From k at bioruby.org  Tue Mar 28 20:37:09 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Wed, 29 Mar 2006 10:37:09 +0900
Subject: [BioRuby] Fwd: [Open-bio-l] Announcing BOSC 2006
References: <44294B65.4050207@duke.edu>
Message-ID: <DEF295C3-929B-4CF4-ADF1-E3ED07C60E77@bioruby.org>

I'll forward the announcement of BOSC 2006 -
as bioruby at lists.open-bio.org rejects postings from non-member.

Regards,
Toshiaki

Begin forwarded message:

> From: Darin London <darin.london at duke.edu>
> Date: 2006-03-28 23:42:45 JST
> To: Authors at lists.open-bio.org, BioBiz at lists.open-bio.org, Biocorba-announce-l at lists.open-bio.org, Biocorba-l at lists.open-bio.org, Biograph at lists.open-bio.org, bioinfo-core at lists.open-bio.org, biojava-dev at lists.open-bio.org, Biojava-l at lists.open-bio.org, bioped-l at lists.open-bio.org, Bioperl-announce-l at lists.open-bio.org, Bioperl-l at lists.open-bio.org, bioperl-microarray at lists.open-bio.org, bioperl-pipeline at lists.open-bio.org, BioPython at lists.open-bio.org, BioPython-announce at lists.open-bio.org, Biopython-dev at lists.open-bio.org, BioRuby at lists.open-bio.org, BioRuby-ja at lists.open-bio.org, Biosoap-l at lists.open-bio.org, BioSQL-l at lists.open-bio.org, BP-announce at lists.open-bio.org, DAS at lists.open-bio.org, DAS-announce at lists.open-bio.org, DAS2 at lists.open-bio.org, Dynamite at lists.open-bio.org, EMBOSS at lists.open-bio.org, emboss-announce at lists.open-bio.org, emboss-dev at lists.open-bio.org, Moby-announce at lists.open-bio.org, MOBY-dev at lists.open-bio.org, moby-l at lists.open-bio.org, obf-developers at lists.open-bio.org, Ontologies at lists.open-bio.org, Open-bio-announce at lists.open-bio.org, Open-Bio-l at lists.open-bio.org, Open-Bioinformatics-Foundation at lists.open-bio.org
> Subject: [Open-bio-l] Announcing BOSC 2006
> Reply-To: bosc at open-bio.org
>
> MEETING ANNOUNCEMENT & CALL FOR SPEAKERS
>
> The 7th annual Bioinformatics Open Source Conference (BOSC 2006) is
> organized by the not-for-profit Open Bioinformatics Foundation. The
> meeting will take place August 4-5 in Fortaleza, Brazil, and is one of
> several Special Interest Group (SIG) meetings occurring in conjunction
> with the 14th International Conference on Intelligent Systems for
> Molecular Biology.  Please consult the official BOSC 2006 website at
>
> http://www.open-bio.org/wiki/BOSC_2006
>
> for details and information. 
>
> In addition, a BOSC weblog has been set up to make it easier to
> disseminate all BOSC-related announcements:
>
> http://wiki.open-bio.org/boscblog/
>
> And if you have an iCal-compatible calendar, there is an EventDB calendar
> set up with all BOSC-related deadlines.
>
> http://eventful.com/groups/G0-001-000014747-0
>
> More information about ISMB can be found at the Official
> ISMB 2006 Website:
>
> http://ismb2006.cbi.cnptia.embrapa.br/
>
>
> Thank You, and we look forward to seeing you all,
> The BOSC Organizing Committee.
> _______________________________________________
> Open-Bio-l mailing list
> Open-Bio-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/open-bio-l


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 14:21:54 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:21:54 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>

Hi all,

Given the posts about bioruby documentation in the last few months, my
own experiences with bioruby and a bit of encouragement from Toshiaki,
I'd like to commence documenting bioruby classes (in CVS) that are not
documented yet, and to standardize the documentation format for those
that already have documentation.

Documentation would take the form of rdoc, so that it would be browsable
via the www.bioruby.org/rdoc website.

Some guidelines that I would like to use in the documentation:
(1) Each class should have a description and synopsis. If there is a
unit test at the bottom, this can easily be tweaked into a synopsis. If
such a unit test is available, 'documenting' would mean (at least in
the first round) 'tweaking the unit test and copying it into a comment in
front of the class'. Alternatively, unit tests and documentation could
be combined into one (as Ara and Pjotr discussed), but I'm not
experienced enough in ruby yet to do this in a simple, transparent way.
(2) Given the effort developers have put into writing the classes, it
would be nice if bioruby could reach as wide an audience as possible.
What I believe would help tremendously, is a standardized format for
documentation. By this I mean that the following information is given
for each method (sort of like in bioperl documentation):
    * synopsis
    * description
    * function
    * what it returns
    * any arguments
(3) It should be made clear to the user if a class should be used
directly, or if it just supports other classes (e.g.
Bio::Sequence::Format). Additional important info would be interaction
with other classes (e.g. "how does the sequence class interact with the
embl class?"). Original module writers have an important role in
describing this context.
(4) Enclose the copyright information between '#--' and '#++', as it
distracts the user from what he/she wants to know. (It _is_ important,
but not for the average user...)
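
Put together, guidelines (1), (2) and (4) might look roughly like the
sketch below. DemoSequence is a hypothetical class invented for this
illustration, the copyright line is a placeholder, and the exact
section labels are only one possible choice.

```ruby
#--
# Copyright (C) 2006 Some Author   (placeholder; rdoc skips this part)
#++
#
# = DESCRIPTION
# DemoSequence is a hypothetical class used only to show the layout.
#
# = SYNOPSIS
#   seq = DemoSequence.new('ATGC')
#   puts seq.complement
class DemoSequence
  # Function::   Creates a sequence from a string of nucleotides
  # Returns::    a DemoSequence object
  # Arguments::  str, a String of a/t/g/c characters (any case)
  def initialize(str)
    @seq = str.downcase
  end

  # Function::   Gives the complementary strand (not reversed)
  # Returns::    a String
  # Arguments::  none
  def complement
    @seq.tr('atgc', 'tacg')
  end
end
```

rdoc stops processing comments at '#--' and resumes at '#++', so the
copyright notice stays in the source but out of the generated pages.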


Example of class documentation (from sequence.rb):
# = DESCRIPTION
# The Bio::Sequence class generically describes a nucleic or amino acid
# sequence and is a superclass of Bio::Sequence::NA and Bio::Sequence::AA.
# Most methods that can be used on Bio::Sequence objects are described
# in Bio::Sequence::Common, Bio::Sequence::NA and Bio::Sequence::AA
#
# If possible, create sequence objects using the Bio::Sequence::NA or
# Bio::Sequence::AA classes instead, as the Bio::Sequence class will
# have to guess the type of sequence you're talking about.
# 
# = SYNOPSIS
#   # Create a nucleic or amino acid sequence
#   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
#   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
#   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
# 
#   # Print it out
#   puts dna.to_s
#   puts aa.to_s
# 
#   # Get a subsequence, bioinformatics style (first nucleotide is '1')
#   puts dna.subseq(2,6)
# 
#   #...more examples from the unit test

Example of method documentation (from sequence.rb):
  # Usage:
  #    my_seq = Bio::Sequence.new('AGGCACGAT')
  #    my_na = my_seq.na
  # Function::   Converts the Bio::Sequence object into a Bio::Sequence::NA object
  # Returns::    a Bio::Sequence::NA object
  # Arguments::  none
  def na
    @seq = NA.new(@seq)
    @moltype = NA
  end

As my time for this is limited, expect to see gradual
additions to the cvs repository. Anyone else wishing to help out
is very welcome!!

Of course, I promise not to touch other people's code, unless they
explicitly tell me to.

Any thoughts/suggestions on this?

Kind regards,

Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+33 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only.   The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute. 


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 14:55:08 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:55:08 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Ryan,

(First of all: I think you sent this message to me alone, instead of the
bioruby mailing list....)

Glad to get the documentation discussion started again... The "as a way
of thoroughly understanding the use and structure of the classes" sounds
familiar...

What do you think of using a standardized or (sounds ugly:) formal
format? Does your documentation include some of the
synopsis/description/function/what it returns/arguments things? Do you
think it is useful/feasible to put them in that format?

Thanks,
jan.

> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu] 
> Sent: 06 March 2006 14:41
> To: jan aerts (RI)
> Subject: Re: [BioRuby] bioruby documentation
> 
> Good Morning All,
> 
> I've had similar thoughts to Jan, and am a couple methods away 
> from completely documenting Bio::Sequence::* .  I was hoping 
> to send that in to Toshiaki later today.  I haven't yet 
> written a synopsis or description for them, mainly because I 
> was using the process of documenting all the methods as a way 
> of thoroughly understanding the use and structure of the 
> classes.  If the documentation I've currently written is seen 
> as reasonable and accepted, I would then add the overview 
> documentation for those classes and files.
> 
> Is there somewhere we can note which parts different people 
> are working on documenting, so as to avoid any duplication of effort?
> 
> Best!
> 
> -Ryan
> 


From rlr215 at nyu.edu  Mon Mar  6 15:14:06 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 10:14:06 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>


Hello again everyone!

>
> What do you think of using a standardized or (sound ugly:) formal
> format? Does your documentation include some of the
> synopsis/description/function/what it returns/arguments things? Do you
> think it is useful/feasible to put them in that format?

I think a reasonable standardization is a good thing, especially at the 
overview level of the class or module or whatever.  Here's an example 
of what I've been writing for method documentation:

(This is for subseq in Bio::Sequence::Common)

   # Returns a new sequence containing the subsequence identified by the
   # start and end numbers given as parameters.  *Important:* Biological
   # sequence numbering conventions (one-based) rather than ruby's
   # (zero-based) numbering conventions are used.
   #
   #   s = Bio::Sequence::Generic.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string,
   # so subseq called without any parameters simply returns a new sequence
   # identical to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   #

So, I haven't been writing enormously formal specs - which seem like a 
bit of overkill for most of the methods, and rdoc takes care of the 
basics of argument lists.  Otherwise I note what to expect in return, 
or if the method does or does not modify the current object.  Also if 
there are any things that are dangerous or tricky...  I also give an 
example for all methods.
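
To make the one-based convention concrete, here is a small sketch of
the behaviour described in that comment. DemoSeq is a hypothetical
String subclass written for this illustration; it is not the actual
Bio::Sequence::Common code, and the argument checking is my own choice.

```ruby
# DemoSeq is a hypothetical stand-in, not Bio::Sequence::Common itself.
class DemoSeq < String
  # One-based, inclusive subsequence; defaults cover the whole sequence.
  def subseq(first = 1, last = length)
    unless first.between?(1, last) && last <= length
      raise ArgumentError,
            "positions must satisfy 1 <= first <= last <= #{length}"
    end
    # Shift to Ruby's zero-based indexing and return a new object.
    self.class.new(self[(first - 1)..(last - 1)])
  end
end

s = DemoSeq.new('atggaatga')
codon = s.subseq(1, 3)   # first three bases
whole = s.subseq         # a copy of the full sequence
```

The only real work is the shift by one on both ends; everything else
is ordinary String slicing.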

It seems to me, and this is surely open to discussion, that formalizing 
the individual method descriptions too much makes them enormously 
tedious to write - so much so that very few will ever get written.  
BUT, on the class or module level, I think a certain amount of 
formalization is good, so that the overviews are reasonably consistent.

Best,

-Ryan

>
> Thanks,
> jan.
>
>> -----Original Message-----
>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>> Sent: 06 March 2006 14:41
>> To: jan aerts (RI)
>> Subject: Re: [BioRuby] bioruby documentation
>>
>> Good Morning All,
>>
>> I've had similar thoughts to Jan, and am a couple methods away
>> from completely documenting Bio::Sequence::* .  I was hoping
>> to send that in to Toshiaki later today.  I haven't yet
>> written a synopsis or description for them, mainly because I
>> was using the process of documenting all the methods as a way
>> of thoroughly understanding the use and structure of the
>> classes.  If the documentation I've currently written is seen
>> as reasonable and accepted, I would then add the overview
>> documentation for those classes and files.
>>
>> Is there somewhere we can note which parts different people
>> are working on documenting, so as to avoid any duplication of effort?
>>
>> Best!
>>
>> -Ryan
>>


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 18:21:53 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:21:53 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>

Ryan,

Nice piece of doc. I completely agree that the level of formalization is entirely open to discussion, and I completely understand your concerns. On the other hand, a formalized list of things to describe can, in my opinion, _help_ developers document their code rather than keep them from doing so. You can see it as a checklist of things to document. In your piece of code you describe several aspects of the subseq method, but for every new method you describe, you need to keep this list in the back of your head ("did I mention that it returns itself?", "did I mention what the defaults for the arguments are?", ...). If this list were accessible on the wiki, any developer could copy it into their code and fill it in like a checklist. I suspect that would make things much easier for the developer (but that's my own view, of course).

You're right that rdoc already takes care of argument lists, but it only lists the arguments instead of describing them. In many instances a bioruby user needs to know what the arguments actually are (including their defaults) without going into the code. Ergo: arguments should be documented.

What do you think?
jan


-----Original Message-----
From: Ryan Raaum [mailto:rlr215 at nyu.edu]
Sent: Mon 3/6/2006 3:14 PM
To: jan aerts (RI)
Cc: bioruby at open-bio.org
Subject: Re: [BioRuby] bioruby documentation
 

Hello again everyone!

>
> What do you think of using a standardized or (sounds ugly :) formal
> format? Does your documentation include some of the
> synopsis/description/function/what it returns/arguments things? Do you
> think it is useful/feasible to put them in that format?

I think a reasonable standardization is a good thing, especially at the 
overview level of the class or module or whatever.  Here's an example 
of what I've been writing for method documentation:

(This is for subseq in Bio::Sequence::Common)

   # Returns a new sequence containing the subsequence identified by the
   # start and end numbers given as parameters.  *Important:* Biological
   # sequence numbering conventions (one-based) rather than ruby's
   # (zero-based) numbering conventions are used.
   #
   #   s = Bio::Sequence::Generic.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string,
   # so subseq called without any parameters simply returns a new sequence
   # identical to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   #

So, I haven't been writing enormously formal specs - which seem like a 
bit of overkill for most of the methods, and rdoc takes care of the 
basics of argument lists.  Otherwise I note what to expect in return, 
or if the method does or does not modify the current object.  Also if 
there are any things that are dangerous or tricky...  I also give an 
example for all methods.
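
As a rough illustration of the one-based convention described in the
comment above, here is a minimal, self-contained sketch.  DemoSequence and
its bounds handling are assumptions made for this example; it is not the
BioRuby implementation of subseq.

```ruby
# Hypothetical stand-in for a sequence class; not BioRuby's own code.
class DemoSequence < String
  # Returns a new DemoSequence holding residues first..last, counted
  # from 1.  first defaults to 1 and last to the sequence length, so
  # calling subseq without arguments returns a copy of the whole sequence.
  # Returns nil when the coordinates do not describe a valid range.
  def subseq(first = 1, last = length)
    return nil if first < 1 || last < first || first > length
    # Shift both coordinates down by one to get Ruby's zero-based range.
    self.class.new(self[(first - 1)..(last - 1)])
  end
end

s = DemoSequence.new('atggaatga')
puts s.subseq(1, 3)   # prints "atg"
puts s.subseq         # prints "atggaatga"
```

The only real work is the shift by one in both coordinates; everything
else is argument checking.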

It seems to me, and this is surely open to discussion, that formalizing 
the individual method descriptions too much makes them enormously 
tedious to write - so much so that very few will ever get written.  
BUT, on the class or module level, I think a certain amount of 
formalization is good, so that the overviews are reasonably consistent.

Best,

-Ryan

>
> Thanks,
> jan.
>
>> -----Original Message-----
>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>> Sent: 06 March 2006 14:41
>> To: jan aerts (RI)
>> Subject: Re: [BioRuby] bioruby documentation
>>
>> Good Morning All,
>>
>> I've had similar thoughts to Jan, and am a couple methods away
>> from completely documenting Bio::Sequence::* .  I was hoping
>> to send that in to Toshiaki later today.  I haven't yet
>> written a synopsis or description for them, mainly because I
>> was using the process of documenting all the methods as a way
>> of thoroughly understanding the use and structure of the
>> classes.  If the documentation I've currently written is seen
>> as reasonable and accepted, I would then add the overview
>> documentation for those classes and files.
>>
>> Is there somewhere we can note which parts different people
>> are working on documenting, so as to avoid any duplication of effort?
>>
>> Best!
>>
>> -Ryan
>>


From jan.aerts at bbsrc.ac.uk  Mon Mar  6 18:41:52 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:41:52 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
	<440C7A13.5070307@corevx.com>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6615@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Trevor,

I agree that, if at all, only public methods should be documented. And, as you say, one or two lines of comment become commonplace. However, we have to keep in mind that the end user is the target for the docs. Having used BioPerl a lot myself, I found the included method docs almost always sufficient for using the methods. The fact that some BioPerl developers did not supply adequate information (albeit in that formal format) probably means that they would not have provided any documentation at all if not for that standard.

If the consensus is _not_ to document the methods, I'll of course go with that. What do the heavyweights think?
jan.


-----Original Message-----
From: Trevor Wennblom [mailto:trevor at corevx.com]
Sent: Mon 3/6/2006 6:06 PM
To: jan aerts (RI); bioruby at open-bio.org
Subject: Re: [BioRuby] bioruby documentation
 
jan aerts (RI) wrote:
> (2) Given the effort developers have put into writing the classes, it
> would be nice if bioruby could reach as wide an audience as possible.
> What I believe would help tremendously, is a standardized format for
> documentation. By this I mean that the following information is given
> for each method (sort of like in bioperl documentation):
>     * synopsis
>     * description
>     * function
>     * what it returns
>     * any arguments
>   

Hi Jan,

Thanks for taking the initiative on this important subject!  Coming up
with a standard for documenting the major classes and modules would be a
great idea; I've tried my best on the components that I've written so far.

I'm going to agree with Ryan that documenting every method is likely
overkill.  One of the beauties of Ruby is that one- or two-line methods
become commonplace.  Often, to read BioPerl code (where they do generally
have every method formally documented), I strip out the comments, since
they dominate the code to such a degree as to be distracting, and the
comments are often just there to meet spec rather than to provide useful
information.  If we were to require documentation of methods, I would say
that it should only be required for public methods.


> (4) Encapsule the copyright information between '#--' and '#++', as it
> distracts the user from what he/she wants to know. (It _is_ important,
> but not for the average user...)
>   

We're switching to the Ruby license, correct?  Do we even need anything 
beyond "License:: Ruby"?

Thanks again,
Trevor


From rlr215 at nyu.edu  Mon Mar  6 18:46:12 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 13:46:12 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <4db8cd9e5949c71fdba3ec26ffa389dd@nyu.edu>

Hi all (again!),

Putting the formalization into a more concrete perspective, compare:

an example from the bioperl docs:
http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/Tools/SeqStats.html

and an example from the Ruby on Rails docs:
http://api.rubyonrails.org/classes/ActionController/Base.html

The bioperl example is very formalized, so it is true that nothing is 
left out.  However, it doesn't read very well and most of the method 
documentation ends up being highly repetitive:  (To caricature... :)

Title   : do_something
Usage   : Object.do_something
Function: does something
Returns : something
Args    : precursor to something

Whereas (in my mind) the rails documentation reads very well: simple
methods are simply documented, complex methods are documented in
detail.  If the arguments are absent or obvious, don't talk about them;
if the arguments are tricky, do talk about them. And so on.  No one
really *wants* to document, and if documenting is annoying (= overly
formalized), no one will.

I think a consistent, relatively formalized overview is good, but that 
overly formalized method and attribute documentation guidelines 
ultimately mean that little to no documentation will get done because 
it's too annoying (in most real-world open source projects).

Best,

Ryan

On Mar 6, 2006, at 1:21 PM, jan aerts (RI) wrote:

> Ryan,
>
> Nice piece of doc. I completely agree that the level of formalization 
> is entirely open to discussion. And I completely understand your 
> concerns. But on the other hand, a formalized list of things to be 
> described can, in my opinion, _help_ developers document their code, 
> rather than it would keep them from doing that. You can see it as a 
> checklist of things to document. In your piece of code, you describe 
> several aspects of the subseq method, but for every new method you'd 
> describe, you'd need to have this list of things in the back of your 
> head that you have to mention ("did I mention that it returns itself?" 
> "did I mention what the defaults for the arguments are", ...). If we 
> would have this list accessible on the wiki for any developer, he/she 
> could copy it into their code and fill it in like a checklist. I 
> suspect that would make things much easier on the developer (but 
> that's my own view, of course).
>
> You're right that rdoc already takes care of argument lists, but it 
> only lists them, instead of describing them. And in many instances, a 
> bioruby user would have to know what the arguments actually are 
> (including their defaults) without going into the code. Ergo: 
> arguments should be documented.
>
> What do you think?
> jan
>


From trevor at corevx.com  Mon Mar  6 18:06:11 2006
From: trevor at corevx.com (Trevor Wennblom)
Date: Mon, 06 Mar 2006 12:06:11 -0600
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <440C7A13.5070307@corevx.com>

jan aerts (RI) wrote:
> (2) Given the effort developers have put into writing the classes, it
> would be nice if bioruby could reach as wide an audience as possible.
> What I believe would help tremendously, is a standardized format for
> documentation. By this I mean that the following information is given
> for each method (sort of like in bioperl documentation):
>     * synopsis
>     * description
>     * function
>     * what it returns
>     * any arguments
>   

Hi Jan,

Thanks for taking the initiative on this important subject!  Coming up
with a standard for documenting the major classes and modules would be a
great idea; I've tried my best on the components that I've written so far.

I'm going to agree with Ryan that documenting every method is likely
overkill.  One of the beauties of Ruby is that one- or two-line methods
become commonplace.  Often, to read BioPerl code (where they do generally
have every method formally documented), I strip out the comments, since
they dominate the code to such a degree as to be distracting, and the
comments are often just there to meet spec rather than to provide useful
information.  If we were to require documentation of methods, I would say
that it should only be required for public methods.


> (4) Encapsule the copyright information between '#--' and '#++', as it
> distracts the user from what he/she wants to know. (It _is_ important,
> but not for the average user...)
>   

We're switching to the Ruby license, correct?  Do we even need anything 
beyond "License:: Ruby"?
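
For reference, a sketch of how the two suggestions could combine in a file
header.  rdoc stops processing a comment at '#--' and resumes at '#++', so
a longer notice can stay in the source but out of the generated pages.
The file name, module name, and notice text below are placeholders, not
taken from BioRuby.

```ruby
#
# = demo/greeter.rb - placeholder file used to illustrate a header
#
# License:: Ruby
#
#--
# Anything between the two markers is skipped by rdoc, so a longer
# copyright or licence notice can live here without cluttering the docs.
#++
#
# A trivial module so that this file is runnable.
module DemoGreeter
  # Returns a greeting for +name+.
  def self.greet(name)
    "Hello, #{name}"
  end
end

puts DemoGreeter.greet('BioRuby')   # prints "Hello, BioRuby"
```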

Thanks again,
Trevor


From ktym at hgc.jp  Tue Mar  7 05:57:44 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 14:57:44 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <440C7A13.5070307@corevx.com>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
	<440C7A13.5070307@corevx.com>
Message-ID: <EA0FD850-546B-400A-8D77-957DB9DCEED5@hgc.jp>

Hi,

Thanks for all the discussion.

* format of RDoc and level of detail - still need discussion

For readability in a terminal, please fold docs within 79 columns
(example code in the docs may break this rule).
Please use the "#"-prefixed style and don't use =begin rdoc/=end pairs,
as they make it impossible to read the code without syntax coloring.
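
To make the 79-column convention concrete, the sketch below shows a small
helper that reports "#"-prefixed comment lines wider than the limit.  The
helper name overlong_comment_lines and the sample text are made up for
this illustration; nothing like it is claimed to exist in BioRuby.

```ruby
# Returns [line_number, width] pairs for every '#'-prefixed comment line
# in +source+ that is wider than +limit+ columns.  Lines with three or
# more spaces after the '#' are treated as example code and are skipped.
def overlong_comment_lines(source, limit = 79)
  found = []
  source.each_line.with_index(1) do |line, number|
    text = line.chomp
    next unless text =~ /\A\s*#/          # only look at comment lines
    next if text =~ /\A\s*#\s{3,}\S/      # example code inside a comment
    found << [number, text.length] if text.length > limit
  end
  found
end

# Sample input: line 2 is an overlong comment, line 3 is overlong
# example code and is therefore ignored.
sample = <<~RUBY
  # Short description that fits comfortably on one line.
  # #{'x' * 90}
  #   puts "#{'y' * 90}"
  def demo; end
RUBY

p overlong_comment_lines(sample)   # prints [[2, 92]]
```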

I basically agree with having a standardized format, as Jan suggested.
It will make clear what should be documented at least,
especially for the non-native developers.

I also agree with Ryan's comparison:

  - standardized format can be repetitive
  - simple methods are simply documented, complex methods
    are documented in detail

An adequate dose of documentation looks ideal, though it will also
require some writing skill.  I'd be really happy if some of you could
take the lead in filling BioRuby with a good level of documentation.

* license - please change to Ruby's

The core Japanese developers have agreed to change the license from LGPL
to Ruby's so that everyone who uses Ruby can use BioRuby (rewriting of the
headers is not yet complete in some modules, though).

We need to ask other contributors to follow this change, i.e. ask their
permission for us to change the license whenever the BioRuby staff needs to.

* where and how to include data (enzyme.yaml for REBASE)

This is under discussion with Trevor, but I think all discussions should
take place on this list so that they have an audience (to tell the truth,
reading and writing English mails takes time for me, so I want to share
them without also having to post a summary :).

Toshiaki


From jan.aerts at bbsrc.ac.uk  Tue Mar  7 11:01:26 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 7 Mar 2006 11:01:26 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD4@rie2ksrv1.ri.bbsrc.ac.uk>

Good morning again.

If I understand correctly, the general feeling is that the class-level
docs should have a standardized part in them (a description, an
example) and that method-level docs should not be too elaborate if
that's not necessary. How about (given Toshiaki's comment of "[making]
clear what should be documented at least, especially for the non-native
developers") for _non-trivial_ methods giving a description, an example,
and the type of thing it returns? (These could or could not be nicely
put under separate headers; con: can look bloated, pro: speeds up
browsing if you want to know what a method returns). I think the
Bio::Sequence::Common#window_search is a nice example: tells you what
it's meant to do, gives an example, and says what it returns.

So what do you think of the following:
* standardized parts for class-level docs: description, example, and if
necessary: relationship to other classes
* for complex methods: use Bio::Sequence::Common#window_search as an
example (with or without little title thingies)
* for simple methods: use simple methods of the Rails
ActionController::Base as an example: just a one-line description
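
To make the proposal concrete, here is a sketch of what the three levels
might look like on a small class.  ToySequence, each_window and
residue_count are invented for this illustration (each_window only loosely
imitates window_search), and the headings are one possible choice, not an
agreed standard.

```ruby
# = DESCRIPTION
# ToySequence is a toy sequence class used only to show the proposed
# documentation layout.
#
# = EXAMPLE
#   seq = ToySequence.new('atgcatgc')
#   seq.each_window(3, 3) { |w| puts w }
class ToySequence < String
  # Yields each full window of +size+ residues, moving +step+ residues
  # at a time (+step+ defaults to 1).
  #
  #   ToySequence.new('atgcatgc').each_window(3, 3) { |w| puts w }
  #
  # Returns the trailing residues, starting at the first position where
  # a full window no longer fits, as a new ToySequence.
  def each_window(size, step = 1)
    raise ArgumentError, 'size and step must be positive' unless size > 0 && step > 0
    start = 0
    while start + size <= length
      yield self.class.new(self[start, size])
      start += step
    end
    # to_s covers the case where start has moved past the end (nil slice).
    self.class.new(self[start..-1].to_s)
  end

  # Returns the number of residues.
  def residue_count
    length
  end
end

seq = ToySequence.new('atgcatgc')
rest = seq.each_window(3, 3) { |w| puts w }   # prints "atg" then "cat"
puts rest                                     # prints "gc"
```

The complex method gets a description, an example and a note on the
return value; the simple one gets a single line.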

As rdoc takes care of listing arguments to a method: is there a way to
let it show automatically if an argument is mandatory or not?
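
On the mandatory-or-optional question, one option outside rdoc: later
Ruby versions (1.9.2 and up, as far as I know, so newer than the Ruby
this thread was written against) can report it themselves through
Method#parameters.  A minimal sketch with a made-up method:

```ruby
# Made-up method: +size+ is mandatory, +step+ is optional.
def window_demo(size, step = 1)
  [size, step]
end

# Method#parameters tags each argument as :req (mandatory) or
# :opt (optional, i.e. it has a default value).
method(:window_demo).parameters.each do |kind, name|
  label = kind == :req ? 'mandatory' : 'optional'
  puts "#{name}: #{label}"
end
```

Whether rdoc itself marks arguments this way depends on the rdoc version,
so that part is best checked against the generated pages.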

jan.


> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu] 
> Sent: 06 March 2006 18:46
> To: jan aerts (RI)
> Cc: bioruby at open-bio.org
> Subject: Re: [BioRuby] bioruby documentation
> 
> Hi all (again!),
> 
> Putting the formalization into a more concrete perspective, compare:
> 
> an example from the bioperl docs:
> http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/Tools/SeqStats.html
> 
> and an example from the Ruby on Rails docs:
> http://api.rubyonrails.org/classes/ActionController/Base.html
> 
> The bioperl example is very formalized, so it is true that 
> nothing is left out.  However, it doesn't read very well and 
> most of the method documentation ends up being highly 
> repetitive:  (To caricature... :)
> 
> Title   : do_something
> Usage   : Object.do_something
> Function: does something
> Returns : something
> Args    : precursor to something
> 
> Whereas (in my mind), the rails documentation reads very 
> well, simple methods are simply documented, complex methods 
> are documented in detail.  If the arguments are absent or 
> obvious, don't talk about them; if the arguments are tricky, 
> do talk about them. And so on.  No one really *wants* to 
> document, and if documenting is annoying (= overly 
> formalized), no one will.
> 
> I think a consistent, relatively formalized overview is good, 
> but that overly formalized method and attribute documentation 
> guidelines ultimately mean that little to no documentation 
> will get done because it's too annoying (in most real-world 
> open source projects).
> 
> Best,
> 
> Ryan
> 
> On Mar 6, 2006, at 1:21 PM, jan aerts (RI) wrote:
> 
> > Ryan,
> >
> > Nice piece of doc. I completely agree that the level of formalization
> > is entirely open to discussion. And I completely understand your
> > concerns. But on the other hand, a formalized list of things to be
> > described can, in my opinion, _help_ developers document their code,
> > rather than it would keep them from doing that. You can see it as a
> > checklist of things to document. In your piece of code, you describe
> > several aspects of the subseq method, but for every new method you'd
> > describe, you'd need to have this list of things in the back of your
> > head that you have to mention ("did I mention that it returns itself?"
> > "did I mention what the defaults for the arguments are", ...). If we
> > would have this list accessible on the wiki for any developer, he/she
> > could copy it into their code and fill it in like a checklist. I
> > suspect that would make things much easier on the developer (but
> > that's my own view, of course).
> >
> > You're right that rdoc already takes care of argument lists, but it
> > only lists them, instead of describing them. And in many instances, a
> > bioruby user would have to know what the arguments actually are
> > (including their defaults) without going into the code. Ergo:
> > arguments should be documented.
> >
> > What do you think?
> > jan
> >
> >
> > -----Original Message-----
> > From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> > Sent: Mon 3/6/2006 3:14 PM
> > To: jan aerts (RI)
> > Cc: bioruby at open-bio.org
> > Subject: Re: [BioRuby] bioruby documentation
> >
> >
> > Hello again everyone!
> >
> >>
> >> What do you think of using a standardized or (sound ugly:) formal 
> >> format? Does your documentation include some of the 
> >> synopsis/description/function/what it returns/arguments things? Do 
> >> you think it is useful/feasible to put them in that format?
> >
> > I think a reasonable standardization is a good thing, especially at 
> > the overview level of the class or module or whatever.  Here's an 
> > example of what I've been writing for method documentation:
> >
> > (This is for subseq in Bio::Sequence::Common)
> >
> >    # Returns a new sequence containing the subsequence identified by
> >    # the start and end numbers given as parameters.  *Important:*
> >    # Biological sequence numbering conventions (one-based) rather than
> >    # ruby's (zero-based) numbering conventions are used.
> >    #
> >    #   s = Bio::Sequence::Generic.new('atggaatga')
> >    #   puts s.subseq(1,3)                      #=> "atg"
> >    #
> >    # Start defaults to 1 and end defaults to the entire existing
> >    # string, so subseq called without any parameters simply returns a
> >    # new sequence identical to the existing sequence.
> >    #
> >    #   puts s.subseq                           #=> "atggaatga"
> >    #
> >
> > So, I haven't been writing enormously formal specs - which seem like a
> > bit of overkill for most of the methods, and rdoc takes care of the
> > basics of argument lists.  Otherwise I note what to expect in return,
> > or if the method does or does not modify the current object.  Also if
> > there are any things that are dangerous or tricky...  I also give an
> > example for all methods.
> >
> > It seems to me, and this is surely open to discussion, that
> > formalizing the individual method descriptions too much makes them
> > enormously tedious to write - so much so that very few will ever get
> > written.
> > BUT, on the class or module level, I think a certain amount of
> > formalization is good, so that the overviews are reasonably consistent.
> >
> > Best,
> >
> > -Ryan
> >
> >>
> >> Thanks,
> >> jan.
> >>
> >>> -----Original Message-----
> >>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> >>> Sent: 06 March 2006 14:41
> >>> To: jan aerts (RI)
> >>> Subject: Re: [BioRuby] bioruby documentation
> >>>
> >>> Good Morning All,
> >>>
> >>> I've had similar thoughts to Jan, and am a couple methods away from
> >>> completely documenting Bio::Sequence::* .  I was hoping to send that
> >>> in to Toshiaki later today.  I haven't yet written a synopsis or
> >>> description for them, mainly because I was using the process of
> >>> documenting all the methods as a way of thoroughly understanding the
> >>> use and structure of the classes.  If the documentation I've
> >>> currently written is seen as reasonable and accepted, I would then
> >>> add the overview documentation for those classes and files.
> >>>
> >>> Is there somewhere we can note which parts different people are 
> >>> working on documenting, so as to avoid any duplication of effort?
> >>>
> >>> Best!
> >>>
> >>> -Ryan
> >>>
> >>> On Mar 6, 2006, at 9:21 AM, jan aerts (RI) wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> Given the posts about bioruby documentation in the last few months, my
> >>>> own experiences with bioruby and a bit of encouragement from Toshiaki,
> >>>> I'd like to commence documenting bioruby classes (in CVS) that are not
> >>>> documented yet, and to standardize the documentation format for those
> >>>> that already have documentation.
> >>>>
> >>>> Documentation would take the form of rdoc, so that it would be
> >>>> browsable via the www.bioruby.org/rdoc website.
> >>>>
> >>>> Some guidelines that I would like to use in the documentation:
> >>>> (1) Each class should have a description and synopsis. If there is a
> >>>> unit test at the bottom, this can easily be tweaked into a synopsis.
> >>>> If such a unit test is available, 'documentating' would mean (at least
> >>>> in the first round) 'tweaking and copying the unit test in a comment
> >>>> in front of the class'. Alternatively, unit tests and documentation
> >>>> could be combined into one (as Ara and Pjotr discussed), but I'm not
> >>>> experienced enough in ruby yet to do this in a simple, transparent way.
> >>>> (2) Given the effort developers have put into writing the classes, it
> >>>> would be nice if bioruby could reach as wide an audience as possible.
> >>>> What I believe would help tremendously, is a standardized format for
> >>>> documentation. By this I mean that the following information is given
> >>>> for each method (sort of like in bioperl documentation):
> >>>>     * synopsis
> >>>>     * description
> >>>>     * function
> >>>>     * what it returns
> >>>>     * any arguments
> >>>> (3) It should be made clear to the user if a class should be used
> >>>> directly, or if it just supports other classes (e.g.
> >>>> Bio::Sequence::Format). Additional important info would be interaction
> >>>> with other classes (e.g. "how does the sequence class interact with
> >>>> the embl class?"). Original module writers have an important role in
> >>>> describing this context.
> >>>> (4) Encapsulate the copyright information between '#--' and '#++', as it
> >>>> distracts the user from what he/she wants to know. (It _is_ important,
> >>>> but not for the average user...)
> >>>>
> >>>>
> >>>> Example of class documentation (from sequence.rb):
> >>>> # = DESCRIPTION
> >>>> # The Bio::Sequence class generically describes a nucleic or amino
> >>>> # acid sequence and is a superclass of Bio::Sequence::NA and
> >>>> # Bio::Sequence::AA. Most methods that can be used on Bio::Sequence
> >>>> # objects are described in Bio::Sequence::Common, Bio::Sequence::NA
> >>>> # and Bio::Sequence::AA.
> >>>> #
> >>>> # If possible, create sequence objects using the Bio::Sequence::NA or
> >>>> # Bio::Sequence::AA classes instead, as the Bio::Sequence class will
> >>>> # have to guess the type of sequence you're talking about.
> >>>> #
> >>>> # = SYNOPSIS
> >>>> #   # Create a nucleic or amino acid sequence
> >>>> #   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
> >>>> #   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
> >>>> #   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
> >>>> #
> >>>> #   # Print it out
> >>>> #   puts dna.to_s
> >>>> #   puts aa.to_s
> >>>> #
> >>>> #   # Get a subsequence, bioinformatics style (first nucleotide is '1')
> >>>> #   puts dna.subseq(2,6)
> >>>> #
> >>>> #   #...more examples from the unit test
> >>>>
> >>>> Example of method documentation (from sequence.rb):
> >>>>   # Usage:
> >>>>   #    my_seq = Bio::Sequence.new('AGGCACGAT')
> >>>>   #    my_na = my_seq.na
> >>>>   # Function::   Converts the Bio::Sequence object into a
> >>>>   #              Bio::Sequence::NA object
> >>>>   # Returns::    a Bio::Sequence::NA object
> >>>>   # Arguments::  none
> >>>>   def na
> >>>>     @seq = NA.new(@seq)
> >>>>     @moltype = NA
> >>>>   end
> >>>>
> >>>> As the time I can work on this is only limited, expect to see gradual
> >>>> additions to the cvs repository. Any other people wishing to help out
> >>>> are greatly welcome!!
> >>>>
> >>>> Of course, I promise not to touch other people's code, unless they
> >>>> explicitly tell me to.
> >>>>
> >>>> Any thoughts/suggestions on this?
> >>>>
> >>>> Kind regards,
> >>>>
> >>>> Jan Aerts, PhD
> >>>> Bioinformatics Group
> >>>> Roslin Institute
> >>>> Roslin, Scotland, UK
> >>>> +44 131 527 4200
> >>>>
> >>>> ---------The obligatory disclaimer-------- The information
> >>> contained
> >>>> in this e-mail (including any attachments) is 
> confidential and is 
> >>>> intended for the use of the addressee
> >>> only.   The
> >>>> opinions expressed within this e-mail (including any
> >>> attachments) are
> >>>> the opinions of the sender and do not necessarily
> >>> constitute those of
> >>>> Roslin Institute (Edinburgh) ("the Institute") unless 
> specifically 
> >>>> stated by a sender who is duly authorised to do so on 
> behalf of the 
> >>>> Institute.
> >>>>
> >>>> _______________________________________________
> >>>> BioRuby mailing list
> >>>> BioRuby at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/bioruby
> >>>
> >>>
> >
> >
> 
> 


From ktym at hgc.jp  Tue Mar  7 13:38:42 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 22:38:42 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>

Resending, as this message does not seem to have been delivered according to
http://open-bio.org/pipermail/bioruby/2006-March/date.html

-k


Ryan,

Thank you for your very nice doc.

On 2006/03/07, at 0:14, Ryan Raaum wrote:
>    #   s = Bio::Sequence::Generic.new('atggaatga')

If you will utilize this documentation, please change
this example to use Bio::Sequence::NA.

Bio::Sequence::Generic is just for developers - to hold
gaps, spaces etc. intact - mainly for multiple alignment.


In that sense, the following mail you sent me personally
should be fixed as follows:


Begin forwarded message:
> From: Ryan Raaum <rlr215 at nyu.edu>
> Date: 2006-03-07 0:47:32 JST
> To: Toshiaki Katayama <ktym at hgc.jp>
> Cc: jan aerts (RI) <jan.aerts at bbsrc.ac.uk>
> Subject: BioRuby Sequence Documentation Patch
>
> Hello,
>
> I have begun work on some documentation (as you may have seen from the messages on the mailing list just this morning).  Here is what I've done.  All methods and attributes in the Bio::Sequence hierarchy should be documented.  A summary for each file is yet to be written, but if this documentation is acceptable, I will write those after these are applied.  I made two small code changes as well:
>
> 1. Added a whitespace stripping initialize method to Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and Bio::Sequence::NA in that respect.


Bio::Sequence::Generic is not intended to strip whitespace.

In addition to this, why did you mark the Bio::Sequence#guess method as :nodoc: ?
Sequence type guessing is not perfect, so there needs to be an interface
to change the threshold etc.

Most of the other parts seem to be acceptable - you really understood
the ideas behind the code!

On March 1, Jan also sent me a documented version of sequence.rb.
I'm sorry; I should have posted it to the list as soon as possible.

Anyway, could you contact him to merge your documentation?
If both of you agree, I'll commit your patch.


> 2. Modified the instance randomize method to start at length 0 IF a composition hash is given.  Otherwise, if there was an actual sequence AND a hash was given, odd things would happen.  (Of course, it was never meant to be called that way, but... as it CAN be called that way, I thought the behavior should be consistent.)


Thanks!


> I am happy to make modifications to this documentation,
>
> Best wishes,
>
> Ryan Raaum
>
-------------- next part --------------


From ktym at hgc.jp  Tue Mar  7 17:34:41 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Wed, 8 Mar 2006 02:34:41 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>
	<5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
Message-ID: <AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>

Ryan,

Thank you for your quick fix.
I quote your mail as it was not posted to the list.

> Also, in the last round of editing, I made a small change to the Bio::Sequence#guess function.

Thanks. This bug was introduced when I added length and index arguments...

Toshiaki


On 2006/03/08, at 0:56, Ryan Raaum wrote:

> Hi All,
>
>
>>
>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>    #   s = Bio::Sequence::Generic.new('atggaatga')
>>
>> If you will utilize this documentation, please change
>> this example to use Bio::Sequence::NA.
>>
>
> Done. (For this and all other examples using Bio::Sequence::Generic)
>
>> Bio::Sequence::Generic is just for developers - to hold
>> gaps, spaces etc. intact - mainly for multiple alignment.
>>
>
> Made Bio::Sequence::Generic :nodoc:
>
>>>
>>> 1. Added a whitespace stripping initialize method to Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and Bio::Sequence::NA in that respect.
>>
>>
>> The Bio::Sequence::Generic is not intended to strip.
>
> Removed the added method.
>
>>
>> In addition to this, why you made Bio::Sequence#guess method as :nodoc: ?
>> Sequence type guessing is not perfect so there need to be an interface
>> to change threshold etc.
>
> Documented the guess methods.
>
>> On March 1, Jan also sent me a documented version of sequence.rb.
>> I'm sorry that I should post it on the list as soon as possible.
>>
>> Anyway, could you contact him to merge your documentations?
>> If both of you are agreed, I'll commit your patch.
>
> If Jan will send me his sequence.rb, I can merge it with mine and send the merged file back to Jan.  After he's edited the merge to his liking, we can put it all together and send it in as a unified patch.
>
>
> Also, in the last round of editing, I made a small change to the Bio::Sequence#guess function.  In the line where the "total" is calculated, the original version used the length of the @seq as the starting length, but for the length and index parameters to work properly with the threshold value, the length of the guess string (`str` is the local method variable) is what should be the base length.
>
> Best,
>
> -Ryan
>


From rlr215 at nyu.edu  Wed Mar  8 15:04:18 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Wed, 8 Mar 2006 10:04:18 -0500
Subject: [BioRuby] Sequence Documentation Patch
In-Reply-To: <AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
	<3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
	<895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>
	<5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
	<AA281DAF-F110-4506-8739-E3AFDC8B7958@hgc.jp>
Message-ID: <d26de0101ebe43ea27a1177b3d18b49a@nyu.edu>

Good Morning,

Jan and I were able to reconcile our respective documentation attempts 
into a single documentation patch.  Here's an example of the final 
format:

(documentation for the Bio::Sequence::Common#subseq method)

   # Returns a new sequence containing the subsequence identified by the
   # start and end numbers given as parameters.  *Important:* Biological
   # sequence numbering conventions (one-based) rather than ruby's
   # (zero-based) numbering conventions are used.
   #
   #   s = Bio::Sequence::NA.new('atggaatga')
   #   puts s.subseq(1,3)                      #=> "atg"
   #
   # Start defaults to 1 and end defaults to the entire existing string,
   # so subseq called without any parameters simply returns a new sequence
   # identical to the existing sequence.
   #
   #   puts s.subseq                           #=> "atggaatga"
   # ---
   # *Arguments*:
   # * (optional) _s_(start): Integer (default 1)
   # * (optional) _e_(end): Integer (default current sequence length)
   # *Returns*:: new Bio::Sequence::NA/AA object

Hopefully this will be useful for new users.
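
For readers without BioRuby at hand, the documented behaviour can be tried
with a small stand-in. This is only a sketch under stated assumptions: ToySeq
is a hypothetical String subclass written for this example, not the actual
Bio::Sequence::Common#subseq code, and the ArgumentError check is my own
addition.

```ruby
# ToySeq is a hypothetical String subclass, not a BioRuby class.
class ToySeq < String
  # Returns a new ToySeq holding positions s..e, counted from 1.
  def subseq(s = 1, e = length)
    raise ArgumentError, 'start and end must be positive' unless s > 0 && e > 0
    # Shift the one-based, inclusive positions to Ruby's zero-based range.
    self.class.new(self[(s - 1)..(e - 1)])
  end
end

s = ToySeq.new('atggaatga')
puts s.subseq(1, 3)   # prints atg
puts s.subseq         # prints atggaatga
```

Returning self.class.new(...) rather than a bare String matches the
"*Returns*: new ... object" line in the documentation format above.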

Those changes from the first version of this patch that Toshiaki noted 
as being wrong or against the API were removed.

I also made two more bug fixes (in addition to those already described).

1. Added 'U' and 'u' to the bases counted towards the nucleic acid 
total in Bio::Sequence#guess.  (Without this, RNA sequences were 
"guessed" to be Amino Acid sequences).

2. Changed the arguments for method_missing in Bio::Sequence from 
(*arg) to (sym, *args, &block).  With this argument set, blocks will be 
properly passed through to the encapsulated object.
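
A rough sketch of what these fixes amount to, for anyone who wants to see the
idea without reading the patch. Everything here is an assumption made for
illustration: WrappedSeq is a hypothetical class, the default threshold of 0.9
and length of 10000 are placeholder values, and the code is not copied from
sequence.rb.

```ruby
# WrappedSeq is hypothetical; it is not the Bio::Sequence implementation.
class WrappedSeq
  def initialize(str)
    @seq = str.to_s
  end

  # Guess :na or :aa from the slice of the sequence that starts at +index+
  # and runs for at most +length+ characters.
  def guess(threshold = 0.9, length = 10_000, index = 0)
    str = @seq[index, length].to_s
    # Count U/u as well, so that RNA is not mistaken for protein.
    bases = str.count('ATGCUatgcu')
    # Base the total on the inspected slice, not on the whole sequence.
    total = str.length - str.count('Nn')
    return :aa if total <= 0
    bases.to_f / total > threshold ? :na : :aa
  end

  # Forward unknown messages, including any block, to the wrapped string.
  def method_missing(sym, *args, &block)
    if @seq.respond_to?(sym)
      @seq.__send__(sym, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(sym, include_private = false)
    @seq.respond_to?(sym, include_private) || super
  end
end

p WrappedSeq.new('augcaugcaugc').guess
p WrappedSeq.new('atg').gsub(/a/) { |m| m.upcase }
```

With a (*arg)-only signature the block given to gsub would be dropped on the
way to the wrapped string; capturing &block and passing it on is what lets
the second call work.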

Cheers!

-Ryan

-------------- next part --------------


On Mar 7, 2006, at 12:34 PM, Toshiaki Katayama wrote:

> Ryan,
>
> Thank you for your quick fix.
> I quote your mail as it was not posted to the list.
>
>> Also, in the last round of editing, I made a small change to the 
>> Bio::Sequence#guess function.
>
> Thanks. This bug was introduced when I added length and index 
> arguments...
>
> Toshiaki
>
>
> On 2006/03/08, at 0:56, Ryan Raaum wrote:
>
>> Hi All,
>>
>>
>>>
>>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>>    #   s = Bio::Sequence::Generic.new('atggaatga')
>>>
>>> If you will utilize this documentation, please change
>>> this example to use Bio::Sequence::NA.
>>>
>>
>> Done. (For this and all other examples using Bio::Sequence::Generic)
>>
>>> Bio::Sequence::Generic is just for developers - to hold
>>> gaps, spaces etc. intact - mainly for multiple alignment.
>>>
>>
>> Made Bio::Sequence::Generic :nodoc:
>>
>>>>
>>>> 1. Added a whitespace stripping initialize method to 
>>>> Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA 
>>>> and Bio::Sequence::NA in that respect.
>>>
>>>
>>> The Bio::Sequence::Generic is not intended to strip.
>>
>> Removed the added method.
>>
>>>
>>> In addition to this, why you made Bio::Sequence#guess method as 
>>> :nodoc: ?
>>> Sequence type guessing is not perfect so there need to be an interface
>>> to change threshold etc.
>>
>> Documented the guess methods.
>>
>>> On March 1, Jan also sent me a documented version of sequence.rb.
>>> I'm sorry that I should post it on the list as soon as possible.
>>>
>>> Anyway, could you contact him to merge your documentations?
>>> If both of you are agreed, I'll commit your patch.
>>
>> If Jan will send me his sequence.rb, I can merge it with mine and 
>> send the merged file back to Jan.  After he's edited the merge to his 
>> liking, we can put it all together and send it in as a unified patch.
>>
>>
>> Also, in the last round of editing, I made a small change to the 
>> Bio::Sequence#guess function.  In the line where the "total" is 
>> calculated, the original version used the length of the @seq as the 
>> starting length, but for the length and index parameters to work 
>> properly with the threshold value, the length of the guess string 
>> (`str` is the local method variable) is what should be the base 
>> length.
>>
>> Best,
>>
>> -Ryan
>>

From jan.aerts at bbsrc.ac.uk  Tue Mar 21 12:38:30 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 21 Mar 2006 12:38:30 -0000
Subject: [BioRuby] fastacmd.rb: iteration
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>

Hi,

Could someone please have a look at the each_entry method of
io/fastacmd.rb (in cvs)? The code below gives the sequences of
'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
output. Any ideas?

  require 'bio'

  fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
  seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
  seqs.each do |seq|
    puts seq                        # => works fine
  end

  fastacmd.each_entry do |fasta|
    puts 'hi'                       # => it never seems to get here...
  end

Thanks,
Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+44 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only.   The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute. 


From ngoto at gen-info.osaka-u.ac.jp  Wed Mar 22 10:29:27 2006
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Wed, 22 Mar 2006 19:29:27 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>

Hi jan,

I found a bug in Bio::FlatFile. Because io/fastacmd.rb
internally uses FlatFile, the bug may be related to the problem.

The bug is that IO#pos raises an error when the IO object isn't
a regular file (e.g. a pipe), but FlatFile always tried to get the
position. It is fixed in CVS now.
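
To illustrate the kind of failure described above (this is a sketch of the
general guard, not the change that went into CVS): asking a pipe for its
position raises a system call error on Unix-like systems, so code that wants
an offset from an arbitrary stream has to be prepared for there being none.
The helper name safe_pos is hypothetical.

```ruby
require 'stringio'

# Return the current offset of +io+, or nil when the stream cannot report one.
def safe_pos(io)
  io.pos
rescue SystemCallError, ArgumentError
  nil
end

reader, writer = IO.pipe
writer.puts '>b0002'
writer.close

p safe_pos(StringIO.new('>b0002'))   # seekable stream: an offset is available
p safe_pos(reader)                   # pipe: no offset, so nil
reader.close
```

ArgumentError is rescued as well because some stream-like objects signal the
same condition that way instead of with a system call error.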

On Tue, 21 Mar 2006 12:38:30 -0000
"jan aerts \(RI\)" <jan.aerts at bbsrc.ac.uk> wrote:

> Hi,
> 
> Could someone please have a look at the each_entry method of
> io/fastacmd.rb (in cvs)? The code below gives the sequences of
> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
> output. Any ideas?
> 
>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>   seqs.each do |seq|
>     puts seq                        => works fine
>   end
> 
>   fastacmd.each_entry do |fasta|
>     puts 'hi'                       => it never seems to get here...
>   end
> 
> Thanks,
> Jan Aerts, PhD
> Bioinformatics Group
> Roslin Institute
> Roslin, Scotland, UK
> +44 131 527 4200

-- 
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan


From k at bioruby.org  Sat Mar 25 08:01:05 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sat, 25 Mar 2006 17:01:05 +0900
Subject: [BioRuby] Important news for bioruby developers
In-Reply-To: <FB2D7CB3-51AB-4AB7-B2F9-B807AF637674@sonsorol.org>
References: <FB2D7CB3-51AB-4AB7-B2F9-B807AF637674@sonsorol.org>
Message-ID: <0EA907FB-27BB-4A78-B173-D7F4F0AB4A85@bioruby.org>

Hi Chris,

Thank you for taking care of the server migration.

On 2006/03/22, at 1:32, Chris Dagdigian wrote:

> Hello,
>
> Sorry for the interruption but I've got some important site and server news. People will also see multiple copies of this note as I slowly transition sites over.
>
> We are in the midst of moving all of our websites, mailing lists, developers and sourcecode repositories onto more modern hardware located in a 2nd Boston area datacenter facility.
>
> This may not be a big deal for bioruby since your website, wiki and news site are not hosted by Open Bio.  Keep reading though as there are some questions/favors  I need to ask of the Ruby developers down below ...
>
> The transition is important for a couple of reasons - the most urgent being that we are going to lose internet connectivity in our current hosting facility on March 27th 2006.  That datacenter belongs to Wyeth Research in Cambridge, Massachusetts.  Wyeth Research & Genetics Institute have been long time significant supporters & hosting providers for OBF servers and projects -- we owe them a great deal of gratitude and public acknowledgment for hosting our servers over many years. Speaking as a hardware geek I can tell you that the many years of high-bandwidth, trouble free hosting have been invaluable for our efforts and projects.   Sadly, it is no longer possible for them to host our servers as they need to begin making some network and WAN circuit changes that will no longer support direct internet facing servers (such as ours) in Cambridge.
>
> The other major reason for the transition is our need to relocate onto hardware that can better be remotely managed (as our volunteer administrators are scattered all over the globe).
>
> My employer, BioTeam Inc. has donated new server hardware and is also providing the hosting facilities in a Tier 1 Boston area colocation facility. Infrastructure geeks can see pictures of the colocation  cage and the new OBF servers online at this URL:
> http://bioteam.net/gallery/bioteamBDC  -- those servers also host EMBOSS FTP/CVS and mailing lists.
>
> Current status of the migration:
>
>  - All 57 mailing lists have been moved over to the new hardware (you may have noticed "lists.open-bio.org" showing up in your list messages)
>
>  - The new anonymous sourcecode server is running at http://code.open-bio.org. "cvs.biodas.org" is already pointing at it.
>
>  - Developers with CVS accounts have *NOT* been migrated yet
>
> Basically we are trying to relocate everything but the developers over the next few days so we can spend the weekend on the developer and CVS transition.
>
>
> Attention BioRuby Developers
> -----------------------------------------
>
> I need assistance with the following:
>
> (1) Please confirm to me or support at open-bio.org that you have NO websites running on Open Bio servers. It appears you host your own wiki/news/web sites


Yes, we have NO websites on Open Bio servers for now.


> (2) Please change your website front page to reflect the new URLs for your mailing lists:
>
>    http://lists.open-bio.org/mailman/listinfo/bioruby
>    http://lists.open-bio.org/mailman/listinfo/bioruby-cvs
>    http://lists.open-bio.org/mailman/listinfo/bioruby-ja


Done.


> (3) Please CNAME alias or web forward "cvs.bioruby.org"  to code.open-bio.org to use t


Does this mean "cvs.open-bio.org" is no longer available or not synced with "code.open-bio.org"?
Currently, we forward "http://cvs.bioruby.org" to "http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/?cvsroot=bioruby"


> (4) However you do your mail forwarding, please make sure that mail for bioruby.org mailing lists gets redirected to "lists.open-bio.org"


Done.


> For people with CVS commit/write access
> ---------------------------------------------------------
> Also note that when we finally do transition over to the new developer machine (where the real sourcecode lives), ALL developers will need to email support at open-bio.org to request a password reset. Although we can transition usernames, settings and home directories over from the old to the new machine we can not transition over existing passwords as they are stored in incompatible hashed formats. All developers are going to need new passwords for the new developer machine.  We will likely make the developer machine swap this weekend.
>
>
> Reporting Problems / Help & Assistance
> ------------------------------------------------------
> The transition will be complicated, we need your help to spot problems and glitches! The OBF has a new helpdesk ticketing system set up at "support at open-bio.org" so that all OBF admins can read and respond to issues and problems. Most troubles should be reported to that address. For urgent problems, especially during this transition period,  feel free to contact me directly (dag at sonsorol.org) (ichat/aol/aim screen name:  bioteamdag).
>
>
> Regards,
> Chris Dagdigian
> open-bio.org
>


From k at bioruby.org  Sun Mar 26 00:46:18 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 09:46:18 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
	<200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
Message-ID: <D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>

Goto-san,

# post testing for new open-bio.org server :)

I'd like to suggest releasing 1.0.1 (or creating a stable branch?).
What do you think?

This bug causes Bio::FlatFile with ARGF to fail at the last iteration,
and it may be a fairly serious problem for many users.

By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?
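
For reference, the block-accepting idiom in question usually looks something
like the sketch below. ToyFlatFile is a hypothetical class made up for this
example; it says nothing about how Bio::FlatFile is actually written or why
its new does not take a block.

```ruby
require 'stringio'

# ToyFlatFile is hypothetical; it only mimics the open-with-block idiom.
class ToyFlatFile
  def initialize(io)
    @io = io
  end

  # With a block: yield the reader and close it afterwards.
  # Without a block: behave like new and return the reader.
  def self.open(io)
    reader = new(io)
    return reader unless block_given?
    begin
      yield reader
    ensure
      reader.close
    end
  end

  def each_line(&block)
    @io.each_line(&block)
  end

  def close
    @io.close unless @io.closed?
  end
end

io = StringIO.new(">b0002\natgcgagtgtt\n")
ToyFlatFile.open(io) { |ff| ff.each_line { |line| puts line } }
p io.closed?
```

One common reason to keep the block form on open rather than new is that the
block form implies cleanup (closing the stream), which callers of new may not
expect.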

Toshiaki

--------------------------------------------------
% cat test_ff.rb
require 'bio'

ff = Bio::FlatFile.new(Bio::FastaFormat, ARGF)
ff.each do |e|
  p e.definition
end
% cat test.fa
>b0002
atgcgagtgtt
>b0003
atggttaaagt
>b0004
atgaaactcta
% ruby test_ff.rb test.fa
"b0002"
"b0003"
"b0004"
/usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos': no stream to tell (ArgumentError)
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:342:in `get_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:573:in `next_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:609:in `each'
        from test_ff.rb:4
--------------------------------------------------
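
The traceback above ends in a call to `pos` on a stream that cannot
report a position. A minimal sketch of one way to guard such a call is
shown below. The helper name safe_pos is hypothetical and this is not
the change that was committed to CVS; it only assumes that IO#pos
raises an exception (ArgumentError for an exhausted ARGF, Errno::ESPIPE
for a pipe) when no position is available.

```ruby
require 'stringio'

# Hypothetical helper (not BioRuby's actual fix): return the current
# offset of +io+, or nil when the stream cannot report one.
def safe_pos(io)
  io.pos
rescue ArgumentError, Errno::ESPIPE, IOError, NotImplementedError
  nil
end

# A seekable stream reports its offset as usual.
seekable = StringIO.new(">b0002\natgcgagtgtt\n")
seekable.gets
p safe_pos(seekable)

# A pipe has no meaningful position, so the helper returns nil
# instead of letting the exception escape.
reader, writer = IO.pipe
p safe_pos(reader)
writer.close
reader.close
```

A caller that only uses the offset for bookkeeping can then treat nil
as "position unknown" and keep iterating.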

On 2006/03/22, at 19:29, GOTO Naohisa wrote:

> Hi jan,
>
> I found a bug in Bio::FlatFile. Because io/fastacmd.rb
> uses FlatFile internally, the bug may be related to the problem.
>
> The bug is that IO#pos raises an error when the IO object isn't
> a regular file (e.g. a pipe), but FlatFile always tried to get pos.
> It is now fixed in CVS.
>
> On Tue, 21 Mar 2006 12:38:30 -0000
> "jan aerts \(RI\)" <jan.aerts at bbsrc.ac.uk> wrote:
>
>> Hi,
>>
>> Could someone please have a look at the each_entry method of
>> io/fastacmd.rb (in cvs)? The code below gives the sequences of
>> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
>> output. Any ideas?
>>
>>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>>   seqs.each do |seq|
>>     puts seq                        => works fine
>>   end
>>
>>   fastacmd.each_entry do |fasta|
>>     puts 'hi'                       => it never seems to get here...
>>   end
>>
>> Thanks,
>> Jan Aerts, PhD
>> Bioinformatics Group
>> Roslin Institute
>> Roslin, Scotland, UK
>> +44 131 527 4200
>>
>> ---------The obligatory disclaimer--------
>> The information contained in this e-mail (including any attachments) is
>> confidential and is intended for the use of the addressee only.   The
>> opinions expressed within this e-mail (including any attachments) are
>> the opinions of the sender and do not necessarily constitute those of
>> Roslin Institute (Edinburgh) ("the Institute") unless specifically
>> stated by a sender who is duly authorised to do so on behalf of the
>> Institute. 
>
> -- 
> Naohisa GOTO
> ngoto at gen-info.osaka-u.ac.jp
> Department of Genome Informatics, Genome Information Research Center,
> Research Institute for Microbial Diseases, Osaka University, Japan
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From ktym at hgc.jp  Sun Mar 26 01:28:14 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 10:28:14 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio command.rb,
	1.3, 1.4)
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
Message-ID: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>

Goto-san,

> +   # Same as OpenURI.open_uri(*arg).
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.

Your code seems to contain a lot of hacks: finding open-uri.rb in Ruby's load path,
searching it for a particular method, etc.

I don't understand what the complicated part of your SandBox module actually does
(or is intended to do), but if your purpose is just to avoid redefining Kernel.open,
putting something like

> require 'open-uri'
>
> module Kernel
>   private
>   alias open_uri open
>   alias open open_uri_original_open
> end

isn't enough?


Regards,
Toshiaki Katayama


-----test_open_uri.rb
#!/usr/bin/env ruby

require 'open-uri'

module Kernel
  private
  alias open_uri open
  alias open open_uri_original_open
end

url = "http://bioruby.org"

p "########## open_uri"
open_uri(url) do |f|
  puts f.read
end

p "########## open"
open(url) do |f|
  puts f.read
end


Begin forwarded message:

> From: Naohisa Goto <ngoto at pub.open-bio.org>
> Date: 20 March 2006 19:34:59 JST
> To: bioruby-cvs at portal.open-bio.org
> Subject: [BioRuby-cvs] bioruby/lib/bio command.rb,1.3,1.4
>
> Update of /home/repository/bioruby/bioruby/lib/bio
> In directory pub.open-bio.org:/tmp/cvs-serv30042/lib/bio
>
> Modified Files:
> 	command.rb 
> Log Message:
> * New module Bio::Command::NetTools for miscellaneous network methods.
>   Currently, this module is intended to be used only inside
>   BioRuby library. Please do not use it in user's programs now.
> * New methods: Bio::Command::NetTools.open_uri(uri, *arg) and
>   Bio::Command::NetTools.read_uri(uri).
> * Changed license to Ruby's.
>
>
> Index: command.rb
> ===================================================================
> RCS file: /home/repository/bioruby/bioruby/lib/bio/command.rb,v
> retrieving revision 1.3
> retrieving revision 1.4
> diff -C2 -d -r1.3 -r1.4
> *** command.rb	4 Nov 2005 17:36:00 -0000	1.3
> --- command.rb	20 Mar 2006 10:34:57 -0000	1.4
> ***************
> *** 2,32 ****
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright::	Copyright (C) 2003-2005
>   # 		Naohisa Goto <ng at bioruby.org>,
>   #		Toshiaki Katayama <k at bioruby.org>
> ! # License::	LGPL
>   #
>   #  $Id$
>   #
> - #--
> - #
> - #  This library is free software; you can redistribute it and/or
> - #  modify it under the terms of the GNU Lesser General Public
> - #  License as published by the Free Software Foundation; either
> - #  version 2 of the License, or (at your option) any later version.
> - #
> - #  This library is distributed in the hope that it will be useful,
> - #  but WITHOUT ANY WARRANTY; without even the implied warranty of
> - #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> - #  Lesser General Public License for more details.
> - #
> - #  You should have received a copy of the GNU Lesser General Public
> - #  License along with this library; if not, write to the Free Software
> - #  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307  USA
> - #
> - #++
> - #
>
>   require 'open3'
>
>   module Bio
> --- 2,15 ----
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright::	Copyright (C) 2003-2006
>   # 		Naohisa Goto <ng at bioruby.org>,
>   #		Toshiaki Katayama <k at bioruby.org>
> ! # License::	Ruby's
>   #
>   #  $Id$
>   #
>
>   require 'open3'
> + require 'uri'
>
>   module Bio
> ***************
> *** 162,165 ****
> --- 145,291 ----
>
>   end # module Tools
> + 
> + 
> + # = Bio::Command::NetTools
> + #
> + # Bio::Command::NetTools is a collection of miscellaneous methods
> + # for data transport through network.
> + #
> + # Library internal use only. Users should not directly use it.
> + #
> + # Note that it is under construction.
> + module NetTools
> + 
> +   # Same as OpenURI.open_uri(*arg).
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.
> +   def self.open_uri(uri, *arg)
> +     if defined? ::OpenURI
> +       ::OpenURI.open_uri(uri, *arg)
> +     else
> +       SandBox.load_openuri_in_sandbox
> +       uri = uri.to_s if ::URI::Generic === uri
> +       SandBox::OpenURI.open_uri(uri, *arg)
> +     end
> +   end
> + 
> +   # Same as OpenURI.open_uri(uri).read.
> +   # If open-uri.rb is already loaded, ::OpenURI is used.
> +   # Otherwise, internal OpenURI in sandbox is used because
> +   # open-uri.rb redefines Kernel.open.
> +   def self.read_uri(uri)
> +     self.open_uri(uri).read
> +   end
> + 
> +   # Sandbox to load open-uri.rb.
> +   # Internal use only.
> +   module SandBox #:nodoc:
> + 
> +     # Dummy module definition.
> +     module Kernel #:nodoc:
> +       # dummy method
> +       def open(*arg); end #:nodoc:
> +     end #module Kernel
> +     
> +     # a method to find proxy. dummy definition
> +     module FindProxy; end #:nodoc:
> +     
> +     # dummy module definition
> +     module OpenURI #:nodoc:
> +       module OpenRead; end #:nodoc:
> +     end #module OpenURI
> +     
> +     # Dummy module definition.
> +     module URI #:nodoc:
> +       class Generic < ::URI::Generic #:nodoc:
> +         include SandBox::FindProxy
> +       end
> +       
> +       class HTTPS < ::URI::HTTPS #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       class HTTP  < ::URI::HTTP  #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       class FTP  < ::URI::FTP    #:nodoc:
> +         include SandBox::FindProxy
> +         include SandBox::OpenURI::OpenRead
> +       end
> +       
> +       # parse and new. internal use only.
> +       def self.__parse_and_new__(klass, uri) #:nodoc:
> +         scheme, userinfo, host, port,
> +         registry, path, opaque, query, fragment = ::URI.split(uri)
> +         klass.new(scheme, userinfo, host, port,
> +                   registry, path, opaque, query,
> +                   fragment)
> +       end
> +       private_class_method :__parse_and_new__
> +       
> +       # same as ::URI.parse. internal use only.
> +       def self.parse(uri) #:nodoc:
> +         r = ::URI.parse(uri)
> +         case r
> +         when ::URI::HTTPS
> +           __parse_and_new__(HTTPS, uri)
> +         when ::URI::HTTP
> +           __parse_and_new__(HTTP, uri)
> +         when ::URI::FTP
> +           __parse_and_new__(FTP, uri)
> +         else
> +           r
> +         end
> +       end
> +     end #module URI
> +     
> +     @load_openuri = nil
> +     # load open-uri.rb in SandBox module.
> +     def self.load_openuri_in_sandbox #:nodoc:
> +       return if @load_openuri
> +       fn = nil
> +       unless $:.find do |x|
> +           fn = File.join(x, 'open-uri.rb')
> +           FileTest.exist?(fn)
> +         end then
> +         warn('Warning: cannot find open-uri.rb in $LOAD_PATH')
> +       else
> +         # reading open-uri.rb
> +         str = File.read(fn)
> +         # eval open-uri.rb contents in SandBox module
> +         module_eval(str)
> +         
> +         # finds 'find_proxy' method
> +         find_proxy_lines = nil
> +         flag = nil
> +         endstr = nil
> +         str.each do |line|
> +           if flag then
> +             find_proxy_lines << line
> +             if endstr == line[0, endstr.length] and
> +                 /^\s+end(\s+.*)?$/ =~ line then
> +               break
> +             end
> +           elsif /^(\s+)def\s+find_proxy(\s+.*)?$/ =~ line then
> +             flag = true
> +             endstr = "#{$1}end"
> +             find_proxy_lines = line 
> +           end
> +         end
> +         if find_proxy_lines
> +           module_eval("module FindProxy;\n#{find_proxy_lines}\n;end\n")
> +         else
> +           warn('Warning: cannot find find_proxy method in open-uri.rb.')
> +         end
> +         @load_openuri = true
> +       end
> +     end
> +   end #module SandBox
> + end #module NetTools
> + 
>   end # module Command
>   end # module Bio
>
> _______________________________________________
> bioruby-cvs mailing list
> bioruby-cvs at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby-cvs


From ngoto at gen-info.osaka-u.ac.jp  Sun Mar 26 06:13:28 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:13:28 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
References: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
	<D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
Message-ID: <20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>

Hi,

> I suggest releasing 1.0.1 (or creating a stable branch?).
> What do you think?

I agree.

> By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
> Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?

I followed the specifications of Ruby's File, IO and Dir classes.
File.open, IO.open, and Dir.open can accept a block but
File.new, IO.new, and Dir.new don't.
Because Ruby's experts decided on these specifications,
I suppose that there may be some merit in not accepting a block,
or some problem with accepting one,
but I don't know much about them.

[ruby-list:24986] said that in Ruby 1.6.0, IO.new and Dir.new
were changed not to take a block, but I can't find the reason.
( http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/24986 )
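
The open/new convention described above can be sketched with a small
class. EntryReader is a hypothetical name used only for illustration
and is not part of BioRuby; the sketch assumes the usual Ruby idiom in
which the block form of open yields the object and closes it
afterwards, while new just returns the object.

```ruby
require 'tempfile'

# Hypothetical reader class, used only to illustrate the convention.
class EntryReader
  def initialize(io)
    @io = io
  end

  # Like File.open: with a block, yield the reader and always close it;
  # without a block, return the reader and leave closing to the caller.
  def self.open(path)
    reader = new(File.open(path))
    return reader unless block_given?
    begin
      yield reader
    ensure
      reader.close
    end
  end

  def each_line(&block)
    @io.each_line(&block)
  end

  def close
    @io.close unless @io.closed?
  end

  def closed?
    @io.closed?
  end
end

# Write a tiny two-line file to read back.
tmp = Tempfile.new('entries')
tmp.write(">b0002\natgcgagtgtt\n")
tmp.flush

leaked = nil
count = EntryReader.open(tmp.path) do |reader|
  leaked = reader
  n = 0
  reader.each_line { n += 1 }
  n
end
puts count            # number of lines read inside the block
puts leaked.closed?   # the block form has already closed the stream
tmp.close!
```

Keeping new free of block handling means the caller always knows who
is responsible for closing the stream: the block form of open, or the
caller itself.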

-- 
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp


From ngoto at gen-info.osaka-u.ac.jp  Sun Mar 26 06:56:07 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:56:07 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio
	command.rb, 1.3, 1.4)
In-Reply-To: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
	<29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
Message-ID: <20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>

> > +   # Same as OpenURI.open_uri(*arg).
> > +   # If open-uri.rb is already loaded, ::OpenURI is used.
> > +   # Otherwise, internal OpenURI in sandbox is used because
> > +   # open-uri.rb redefines Kernel.open.
> 
> Your code seems to contain a lot of hacks, finding open-uri.rb from Ruby's load path,
> searching a particular method from it etc...

Yes, it's very complicated. It would be easier to copy and paste part of
open-uri.rb, but this may cause a copyright problem.

The easiest way is to write "Please be careful: BioRuby now
requires open-uri, and the behavior of open() is changed."
in the BioRuby documentation and to use OpenURI.open_uri.

In very rare cases, require "open-uri" changes behavior.
For example,
For example,

% mkdir -p http://www.google.com
% echo "hello world" > http://www.google.com/index.html
% ls -R
.:
http:/

./http::
www.google.com/

./http:/www.google.com:
index.html
% irb
irb(main):001:0> open("http://www.google.com/index.html"){|f| f.read }
=> "hello world\n"
irb(main):002:0> require "open-uri"
=> true
irb(main):003:0> open("http://www.google.com/index.html"){|f| f.read }
=> "<html><head><meta http-equiv=\"content-type\" content=\"text/html; charset=S
hift_JIS\"><title>Google</title><style><!--\nbody,td,a,p,.h{font-family:;}\n.h{f
ont-size: 20px;}\n.q{color:#0000cc;}\n//-->\n</style>\n<script>\n<!--\nfunction 
(snip)

However, I think this is a very rare case. In addition, to open a local
file, it is recommended to use File.open, because Kernel#open
accepts shell special characters such as "|rm -rf *".
(For the same reason, it is recommended to use OpenURI.open_uri
to open a URI.)
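
A minimal sketch of why File.open is the safer choice: it treats its
argument strictly as a path, so a string beginning with "|" is looked
up as a file name and never handed to a shell. The string below is an
invented example, and the sketch assumes no file of that name exists
in the current directory.

```ruby
# An invented, untrusted "file name" of the kind Kernel#open has
# historically interpreted as a command to run.
untrusted = "|echo this would run as a command"

result =
  begin
    # File.open only ever looks for a file with this literal name.
    File.open(untrusted) { |f| f.read }
  rescue Errno::ENOENT
    :no_such_file
  end

p result
```

Under that assumption the lookup simply fails with Errno::ENOENT and
nothing is executed.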

> I don't understand what the complicated part of your SandBox module actually does
> (or is intended to do), but if your purpose is just to avoid redefining Kernel.open,
> putting something like
> 
> > require 'open-uri'
> >
> > module Kernel
> >   private
> >   alias open_uri open
> >   alias open open_uri_original_open
> > end
> 
> isn't enough?

No. The above code kills open-uri's extension to Kernel#open,
and it will fail for users who want to use the extended Kernel#open.
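
The effect can be sketched without open-uri itself. The class Opener
and its methods below are hypothetical stand-ins: a "library" extends
open while keeping the original under another name, and the alias swap
then restores the original for every caller.

```ruby
# Hypothetical stand-in for the built-in open.
class Opener
  def open(name)
    "file:#{name}"
  end
end

# A library extends open, keeping the original under another name
# (the role open_uri_original_open plays in the code quoted above).
class Opener
  alias original_open open

  def open(name)
    name.start_with?("http://") ? "uri:#{name}" : original_open(name)
  end
end

# The alias swap: keep the extended version as open_uri and put the
# original back as open.
class Opener
  alias open_uri open
  alias open original_open
end

o = Opener.new
puts o.open_uri("http://bioruby.org")  # still dispatches on the URL
puts o.open("http://bioruby.org")      # now treated as a plain file name
```

After the swap, any other code in the same process that relied on the
extended open gets the original behavior back, which is the failure
described above.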

--
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan


From k at bioruby.org  Mon Mar 27 07:17:19 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Mon, 27 Mar 2006 16:17:19 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>
References: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
	<D73D8D56-414B-4591-8E0C-4C372C933143@bioruby.org>
	<20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>
Message-ID: <538B4544-9F77-4D2F-8A80-A64EB3728439@bioruby.org>

On 2006/03/26, at 15:13, Naohisa Goto wrote:
>> I suggest releasing 1.0.1 (or creating a stable branch?).
>> What do you think?
>
> I agree.

OK, let's prepare for the next release within a week (hopefully).
We need to decide whether to create a branch or archive the current HEAD.
Are there any developers whose code in CVS HEAD is not ready for release?

>> By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
>> Bio::FlatFile.new doesn't.  Is there any reason to disallow the feature?
>
> I followed the specifications of Ruby's File, IO and Dir classes.
> File.open, IO.open, and Dir.open can accept a block but
> File.new, IO.new, and Dir.new don't.

I understand. Thank you!

Toshiaki


From ktym at hgc.jp  Mon Mar 27 07:09:11 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Mon, 27 Mar 2006 16:09:11 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio
	command.rb, 1.3, 1.4)
In-Reply-To: <20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
	<29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
	<20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>
Message-ID: <79459264-E988-45F4-8809-53B52044D81D@hgc.jp>

Hmm, so my understanding is

* open-uri.rb sucks - it can't be required without overriding Kernel#open
* but Kernel#open also sucks - use File.open instead

Thus, following the KISS principle, I want to take the easiest way :)

* require 'open-uri' (in lib/bio.rb?) and add documentation about that.
* always use OpenURI.open_uri instead of Kernel#open or Net::HTTP#get

This is mainly because setting up an HTTP proxy is easier than with 'net/http'.
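
A minimal sketch of calling OpenURI.open_uri directly, without going
through Kernel#open. To stay self-contained it does not contact any
real site: it starts a throwaway one-request HTTP server on the
loopback interface, and the port, response body and variable names
are all invented for the illustration.

```ruby
require 'open-uri'
require 'socket'

# Throwaway HTTP server on a free loopback port (illustration only).
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

responder = Thread.new do
  client = server.accept
  # Skip the request line and headers up to the blank line.
  while (line = client.gets) && line != "\r\n"; end
  body = "hello from local server\n"
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n" \
               "#{body}"
  client.close
end

# Call the module method explicitly; Kernel#open is not involved.
text = OpenURI.open_uri("http://127.0.0.1:#{port}/") { |f| f.read }

responder.join
server.close
puts text
```

Because the call names OpenURI explicitly, it does not depend on
whether or how Kernel#open has been redefined elsewhere in the program.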

Toshiaki




From k at bioruby.org  Wed Mar 29 01:37:09 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Wed, 29 Mar 2006 10:37:09 +0900
Subject: [BioRuby] Fwd: [Open-bio-l] Announcing BOSC 2006
References: <44294B65.4050207@duke.edu>
Message-ID: <DEF295C3-929B-4CF4-ADF1-E3ED07C60E77@bioruby.org>

I'm forwarding the announcement of BOSC 2006,
as bioruby at lists.open-bio.org rejects postings from non-members.

Regards,
Toshiaki

Begin forwarded message:

> From: Darin London <darin.london at duke.edu>
> Date: 28 March 2006 23:42:45 JST
> To: Authors at lists.open-bio.org, BioBiz at lists.open-bio.org, Biocorba-announce-l at lists.open-bio.org, Biocorba-l at lists.open-bio.org, Biograph at lists.open-bio.org, bioinfo-core at lists.open-bio.org, biojava-dev at lists.open-bio.org, Biojava-l at lists.open-bio.org, bioped-l at lists.open-bio.org, Bioperl-announce-l at lists.open-bio.org, Bioperl-l at lists.open-bio.org, bioperl-microarray at lists.open-bio.org, bioperl-pipeline at lists.open-bio.org, BioPython at lists.open-bio.org, BioPython-announce at lists.open-bio.org, Biopython-dev at lists.open-bio.org, BioRuby at lists.open-bio.org, BioRuby-ja at lists.open-bio.org, Biosoap-l at lists.open-bio.org, BioSQL-l at lists.open-bio.org, BP-announce at lists.open-bio.org, DAS at lists.open-bio.org, DAS-announce at lists.open-bio.org, DAS2 at lists.open-bio.org, Dynamite at lists.open-bio.org, EMBOSS at lists.open-bio.org, emboss-announce at lists.open-bio.org, emboss-dev at lists.open-bio.org, Moby-announce at lists.open-bio.org, MOBY-dev at lists.open-bio.org, moby-l at lists.open-bio.org, obf-developers at lists.open-bio.org, Ontologies at lists.open-bio.org, Open-bio-announce at lists.open-bio.org, Open-Bio-l at lists.open-bio.org, Open-Bioinformatics-Foundation at lists.open-bio.org
> Subject: [Open-bio-l] Announcing BOSC 2006
> Reply-To: bosc at open-bio.org
>
> MEETING ANNOUNCEMENT & CALL FOR SPEAKERS
>
> The 7th annual Bioinformatics Open Source Conference (BOSC 2006) is
> organized by the not-for-profit Open Bioinformatics Foundation. The
> meeting will take place August 4-5 in Fortaleza, Brazil, and is one of
> several Special Interest Group (SIG) meetings occurring in conjunction
> with the 14th International Conference on Intelligent Systems for
> Molecular Biology. Please consult the official BOSC 2006 website at
>
> http://www.open-bio.org/wiki/BOSC_2006
>
> for details and information.
>
> In addition, a BOSC weblog has been set up to make it easier to
> disseminate all BOSC-related announcements:
>
> http://wiki.open-bio.org/boscblog/
>
> If you have an iCal-compatible calendar, there is an EventDB calendar
> set up with all BOSC-related deadlines:
>
> http://eventful.com/groups/G0-001-000014747-0
>
> More information about ISMB can be found at the Official
> ISMB 2006 Website:
>
> http://ismb2006.cbi.cnptia.embrapa.br/
>
>
> Thank You, and we look forward to seeing you all,
> The BOSC Organizing Committee.
> _______________________________________________
> Open-Bio-l mailing list
> Open-Bio-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/open-bio-l