From jan.aerts at bbsrc.ac.uk Mon Mar 6 09:21:54 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:21:54 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>

Hi all,

Given the posts about bioruby documentation in the last few months, my own experiences with bioruby and a bit of encouragement from Toshiaki, I'd like to commence documenting bioruby classes (in CVS) that are not documented yet, and to standardize the documentation format for those that already have documentation.

Documentation would take the form of rdoc, so that it would be browsable via the www.bioruby.org/rdoc website.

Some guidelines that I would like to use in the documentation:
(1) Each class should have a description and synopsis. If there is a unit test at the bottom, this can easily be tweaked into a synopsis. If such a unit test is available, 'documenting' would mean (at least in the first round) 'tweaking and copying the unit test into a comment in front of the class'. Alternatively, unit tests and documentation could be combined into one (as Ara and Pjotr discussed), but I'm not experienced enough in ruby yet to do this in a simple, transparent way.
(2) Given the effort developers have put into writing the classes, it would be nice if bioruby could reach as wide an audience as possible. What I believe would help tremendously, is a standardized format for documentation. By this I mean that the following information is given for each method (sort of like in bioperl documentation):
  * synopsis
  * description
  * function
  * what it returns
  * any arguments
(3) It should be made clear to the user if a class should be used directly, or if it just supports other classes (e.g. Bio::Sequence::Format). Additional important info would be interaction with other classes (e.g. "how does the sequence class interact with the embl class?"). Original module writers have an important role in describing this context.
(4) Encapsule the copyright information between '#--' and '#++', as it distracts the user from what he/she wants to know. (It _is_ important, but not for the average user...)

Example of class documentation (from sequence.rb):

# = DESCRIPTION
# The Bio::Sequence class generically describes a nucleic or amino acid sequence and is a superclass of
# Bio::Sequence::NA and Bio::Sequence::AA. Most methods that can be used on Bio::Sequence objects are described
# in Bio::Sequence::Common, Bio::Sequence::NA and Bio::Sequence::AA
#
# If possible, create sequence objects using the Bio::Sequence::NA or Bio::Sequence::AA classes instead, as the Bio::Sequence
# class will have to guess the type of sequence you're talking about.
#
# = SYNOPSIS
#   # Create a nucleic or amino acid sequence
#   dna = Bio::Sequence::NA.new('atgcatgcATGCATGCAAAA')
#   rna = Bio::Sequence::NA.new('augcaugcaugcaugcaaaa')
#   aa = Bio::Sequence::AA.new('ACDEFGHIKLMNPQRSTVWYU')
#
#   # Print it out
#   puts dna.to_s
#   puts aa.to_s
#
#   # Get a subsequence, bioinformatics style (first nucleotide is '1')
#   puts dna.subseq(2,6)
#
#   #...more examples from the unit test

Example of method documentation (from sequence.rb):

# Usage:
#   my_seq = Bio::Sequence.new('AGGCACGAT')
#   my_na = my_seq.na
# Function:: Converts the Bio::Sequence object into a Bio::Sequence::NA object
# Returns:: a Bio::Sequence::NA object
# Arguments:: none
def na
  @seq = NA.new(@seq)
  @moltype = NA
end

As the time I can work on this is limited, expect to see gradual additions to the cvs repository. Any other people wishing to help out are greatly welcome!!

Of course, I promise not to touch other people's code, unless they explicitly tell me to.

Any thoughts/suggestions on this?
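A minimal sketch of how the per-method checklist from guideline (2) might look when filled in for a small class. ToySequence and its methods are hypothetical stand-ins written only for this illustration (they are not BioRuby code); the one thing taken as given is rdoc's labeled-list markup (`Label::`), which the method example above also uses.

```ruby
# = DESCRIPTION
# ToySequence is a hypothetical stand-in for a sequence class. It exists
# only to show the proposed documentation layout and is not part of BioRuby.
#
# = SYNOPSIS
#   seq = ToySequence.new('atgcatgc')
#   puts seq.subseq(2, 6)
class ToySequence
  # Usage::     seq = ToySequence.new('atgc')
  # Function::  Wraps a string as a sequence
  # Returns::   a ToySequence object
  # Arguments:: str - the sequence as a String
  def initialize(str)
    @seq = str.dup
  end

  # Usage::     seq.subseq(2, 6)
  # Function::  Extracts a subsequence using one-based, inclusive positions
  # Returns::   a new ToySequence object (the receiver is not modified)
  # Arguments:: from - first position (default: 1)
  #             to   - last position (default: length of the sequence)
  def subseq(from = 1, to = @seq.length)
    # Shift by one to map one-based positions onto Ruby's zero-based indices.
    ToySequence.new(@seq[(from - 1)..(to - 1)])
  end

  # Usage::     seq.to_s
  # Function::  Gives the sequence as a plain string
  # Returns::   a String
  # Arguments:: none
  def to_s
    @seq.dup
  end
end

seq = ToySequence.new('atgcatgc')
puts seq.subseq(2, 6)   # prints "tgcat"
puts seq.subseq         # prints "atgcatgc"
```

Filled in this way, the checklist makes the defaults and the "returns a new object" behaviour explicit, at the cost of a few repetitive lines on trivial methods such as to_s.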
Kind regards,

Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+33 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is confidential and is intended for the use of the addressee only. The opinions expressed within this e-mail (including any attachments) are the opinions of the sender and do not necessarily constitute those of Roslin Institute (Edinburgh) ("the Institute") unless specifically stated by a sender who is duly authorised to do so on behalf of the Institute.

From jan.aerts at bbsrc.ac.uk Mon Mar 6 09:55:08 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 14:55:08 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Ryan,

(First of all: I think you sent this message to me alone, instead of the bioruby mailing list....)

Glad to get the documentation discussion started again...
The "as a way of thorougly understanding the use and structure of the classes" sound familiar...

What do you think of using a standardized or (sound ugly:) formal format? Does your documentation include some of the synopsis/description/function/what it returns/arguments things? Do you think it is useful/feasible to put them in that format?

Thanks,
jan.

> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> Sent: 06 March 2006 14:41
> To: jan aerts (RI)
> Subject: Re: [BioRuby] bioruby documentation
>
> Good Morning All,
>
> I've had similar toughts to Jan, and am a couple methods away
> from completely documenting Bio::Sequence::* . I was hoping
> to send that in to Toshiaki later today. I haven't yet
> written a synopsis or description for them, mainly because I
> was using the process of documenting all the methods as a way
> of thoroughly understanding the use and structure of the
> classes. If the documentation I've currently written is seen
> as reasonable and accepted, I would then add the overview
> documentation for those classes and files.
>
> Is there somewhere we can note which parts different people
> are working on documenting, so as to avoid any duplication of effort?
>
> Best!
>
> -Ryan

From rlr215 at nyu.edu Mon Mar 6 10:14:06 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 10:14:06 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>

Hello again everyone!

> What do you think of using a standardized or (sound ugly:) formal
> format? Does your documentation include some of the
> synopsis/description/function/what it returns/arguments things? Do you
> think it is useful/feasible to put them in that format?

I think a reasonable standardization is a good thing, especially at the overview level of the class or module or whatever.

Here's an example of what I've been writing for method documentation:

(This is for subseq in Bio::Sequence::Common)

# Returns a new sequence containing the subsequence identified by the start
# and end numbers given as parameters. *Important:* Biological sequence
# numbering conventions (one-based) rather than ruby's (zero-based) numbering
# conventions are used.
#
#   s = Bio::Sequence::Generic.new('atggaatga')
#   puts s.subseq(1,3) #=> "atg"
#
# Start defaults to 1 and end defaults to the entire existing string, so
# subseq called without any parameters simply returns a new sequence identical
# to the existing sequence.
#
#   puts s.subseq #=> "atggaatga"
#

So, I haven't been writing enormously formal specs - which seem like a bit of overkill for most of the methods, and rdoc takes care of the basics of argument lists. Otherwise I note what to expect in return, or if the method does or does not modify the current object. Also if there are any things that are dangerous or tricky... I also give an example for all methods.

It seems to me, and this is surely open to discussion, that formalizing the individual method descriptions too much makes them enormously tedious to write - so much so that very few will ever get written.

BUT, on the class or module level, I think a certain amount of formalization is good, so that the overviews are reasonably consistent.

Best,

-Ryan

From jan.aerts at bbsrc.ac.uk Mon Mar 6 13:21:53 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:21:53 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk> <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>

Ryan,

Nice piece of doc. I completely agree that the level of formalization is entirely open to discussion. And I completely understand your concerns. But on the other hand, a formalized list of things to be described can, in my opinion, _help_ developers document their code, rather than it would keep them from doing that. You can see it as a checklist of things to document. In your piece of code, you describe several aspects of the subseq method, but for every new method you'd describe, you'd need to have this list of things in the back of your head that you have to mention ("did I mention that it returns itself?" "did I mention what the defaults for the arguments are", ...).
If we would have this list accessible on the wiki for any developer, he/she could copy it into their code and fill it in like a checklist. I suspect that would make things much easier on the developer (but that's my own view, of course).

You're right that rdoc already takes care of argument lists, but it only lists them, instead of describing them. And in many instances, a bioruby user would have to know what the arguments actually are (including their defaults) without going into the code. Ergo: arguments should be documented.

What do you think?

jan

From jan.aerts at bbsrc.ac.uk Mon Mar 6 13:41:52 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 6 Mar 2006 18:41:52 -0000
Subject: [BioRuby] bioruby documentation
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk> <440C7A13.5070307@corevx.com>
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB45101FD6615@rie2ksrv1.ri.bbsrc.ac.uk>

Hi Trevor,

I agree that, if at all, only public methods should be documented. And, as you say, one or two lines of comment become commonplace. However, we have to keep in mind that it's the end-user that is the target for the docs. Myself having used BioPerl a lot, I found the included method-docs almost always sufficient for using them. The fact that some BioPerl developers did not adequately supply information (be it in that formal format) probably means that they would not provide documentation at all if not for that standard.

If the consensus would be _not_ to document the methods, I'll of course go for that. What do the heavy-weights think?

jan.

From rlr215 at nyu.edu Mon Mar 6 13:46:12 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Mon, 6 Mar 2006 13:46:12 -0500
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk> <3df70c206fe0aa94a66723916fe3aa83@nyu.edu> <84DA9D8AC9B05F4B889E7C70238CB45101FD6614@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <4db8cd9e5949c71fdba3ec26ffa389dd@nyu.edu>

Hi all (again!),

Putting the formalization into a more concrete perspective, compare:

an example from the bioperl docs:
http://doc.bioperl.org/releases/bioperl-1.0.1/Bio/Tools/SeqStats.html

and an example from the Ruby on Rails docs:
http://api.rubyonrails.org/classes/ActionController/Base.html

The bioperl example is very formalized, so it is true that nothing is
left out. However, it doesn't read very well and most of the method
documentation ends up being highly repetitive: (To caricature... :)

  Title   : do_something
  Usage   : Object.do_something
  Function: does something
  Returns : something
  Args    : precursor to something

Whereas (in my mind), the rails documentation reads very well, simple
methods are simply documented, complex methods are documented in detail.
If the arguments are absent or obvious, don't talk about them; if the
arguments are tricky, do talk about them. And so on. No one really
*wants* to document, and if documenting is annoying (= overly
formalized), no one will.

I think a consistent, relatively formalized overview is good, but that
overly formalized method and attribute documentation guidelines
ultimately mean that little to no documentation will get done because
it's too annoying (in most real-world open source projects).

Best,

Ryan

On Mar 6, 2006, at 1:21 PM, jan aerts (RI) wrote:

> Ryan,
>
> Nice piece of doc. I completely agree that the level of formalization
> is entirely open to discussion. And I completely understand your
> concerns.
> But on the other hand, a formalized list of things to be
> described can, in my opinion, _help_ developers document their code,
> rather than it would keep them from doing that. You can see it as a
> checklist of things to document. In your piece of code, you describe
> several aspects of the subseq method, but for every new method you'd
> describe, you'd need to have this list of things in the back of your
> head that you have to mention ("did I mention that it returns itself?"
> "did I mention what the defaults for the arguments are", ...). If we
> would have this list accessible on the wiki for any developer, he/she
> could copy it into their code and fill it in like a checklist. I
> suspect that would make things much easier on the developer (but
> that's my own view, of course).
>
> You're right that rdoc already takes care of argument lists, but it
> only lists them, instead of describing them. And in many instances, a
> bioruby user would have to know what the arguments actually are
> (including their defaults) without going into the code. Ergo:
> arguments should be documented.
>
> What do you think?
> jan
>
>
> -----Original Message-----
> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
> Sent: Mon 3/6/2006 3:14 PM
> To: jan aerts (RI)
> Cc: bioruby at open-bio.org
> Subject: Re: [BioRuby] bioruby documentation
>
>
> Hello again everyone!
>
>>
>> What do you think of using a standardized or (sound ugly:) formal
>> format? Does your documentation include some of the
>> synopsis/description/function/what it returns/arguments things? Do you
>> think it is useful/feasible to put them in that format?
>
> I think a reasonable standardization is a good thing, especially at the
> overview level of the class or module or whatever.
> Here's an example
> of what I've been writing for method documentation:
>
> (This is for subseq in Bio::Sequence::Common)
>
>     # Returns a new sequence containing the subsequence identified by the
>     # start and end numbers given as parameters. *Important:* Biological
>     # sequence numbering conventions (one-based) rather than ruby's
>     # (zero-based) numbering conventions are used.
>     #
>     #   s = Bio::Sequence::Generic.new('atggaatga')
>     #   puts s.subseq(1,3)                   #=> "atg"
>     #
>     # Start defaults to 1 and end defaults to the entire existing string,
>     # so subseq called without any parameters simply returns a new
>     # sequence identical to the existing sequence.
>     #
>     #   puts s.subseq                        #=> "atggaatga"
>     #
>
> So, I haven't been writing enormously formal specs - which seem like a
> bit of overkill for most of the methods, and rdoc takes care of the
> basics of argument lists. Otherwise I note what to expect in return,
> or if the method does or does not modify the current object. Also if
> there are any things that are dangerous or tricky... I also give an
> example for all methods.
>
> It seems to me, and this is surely open to discussion, that formalizing
> the individual method descriptions too much makes them enormously
> tedious to write - so much so that very few will ever get written.
> BUT, on the class or module level, I think a certain amount of
> formalization is good, so that the overviews are reasonably consistent.
>
> Best,
>
> -Ryan
>
>>
>> Thanks,
>> jan.
>>
>>> -----Original Message-----
>>> From: Ryan Raaum [mailto:rlr215 at nyu.edu]
>>> Sent: 06 March 2006 14:41
>>> To: jan aerts (RI)
>>> Subject: Re: [BioRuby] bioruby documentation
>>>
>>> Good Morning All,
>>>
>>> I've had similar thoughts to Jan, and am a couple methods away
>>> from completely documenting Bio::Sequence::* . I was hoping
>>> to send that in to Toshiaki later today.
>>> I haven't yet
>>> written a synopsis or description for them, mainly because I
>>> was using the process of documenting all the methods as a way
>>> of thoroughly understanding the use and structure of the
>>> classes. If the documentation I've currently written is seen
>>> as reasonable and accepted, I would then add the overview
>>> documentation for those classes and files.
>>>
>>> Is there somewhere we can note which parts different people
>>> are working on documenting, so as to avoid any duplication of effort?
>>>
>>> Best!
>>>
>>> -Ryan

From trevor at corevx.com Mon Mar 6 13:06:11 2006
From: trevor at corevx.com (Trevor Wennblom)
Date: Mon, 06 Mar 2006 12:06:11 -0600
Subject: [BioRuby] bioruby documentation
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <440C7A13.5070307@corevx.com>

jan aerts (RI) wrote:
> (2) Given the effort developers have put into writing the classes, it
> would be nice if bioruby could reach as wide an audience as possible.
> What I believe would help tremendously, is a standardized format for
> documentation. By this I mean that the following information is given
> for each method (sort of like in bioperl documentation):
> * synopsis
> * description
> * function
> * what it returns
> * any arguments
>

Hi Jan,

Thanks for taking the initiative on this important subject! Coming up
with a standard for documenting the major classes and modules would be a
great idea; I've tried my best on the components that I've written so
far.

I'm going to agree with Ryan that documenting every method is likely
overkill. One of the beauties of Ruby is that one- or two-line methods
become commonplace. Often, to read BioPerl code (where they do generally
have every method formally documented), I strip out the comments, since
they dominate the code to such a degree as to be distracting, and the
comments are often just there to meet spec but not to provide useful
information. If we were to require documentation of methods, I would say
that it should only be required for public methods.

> (4) Encapsule the copyright information between '#--' and '#++', as it
> distracts the user from what he/she wants to know. (It _is_ important,
> but not for the average user...)
>

We're switching to the Ruby license, correct? Do we even need anything
beyond "License:: Ruby"?

Thanks again,
Trevor

From ktym at hgc.jp Tue Mar 7 00:57:44 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 14:57:44 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <440C7A13.5070307@corevx.com>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABCF@rie2ksrv1.ri.bbsrc.ac.uk> <440C7A13.5070307@corevx.com>
Message-ID:

Hi,

Thanks for all the discussion.

* format of RDoc and level of detail - still needs discussion

For readability in a terminal, please fold docs within 79 columns
(example code in docs may break this principle).

Please use the "#" prefixed style and don't use =begin rdoc/=end pairs,
as they make it impossible to read the code without coloring.

Basically agreed to have a standardized format as Jan suggested.
It will make clear what should be documented at least, especially for
the non-native developers.

Also agreed with Ryan's comparison
 - a standardized format can be repetitive
 - simple methods are simply documented, complex methods are documented
   in detail

It looks ideal to have an adequate dose of documentation, and that will
also require some writing skill. I would be really happy if some of you
could take the lead in filling BioRuby with a nice level of
documentation.

* license - please change to Ruby's

The core Japanese developers have agreed to change the license from LGPL
to Ruby's, so that everyone who uses Ruby can use BioRuby (re-writing of
the header is not yet completed in some modules, though).

We need to ask other contributors to follow this change - to ask their
permission so that we can change the license whenever the BioRuby staff
needs to.

* where and how to include data (enzyme.yaml for REBASE)

This is under discussion with Trevor, but I think all discussions should
be done on this list to have an audience (to tell the truth,
reading/writing English mails takes time for me, so I want to share them
without posting a summary in addition:).

Toshiaki

From jan.aerts at bbsrc.ac.uk Tue Mar 7 06:01:26 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 7 Mar 2006 11:01:26 -0000
Subject: [BioRuby] bioruby documentation
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DABD4@rie2ksrv1.ri.bbsrc.ac.uk>

Good morning again.

If I understand correctly, the general feeling is that the class-level
docs should have a standardized part (a description, an example) and
that method-level docs should not be more elaborate than necessary.

How about (given Toshiaki's comment of "[making] clear what should be
documented at least, especially for the non-native developers") for
_non-trivial_ methods giving a description, an example, and the type of
thing it returns? (These could or could not be nicely put under separate
headers; con: can look bloated, pro: speeds up browsing if you want to
know what a method returns). I think Bio::Sequence::Common#window_search
is a nice example: it tells you what it's meant to do, gives an example,
and says what it returns.

So what do you think of the following:
* standardized parts for class-level docs: description, example, and if
  necessary: relationship to other classes
* for complex methods: use Bio::Sequence::Common#window_search as an
  example (with or without little title thingies)
* for simple methods: use simple methods of the Rails
  ActionController::Base as an example: just a one-line description

As rdoc takes care of listing the arguments to a method: is there a way
to let it show automatically if an argument is mandatory or not?

jan.
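
[Editorial sketch] The three tiers proposed above, combined with the
formatting points raised earlier in the thread ("#" prefixed comments,
license text between '#--' and '#++', one-based subseq), might look
roughly like the following. This is only an illustration under stated
assumptions: ToySequence is a made-up class, not part of BioRuby, and its
method bodies are placeholders rather than the library's implementation.

```ruby
#--
# Copyright and license text would go here. rdoc stops reading a comment
# at "#--" and resumes at "#++", so this part is left out of the docs.
#++
#
# = DESCRIPTION
# ToySequence is a hypothetical stand-in for a sequence class. It exists
# only to show the documentation layout and is not BioRuby code.
#
# = EXAMPLE
#   seq = ToySequence.new('atggaatga')
#   puts seq.subseq(1, 3)
class ToySequence
  # Creates a sequence from the string _str_.
  def initialize(str)
    @seq = str
  end

  # Simple method, simple doc: returns the sequence as a plain String.
  def to_s
    @seq
  end

  # Complex method, fuller doc: returns a new ToySequence holding the
  # subsequence from position _s_ to position _e_, both inclusive.
  # *Important:* positions are one-based, not zero-based.
  #
  #   ToySequence.new('atggaatga').subseq(1, 3).to_s
  # ---
  # *Arguments*:
  # * (optional) _s_: Integer start (default 1)
  # * (optional) _e_: Integer end (default sequence length)
  # *Returns*:: new ToySequence, or nil for an invalid range
  def subseq(s = 1, e = @seq.length)
    return nil if s < 1 || e < s
    # Convert the one-based, inclusive range to Ruby's zero-based slice.
    ToySequence.new(@seq[s - 1, e - s + 1].to_s)
  end
end
```

On the closing question: as far as I know, rdoc copies the parameter list
from the source, so a default such as "s = 1" shows up in the generated
signature and marks the argument as optional. What each argument means
still has to be written by hand.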

From ktym at hgc.jp Tue Mar 7 08:38:42 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 7 Mar 2006 22:38:42 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk> <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
Message-ID: <895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>

Resend: as this message seems not to have been delivered according to
http://open-bio.org/pipermail/bioruby/2006-March/date.html
-k

Ryan,

Thank you for your very nice doc.

On 2006/03/07, at 0:14, Ryan Raaum wrote:
>     # s = Bio::Sequence::Generic.new('atggaatga')

If you will utilize this documentation, please change
this example to use Bio::Sequence::NA.

Bio::Sequence::Generic is just for developers - to hold
gaps, spaces etc. intact - mainly for multiple alignment.

In that sense, the following mail you sent me personally
should be fixed as:

Begin forwarded message:

> From: Ryan Raaum
> Date: 2006年3月7日 0:47:32:JST
> To: Toshiaki Katayama
> Cc: jan aerts (RI)
> Subject: BioRuby Sequence Documentation Patch
>
> Hello,
>
> I have begun work on some documentation (as you may have seen from the
> messages on the mailing list just this morning). Here is what I've
> done. All methods and attributes in the Bio::Sequence hierarchy should
> be documented. A summary for each file is yet to be written, but if
> this documentation is acceptable, I will write those after these are
> applied. I made two small code changes as well:
>
> 1. Added a whitespace stripping initialize method to
> Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and
> Bio::Sequence::NA in that respect.

The Bio::Sequence::Generic is not intended to strip.

In addition to this, why did you make the Bio::Sequence#guess method
:nodoc:? Sequence type guessing is not perfect, so there needs to be an
interface to change the threshold etc.

Most of the other parts seem to be acceptable - you really understood
the ideas behind them!

On March 1, Jan also sent me a documented version of sequence.rb.
I'm sorry; I should have posted it on the list as soon as possible.

Anyway, could you contact him to merge your documentations?
If both of you agree, I'll commit your patch.

> 2. Modified the instance randomize method to start at length 0 IF a
> composition hash is given. Otherwise, if there was an actual sequence
> AND a hash was given, odd things would happen. (Of course, it was never
> meant to be called that way, but... as it CAN be called that way, I
> thought the behavior should be consistent.)

Thanks!

> I am happy to make modifications to this documentation,
>
> Best wishes,
>
> Ryan Raaum
>
-------------- next part --------------

From ktym at hgc.jp Tue Mar 7 12:34:41 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Wed, 8 Mar 2006 02:34:41 +0900
Subject: [BioRuby] bioruby documentation
In-Reply-To: <5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk> <3df70c206fe0aa94a66723916fe3aa83@nyu.edu> <895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp> <5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
Message-ID:

Ryan,

Thank you for your quick fix.
I quote your mail as it was not posted to the list.

> Also, in the last round of editing, I made a small change to the
> Bio::Sequence#guess function.

Thanks. This bug was introduced when I added the length and index
arguments...

Toshiaki

On 2006/03/08, at 0:56, Ryan Raaum wrote:

> Hi All,
>
>
>>
>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>     # s = Bio::Sequence::Generic.new('atggaatga')
>>
>> If you will utilize this documentation, please change
>> this example to use Bio::Sequence::NA.
>>
>
> Done.
(For this and all other examples using Bio::Sequence::Generic) > >> Bio::Sequence::Generic is just for developers - to hold >> gaps, spaces etc. intact - mainly for multiple alignment. >> > > Made Bio::Sequence::Generic :nodoc: > >>> >>> 1. Added a whitespace stripping initialize method to Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA and Bio::Sequence::NA in that respect. >> >> >> The Bio::Sequence::Generic is not intended to strip. > > Removed the added method. > >> >> In addition to this, why you made Bio::Sequence#guess method as :nodoc: ? >> Seuence type guessing is not perfect so there need to be an interface >> to change threshold etc. > > Documented the guess methods. > >> On March 1, Jan also sent me a documented version of sequence.rb. >> I'm sorry that I should post it on the list as soon as possible. >> >> Anyway, could you contact him to merge your documentations? >> If both of you are agreed, I'll commit your patch. > > If Jan will send me his sequence.rb, I can merge it with mine and send the merged file back to Jan. After he's edited the merge to his liking, we can put it all together and send it in as a unified patch. > > > Also, in the last round of editing, I made a small change to the Bio::Sequence#guess function. In the line where the "total" is calculated, the original version used the length of the @seq as the starting length, but for the length and index parameters to work properly with the threshold value, the length of the guess string (`str` is the local method variable) is what should be the base length. 
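The change to the "total" described above can be illustrated in plain Ruby. This is only a minimal stand-alone sketch, not the BioRuby source: the method name `guess_type`, the return symbols `:na` and `:aa`, and the default values for threshold, length and index are all assumptions made for the example.

```ruby
# Hypothetical stand-alone sketch of threshold-based sequence type guessing.
# Not the BioRuby implementation; names and defaults are assumptions.
def guess_type(seq, threshold = 0.9, length = 10_000, index = 0)
  # Only the inspected window is considered.
  str = seq[index, length].to_s

  # Characters that count as nucleotides (U/u included so RNA is recognized).
  bases = str.count('ATGCUatgcu')

  # The base length is the length of the inspected window (str), not the
  # length of the whole sequence; ambiguous N/n are left out of the total.
  total = str.length - str.count('Nn')
  return :aa if total.zero?

  bases.to_f / total > threshold ? :na : :aa
end

# A short nucleotide prefix followed by a long protein-like tail.
mixed = 'atgc' * 5 + 'MKVLW' * 20

p guess_type(mixed, 0.9, 20, 0)  # window is the first 20 characters
p guess_type(mixed)              # window is the whole string
```

If the whole sequence length were used as the total while only a 20-character window was counted, the ratio for the first call would be 20/120 and the window could never pass a 0.9 threshold, which seems to be the inconsistency the paragraph above describes.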
>
> Best,
>
> -Ryan
>

From rlr215 at nyu.edu Wed Mar 8 10:04:18 2006
From: rlr215 at nyu.edu (Ryan Raaum)
Date: Wed, 8 Mar 2006 10:04:18 -0500
Subject: [BioRuby] Sequence Documentation Patch
In-Reply-To:
References: <84DA9D8AC9B05F4B889E7C70238CB451030DABD0@rie2ksrv1.ri.bbsrc.ac.uk>
 <3df70c206fe0aa94a66723916fe3aa83@nyu.edu>
 <895B06A1-AC96-4BD2-8A79-7115433EB355@hgc.jp>
 <5dc516be17aa37f0c0ff5651eb41a3d5@nyu.edu>
Message-ID:

Good Morning,

Jan and I were able to reconcile our respective documentation attempts into a single documentation patch. Here's an example of the final format:

(documentation for the Bio::Sequence::Common#subseq method)

  # Returns a new sequence containing the subsequence identified by the
  # start and end numbers given as parameters. *Important:* Biological
  # sequence numbering conventions (one-based) rather than ruby's
  # (zero-based) numbering conventions are used.
  #
  #   s = Bio::Sequence::NA.new('atggaatga')
  #   puts s.subseq(1,3)                      #=> "atg"
  #
  # Start defaults to 1 and end defaults to the entire existing string, so
  # subseq called without any parameters simply returns a new sequence
  # identical to the existing sequence.
  #
  #   puts s.subseq                           #=> "atggaatga"
  # ---
  # *Arguments*:
  # * (optional) _s_(start): Integer (default 1)
  # * (optional) _e_(end): Integer (default current sequence length)
  # *Returns*:: new Bio::Sequence::NA/AA object

Hopefully this will be useful for new users.

Those changes from the first version of this patch that Toshiaki noted as being wrong or against the API were removed.

I also made two more bug fixes (in addition to those already described).

1. Added 'U' and 'u' to the bases counted towards the nucleic acid total in Bio::Sequence#guess. (Without this, RNA sequences were "guessed" to be Amino Acid sequences).

2. Changed the arguments for method_missing in Bio::Sequence from (*arg) to (sym, *args, &block). With this argument set, blocks will be properly passed through to the encapsulated object.

Cheers!
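The effect of the second fix can be seen with a small plain-Ruby sketch. The two wrapper classes below are hypothetical and exist only for this illustration; they are not the Bio::Sequence code, and they wrap an ordinary String rather than a sequence object.

```ruby
# Hypothetical wrapper classes for illustration only; not BioRuby code.

# Delegates with (*arg): positional arguments are forwarded, but a block
# given by the caller never reaches the wrapped object.
class LossyWrapper
  def initialize(str)
    @seq = str
  end

  def method_missing(*arg)
    @seq.send(*arg)
  end
end

# Delegates with (sym, *args, &block): the block is captured and passed on.
class ForwardingWrapper
  def initialize(str)
    @seq = str
  end

  def method_missing(sym, *args, &block)
    return super unless @seq.respond_to?(sym)
    @seq.send(sym, *args, &block)
  end

  def respond_to_missing?(sym, include_private = false)
    @seq.respond_to?(sym, include_private) || super
  end
end

lossy = []
LossyWrapper.new('atg').each_char { |c| lossy << c }

forwarded = []
ForwardingWrapper.new('atg').each_char { |c| forwarded << c }

p lossy      # the block was silently dropped
p forwarded  # the block ran once per character
```

In the first class the call still succeeds, which is why this kind of bug is easy to miss: the wrapped method simply runs as if no block had been given.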
-Ryan

-------------- next part --------------

On Mar 7, 2006, at 12:34 PM, Toshiaki Katayama wrote:

> Ryan,
>
> Thank you for your quick fix.
> I quote your mail as it was not posted to the list.
>
>> Also, in the last round of editing, I made a small change to the
>> Bio::Sequence#guess function.
>
> Thanks. This bug was introduced when I added length and index
> arguments...
>
> Toshiaki
>
>
> On 2006/03/08, at 0:56, Ryan Raaum wrote:
>
>> Hi All,
>>
>>
>>>
>>> On 2006/03/07, at 0:14, Ryan Raaum wrote:
>>>> # s = Bio::Sequence::Generic.new('atggaatga')
>>>
>>> If you will utilize this documentation, please change
>>> this example to use Bio::Sequence::NA.
>>>
>>
>> Done. (For this and all other examples using Bio::Sequence::Generic)
>>
>>> Bio::Sequence::Generic is just for developers - to hold
>>> gaps, spaces etc. intact - mainly for multiple alignment.
>>>
>>
>> Made Bio::Sequence::Generic :nodoc:
>>
>>>>
>>>> 1. Added a whitespace stripping initialize method to
>>>> Bio::Sequence::Generic to make it consistent with Bio::Sequence::AA
>>>> and Bio::Sequence::NA in that respect.
>>>
>>>
>>> The Bio::Sequence::Generic is not intended to strip.
>>
>> Removed the added method.
>>
>>>
>>> In addition to this, why did you mark the Bio::Sequence#guess method as
>>> :nodoc:?
>>> Sequence type guessing is not perfect, so there needs to be an interface
>>> to change the threshold etc.
>>
>> Documented the guess methods.
>>
>>> On March 1, Jan also sent me a documented version of sequence.rb.
>>> I'm sorry, I should have posted it to the list as soon as possible.
>>>
>>> Anyway, could you contact him to merge your documentation?
>>> If both of you agree, I'll commit your patch.
>>
>> If Jan will send me his sequence.rb, I can merge it with mine and
>> send the merged file back to Jan. After he's edited the merge to his
>> liking, we can put it all together and send it in as a unified patch.
>>
>>
>> Also, in the last round of editing, I made a small change to the
>> Bio::Sequence#guess function. In the line where the "total" is
>> calculated, the original version used the length of the @seq as the
>> starting length, but for the length and index parameters to work
>> properly with the threshold value, the length of the guess string
>> (`str` is the local method variable) is what should be the base
>> length.
>>
>> Best,
>>
>> -Ryan
>>

From jan.aerts at bbsrc.ac.uk Tue Mar 21 07:38:30 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 21 Mar 2006 12:38:30 -0000
Subject: [BioRuby] fastacmd.rb: iteration
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>

Hi,

Could someone please have a look at the each_entry method of
io/fastacmd.rb (in cvs)? The code below gives the sequences of
'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
output. Any ideas?

  fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
  seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
  seqs.each do |seq|
    puts seq      # works fine
  end

  fastacmd.each_entry do |fasta|
    puts 'hi'     # it never seems to get here...
  end

Thanks,
Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+44 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only. The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute.

From ngoto at gen-info.osaka-u.ac.jp Wed Mar 22 05:29:27 2006
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Wed, 22 Mar 2006 19:29:27 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
Message-ID: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>

Hi jan,

I found a bug in Bio::FlatFile. Because io/fastacmd.rb
internally uses FlatFile, the bug may be related to the problem.

The bug is that IO#pos raises an error when the IO object isn't
a regular file (e.g. a pipe), but FlatFile always tried to get pos.
It is fixed in CVS now.

On Tue, 21 Mar 2006 12:38:30 -0000
"jan aerts (RI)" wrote:

> Hi,
>
> Could someone please have a look at the each_entry method of
> io/fastacmd.rb (in cvs)? The code below gives the sequences of
> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
> output. Any ideas?
>
>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>   seqs.each do |seq|
>     puts seq      # works fine
>   end
>
>   fastacmd.each_entry do |fasta|
>     puts 'hi'     # it never seems to get here...
>   end
>
> Thanks,
> Jan Aerts, PhD
> Bioinformatics Group
> Roslin Institute
> Roslin, Scotland, UK
> +44 131 527 4200

--
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan

From k at bioruby.org Sat Mar 25 03:01:05 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sat, 25 Mar 2006 17:01:05 +0900
Subject: [BioRuby] Important news for bioruby developers
In-Reply-To:
References:
Message-ID: <0EA907FB-27BB-4A78-B173-D7F4F0AB4A85@bioruby.org>

Hi Chris,

Thank you for taking care of the server migration.

On 2006/03/22, at 1:32, Chris Dagdigian wrote:

> Hello,
>
> Sorry for the interruption but I've got some important site and server news. People will also see multiple copies of this note as I slowly transition sites over.
>
> We are in the midst of moving all of our websites, mailing lists, developers and sourcecode repositories onto more modern hardware located in a 2nd Boston area datacenter facility.
>
> This may not be a big deal for bioruby since your website, wiki and news site are not hosted by Open Bio. Keep reading though as there are some questions/favors I need to ask of the Ruby developers down below ...
>
> The transition is important for a couple of reasons - the most urgent being that we are going to lose internet connectivity in our current hosting facility on March 27th 2006. That datacenter belongs to Wyeth Research in Cambridge, Massachusetts. Wyeth Research & Genetics Institute have been long time significant supporters & hosting providers for OBF servers and projects -- we owe them a great deal of gratitude and public acknowledgment for hosting our servers over many years. Speaking as a hardware geek I can tell you that the many years of high-bandwidth, trouble free hosting have been invaluable for our efforts and projects. Sadly, it is no longer possible for them to host our servers as they need to begin making some network and WAN circuit changes that will no longer support direct internet facing servers (such as ours) in Cambridge.
>
> The other major reason for the transition is our need to relocate onto hardware that can better be remotely managed (as our volunteer administrators are scattered all over the globe).
>
> My employer, BioTeam Inc. has donated new server hardware and is also providing the hosting facilities in a Tier 1 Boston area colocation facility. Infrastructure geeks can see pictures of the colocation cage and the new OBF servers online at this URL:
> http://bioteam.net/gallery/bioteamBDC -- those servers also host EMBOSS FTP/CVS and mailing lists.
>
> Current status of the migration:
>
> - All 57 mailing lists have been moved over to the new hardware (you may have noticed "lists.open-bio.org" showing up in your list messages)
>
> - The new anonymous sourcecode server is running at http://code.open-bio.org. "cvs.biodas.org" is already pointing at it.
>
> - Developers with CVS accounts have *NOT* been migrated yet
>
> Basically we are trying to relocate everything but the developers over the next few days so we can spend the weekend on the developer and CVS transition.
>
>
> Attention BioRuby Developers
> -----------------------------------------
>
> I need assistance with the following:
>
> (1) Please confirm to me or support at open-bio.org that you have NO websites running on Open Bio servers. It appears you host your own wiki/news/web sites

Yes, we have NO websites on Open Bio servers for now.

> (2) Please change your website front page to reflect the new URLs for your mailing lists:
>
> http://lists.open-bio.org/mailman/listinfo/bioruby
> http://lists.open-bio.org/mailman/listinfo/bioruby-cvs
> http://lists.open-bio.org/mailman/listinfo/bioruby-ja

Done.

> (3) Please CNAME alias or web forward "cvs.bioruby.org" to code.open-bio.org to use t

Does this mean "cvs.open-bio.org" is no longer available or not synced with "code.open-bio.org"?

Currently, we forward "http://cvs.bioruby.org" to
"http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/?cvsroot=bioruby"

> (4) However you do your mail forwarding, please make sure that mail for bioruby.org mailing lists gets redirected to "lists.open-bio.org"

Done.

> For people with CVS commit/write access
> ---------------------------------------------------------
> Also note that when we finally do transition over to the new developer machine (where the real sourcecode lives), ALL developers will need to email support at open-bio.org to request a password reset. Although we can transition usernames, settings and home directories over from the old to the new machine we can not transition over existing passwords as they are stored in incompatible hashed formats. All developers are going to need new passwords for the new developer machine. We will likely make the developer machine swap this weekend.
>
>
> Reporting Problems / Help & Assistance
> ------------------------------------------------------
> The transition will be complicated, we need your help to spot problems and glitches! The OBF has a new helpdesk ticketing system set up at "support at open-bio.org" so that all OBF admins can read and respond to issues and problems. Most troubles should be reported to that address. For urgent problems, especially during this transition period, feel free to contact me directly (dag at sonsorol.org) (ichat/aol/aim screen name: bioteamdag).
>
>
> Regards,
> Chris Dagdigian
> open-bio.org
>

From k at bioruby.org Sat Mar 25 19:46:18 2006
From: k at bioruby.org (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 09:46:18 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
References: <84DA9D8AC9B05F4B889E7C70238CB451030DAC2C@rie2ksrv1.ri.bbsrc.ac.uk>
 <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
Message-ID:

Goto-san,

# post testing for new open-bio.org server :)

I'd suggest releasing 1.0.1 (or creating a stable branch?)
What do you think?

This bug causes Bio::FlatFile with ARGF to fail at the last iteration
and it may be a fairly serious problem for many users.

By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
Bio::FlatFile.new doesn't. Is there any reason to disallow the feature?

Toshiaki

--------------------------------------------------
% cat test_ff.rb
require 'bio'

ff = Bio::FlatFile.new(Bio::FastaFormat, ARGF)
ff.each do |e|
  p e.definition
end

% cat test.fa
>b0002
atgcgagtgtt
>b0003
atggttaaagt
>b0004
atgaaactcta

% ruby test_ff.rb test.fa
"b0002"
"b0003"
"b0004"
/usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos': no stream to tell (ArgumentError)
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:118:in `pos'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:342:in `get_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:573:in `next_entry'
        from /usr/local/lib/ruby/site_ruby/1.8/bio/io/flatfile.rb:609:in `each'
        from test_ff.rb:4
--------------------------------------------------

On 2006/03/22, at 19:29, GOTO Naohisa wrote:

> Hi jan,
>
> I found a bug in Bio::FlatFile. Because io/fastacmd.rb
> internally uses FlatFile, the bug may be related to the problem.
>
> The bug is that IO#pos raises an error when the IO object isn't
> a regular file (e.g. a pipe), but FlatFile always tried to get pos.
> It is fixed in CVS now.
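
The failure mode described in the quoted explanation can be reproduced without BioRuby. The following is a minimal sketch, assuming a Unix-like system where asking a pipe for its position fails with a system call error; `safe_pos` is a hypothetical helper written for this example, and the fix actually committed to CVS may look different.

```ruby
require 'tempfile'

# Hypothetical helper, not part of Bio::FlatFile: returns the current
# position, or nil when the stream cannot report one. ArgumentError is
# included because the ARGF traceback above shows that error class.
def safe_pos(io)
  io.pos
rescue SystemCallError, ArgumentError
  nil
end

# A pipe is not seekable, so its position is unavailable,
# but it can still be read entry by entry.
reader, writer = IO.pipe
writer.puts '>b0002', 'atgcgagtgtt'
writer.close
pipe_pos   = safe_pos(reader)
first_line = reader.gets
reader.close

# A regular file reports its position normally.
file_pos = Tempfile.create('seq') do |f|
  f.write(">b0002\n")
  f.flush
  safe_pos(f)
end

p pipe_pos
p first_line
p file_pos
```

A parser that only needs the position for bookkeeping can treat nil as "unknown" and keep reading, which is one way to stay usable on pipes and on ARGF.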
>
> On Tue, 21 Mar 2006 12:38:30 -0000
> "jan aerts (RI)" wrote:
>
>> Hi,
>>
>> Could someone please have a look at the each_entry method of
>> io/fastacmd.rb (in cvs)? The code below gives the sequences of
>> 'id_of_entry1' and 'id_of_entry2', but the each_entry method gives no
>> output. Any ideas?
>>
>>   fastacmd = Bio::Blast::Fastacmd.new("/path_to_my_db/db_name")
>>   seqs = fastacmd.fetch(['id_of_entry1','id_of_entry2'])
>>   seqs.each do |seq|
>>     puts seq      # works fine
>>   end
>>
>>   fastacmd.each_entry do |fasta|
>>     puts 'hi'     # it never seems to get here...
>>   end
>>
>> Thanks,
>> Jan Aerts, PhD
>> Bioinformatics Group
>> Roslin Institute
>> Roslin, Scotland, UK
>> +44 131 527 4200
>
> --
> Naohisa GOTO
> ngoto at gen-info.osaka-u.ac.jp
> Department of Genome Informatics, Genome Information Research Center,
> Research Institute for Microbial Diseases, Osaka University, Japan

From ktym at hgc.jp Sat Mar 25 20:28:14 2006
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sun, 26 Mar 2006 10:28:14 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio command.rb, 1.3, 1.4)
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
Message-ID: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>

Goto-san,

> +     # Same as OpenURI.open_uri(*arg).
> +     # If open-uri.rb is already loaded, ::OpenURI is used.
> +     # Otherwise, internal OpenURI in sandbox is used because
> +     # open-uri.rb redefines Kernel.open.

Your code seems to contain a lot of hacks: finding open-uri.rb in Ruby's load path,
searching for a particular method in it, etc...

I don't understand what the complicated part of your Sandbox module actually does
(or intends), but if your purpose is just to avoid the redefinition of Kernel.open,
wouldn't something like

> require 'open-uri'
>
> module Kernel
>   private
>   alias open_uri open
>   alias open open_uri_original_open
> end

be enough?

Regards,
Toshiaki Katayama

-----test_open_uri.rb
#!/usr/bin/env ruby

require 'open-uri'

module Kernel
  private
  alias open_uri open
  alias open open_uri_original_open
end

url = "http://bioruby.org"

p "########## open_uri"
open_uri(url) do |f|
  puts f.read
end

p "########## open"
open(url) do |f|
  puts f.read
end

Begin forwarded message:

> From: Naohisa Goto
> Date: March 20, 2006, 19:34:59 JST
> To: bioruby-cvs at portal.open-bio.org
> Subject: [BioRuby-cvs] bioruby/lib/bio command.rb,1.3,1.4
>
> Update of /home/repository/bioruby/bioruby/lib/bio
> In directory pub.open-bio.org:/tmp/cvs-serv30042/lib/bio
>
> Modified Files:
>     command.rb
> Log Message:
> * New module Bio::Command::NetTools for miscellaneous network methods.
>   Currently, this module is intended to be used only inside
>   BioRuby library. Please do not use it in user's programs now.
> * New methods: Bio::Command::NetTools.open_uri(uri, *arg) and
>   Bio::Command::NetTools.read_uri(uri).
> * Changed license to Ruby's.
>
>
> Index: command.rb
> ===================================================================
> RCS file: /home/repository/bioruby/bioruby/lib/bio/command.rb,v
> retrieving revision 1.3
> retrieving revision 1.4
> diff -C2 -d -r1.3 -r1.4
> *** command.rb 4 Nov 2005 17:36:00 -0000 1.3
> --- command.rb 20 Mar 2006 10:34:57 -0000 1.4
> ***************
> *** 2,32 ****
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright:: Copyright (C) 2003-2005
>   #             Naohisa Goto ,
>   #             Toshiaki Katayama
> ! # License:: LGPL
>   #
>   # $Id$
>   #
> - #--
> - #
> - # This library is free software; you can redistribute it and/or
> - # modify it under the terms of the GNU Lesser General Public
> - # License as published by the Free Software Foundation; either
> - # version 2 of the License, or (at your option) any later version.
> - #
> - # This library is distributed in the hope that it will be useful,
> - # but WITHOUT ANY WARRANTY; without even the implied warranty of
> - # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - # Lesser General Public License for more details.
> - #
> - # You should have received a copy of the GNU Lesser General Public
> - # License along with this library; if not, write to the Free Software
> - # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> - #
> - #++
> - #
>
>   require 'open3'
>
>   module Bio
> --- 2,15 ----
>   # = bio/command.rb - general methods for external command execution
>   #
> ! # Copyright:: Copyright (C) 2003-2006
>   #             Naohisa Goto ,
>   #             Toshiaki Katayama
> ! # License:: Ruby's
>   #
>   # $Id$
>   #
>
>   require 'open3'
> + require 'uri'
>
>   module Bio
> ***************
> *** 162,165 ****
> --- 145,291 ----
>
>   end # module Tools
> +
> +
> +   # = Bio::Command::NetTools
> +   #
> +   # Bio::Command::NetTools is a collection of miscellaneous methods
> +   # for data transport through network.
> +   #
> +   # Library internal use only. Users should not directly use it.
> +   #
> +   # Note that it is under construction.
> +   module NetTools
> +
> +     # Same as OpenURI.open_uri(*arg).
> +     # If open-uri.rb is already loaded, ::OpenURI is used.
> +     # Otherwise, internal OpenURI in sandbox is used because
> +     # open-uri.rb redefines Kernel.open.
> +     def self.open_uri(uri, *arg)
> +       if defined? ::OpenURI
> +         ::OpenURI.open_uri(uri, *arg)
> +       else
> +         SandBox.load_openuri_in_sandbox
> +         uri = uri.to_s if ::URI::Generic === uri
> +         SandBox::OpenURI.open_uri(uri, *arg)
> +       end
> +     end
> +
> +     # Same as OpenURI.open_uri(uri).read.
> +     # If open-uri.rb is already loaded, ::OpenURI is used.
> +     # Otherwise, internal OpenURI in sandbox is used becase
> +     # open-uri.rb redefines Kernel.open.
> +     def self.read_uri(uri)
> +       self.open_uri(uri).read
> +     end
> +
> +     # Sandbox to load open-uri.rb.
> +     # Internal use only.
> +     module SandBox #:nodoc:
> +
> +       # Dummy module definition.
> +       module Kernel #:nodoc:
> +         # dummy method
> +         def open(*arg); end #:nodoc:
> +       end #module Kernel
> +
> +       # a method to find proxy. dummy definition
> +       module FindProxy; end #:nodoc:
> +
> +       # dummy module definition
> +       module OpenURI #:nodoc:
> +         module OpenRead; end #:nodoc:
> +       end #module OpenURI
> +
> +       # Dummy module definition.
> +       module URI #:nodoc:
> +         class Generic < ::URI::Generic #:nodoc:
> +           include SandBox::FindProxy
> +         end
> +
> +         class HTTPS < ::URI::HTTPS #:nodoc:
> +           include SandBox::FindProxy
> +           include SandBox::OpenURI::OpenRead
> +         end
> +
> +         class HTTP < ::URI::HTTP #:nodoc:
> +           include SandBox::FindProxy
> +           include SandBox::OpenURI::OpenRead
> +         end
> +
> +         class FTP < ::URI::FTP #:nodoc:
> +           include SandBox::FindProxy
> +           include SandBox::OpenURI::OpenRead
> +         end
> +
> +         # parse and new. internal use only.
> +         def self.__parse_and_new__(klass, uri) #:nodoc:
> +           scheme, userinfo, host, port,
> +           registry, path, opaque, query, fragment = ::URI.split(uri)
> +           klass.new(scheme, userinfo, host, port,
> +                     registry, path, opaque, query,
> +                     fragment)
> +         end
> +         private_class_method :__parse_and_new__
> +
> +         # same as ::URI.parse. internal use only.
> +         def self.parse(uri) #:nodoc:
> +           r = ::URI.parse(uri)
> +           case r
> +           when ::URI::HTTPS
> +             __parse_and_new__(HTTPS, uri)
> +           when ::URI::HTTP
> +             __parse_and_new__(HTTP, uri)
> +           when ::URI::FTP
> +             __parse_and_new__(FTP, uri)
> +           else
> +             r
> +           end
> +         end
> +       end #module URI
> +
> +       @load_openuri = nil
> +       # load open-uri.rb in SandBox module.
> +       def self.load_openuri_in_sandbox #:nodoc:
> +         return if @load_openuri
> +         fn = nil
> +         unless $:.find do |x|
> +             fn = File.join(x, 'open-uri.rb')
> +             FileTest.exist?(fn)
> +           end then
> +           warn('Warning: cannot find open-uri.rb in $LOAD_PATH')
> +         else
> +           # reading open-uri.rb
> +           str = File.read(fn)
> +           # eval open-uri.rb contents in SandBox module
> +           module_eval(str)
> +
> +           # finds 'find_proxy' method
> +           find_proxy_lines = nil
> +           flag = nil
> +           endstr = nil
> +           str.each do |line|
> +             if flag then
> +               find_proxy_lines << line
> +               if endstr == line[0, endstr.length] and
> +                   /^\s+end(\s+.*)?$/ =~ line then
> +                 break
> +               end
> +             elsif /^(\s+)def\s+find_proxy(\s+.*)?$/ =~ line then
> +               flag = true
> +               endstr = "#{$1}end"
> +               find_proxy_lines = line
> +             end
> +           end
> +           if find_proxy_lines
> +             module_eval("module FindProxy;\n#{find_proxy_lines}\n;end\n")
> +           else
> +             warn('Warning: cannot find find_proxy method in open-uri.rb.')
> +           end
> +           @load_openuri = true
> +         end
> +       end
> +     end #module SandBox
> +   end #module NetTools
> +
>   end # module Command
>   end # module Bio
>
> _______________________________________________
> bioruby-cvs mailing list
> bioruby-cvs at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby-cvs

From ngoto at gen-info.osaka-u.ac.jp Sun Mar 26 01:13:28 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:13:28 +0900
Subject: [BioRuby] fastacmd.rb: iteration
In-Reply-To:
References: <200603221029.k2MATTaI028477@idns103.gen-info.osaka-u.ac.jp>
Message-ID: <20060326142807.5F16.NGOTO@gen-info.osaka-u.ac.jp>

Hi,

> I'd suggest releasing 1.0.1 (or creating a stable branch?)
> What do you think?

I agree.

> By the way, Bio::FlatFile.auto and Bio::FlatFile.open accept a block but
> Bio::FlatFile.new doesn't. Is there any reason to disallow the feature?

I referred to the specifications of Ruby's File, IO and Dir classes.
File.open, IO.open, and Dir.open can accept a block but
File.new, IO.new, and Dir.new don't.

Because Ruby's experts decided on these specifications,
I suppose that there may be some merit in not accepting blocks,
or some problem with accepting a block,
but I don't know much about them.

[ruby-list:24986] said that in Ruby 1.6.0, IO.new and Dir.new
were changed not to take a block, but I can't find the reason.
( http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/24986 )

--
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp

From ngoto at gen-info.osaka-u.ac.jp Sun Mar 26 01:56:07 2006
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Sun, 26 Mar 2006 15:56:07 +0900
Subject: [BioRuby] open_uri (Fwd: [BioRuby-cvs] bioruby/lib/bio command.rb, 1.3, 1.4)
In-Reply-To: <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
References: <200603201035.k2KAYxVL030067@pub.open-bio.org>
 <29B466AB-B4B0-4FA4-B6C2-A680D1A9B637@hgc.jp>
Message-ID: <20060326155527.5F1B.NGOTO@gen-info.osaka-u.ac.jp>

> > +     # Same as OpenURI.open_uri(*arg).
> > +     # If open-uri.rb is already loaded, ::OpenURI is used.
> > +     # Otherwise, internal OpenURI in sandbox is used because
> > +     # open-uri.rb redefines Kernel.open.
>
> Your code seems to contain a lot of hacks: finding open-uri.rb in Ruby's load path,
> searching for a particular method in it, etc...

Yes, it's very complicated.
It would be easier to copy and paste part of open-uri.rb, but
this may cause a copyright problem.

The easiest way is to write
"Please be careful that BioRuby now requires open-uri and
the behavior of open() is changed."
in the BioRuby documentation and use OpenURI.open_uri.

In very rare cases, require "open-uri" changes behavior.
For example,

% mkdir -p http://www.google.com
% echo "hello world" > http://www.google.com/index.html
% ls -R
.:
http:/

./http::
www.google.com/

./http:/www.google.com:
index.html
% irb
irb(main):001:0> open("http://www.google.com/index.html"){|f| f.read }
=> "hello world\n"
irb(main):002:0> require "open-uri"
=> true
irb(main):003:0> open("http://www.google.com/index.html"){|f| f.read }
=> "Google\n