[BioRuby] restriction enzyme module

Toshiaki Katayama ktym at hgc.jp
Tue Apr 3 16:30:20 EDT 2007


Trevor,

I think the core part is ready for the next release,
but I have some requests about the coding style of your modules
before they are included in the release.

Firstly, your code is well documented, but you have a lot of
subclasses under the Bio::RestrictionEnzyme namespace.
It looks very complicated, and I still don't understand which
class to use in which situation.
Could you describe the typical use cases?
I hope to include your description in the tutorial, if possible.

module Bio
  class RestrictionEnzyme
    module CutSymbol
      class CutSymbol__
    module StringFormatting
    class Fragments < Array
    class Analysis
    class DoubleStranded
      class EnzymeAction (empty?)
      class AlignedStrands
      class CutLocationPair < Array
      class CutLocationPairInEnzymeNotation < CutLocationPair
      class CutLocations < Array
      class CutLocationsInEnzymeNotation < CutLocations
    class SingleStrand < Bio::Sequence::NA
      class CutLocationsInEnzymeNotation < Array
    class SingleStrandComplement < SingleStrand (only def orientation; [3, 5]; end)
    class Range
      class CutRange (empty?)
      class CutRanges < Array
      class HorizontalCutRange < CutRange
      class SequenceRange
        class CalculatedCuts
        class Fragment
        class Fragments < Array
      class VerticalCutRange < CutRange


Secondly, I would rather not have code in the library that modifies
the load path (RUBYLIB). Each of the following files contains the lines
shown after the list:

lib/bio/util/restriction_enzyme/analysis.rb
lib/bio/util/restriction_enzyme/analysis_basic.rb
lib/bio/util/restriction_enzyme/double_stranded.rb
lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_locations.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_locations_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/range/cut_range.rb
lib/bio/util/restriction_enzyme/range/cut_ranges.rb
lib/bio/util/restriction_enzyme/range/horizontal_cut_range.rb
lib/bio/util/restriction_enzyme/range/sequence_range.rb
lib/bio/util/restriction_enzyme/range/sequence_range/calculated_cuts.rb
lib/bio/util/restriction_enzyme/range/sequence_range/fragment.rb
lib/bio/util/restriction_enzyme/range/sequence_range/fragments.rb
lib/bio/util/restriction_enzyme/range/vertical_cut_range.rb
lib/bio/util/restriction_enzyme/single_strand.rb
lib/bio/util/restriction_enzyme/single_strand/cut_locations_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/single_strand_complement.rb
lib/bio/util/restriction_enzyme/string_formatting.rb
--------------------------------------------------
require 'pathname'
libpath = Pathname.new(File.join(File.dirname(__FILE__), ['..'] * 4, 'lib')).cleanpath.to_s
$:.unshift(libpath) unless $:.include?(libpath)
--------------------------------------------------

Why are these lines needed?
Isn't it enough to load all modules under the restriction_enzyme
directory from lib/bio/util/restriction_enzyme.rb?
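To illustrate the alternative I have in mind, here is a minimal sketch. It builds a throwaway library in a temporary directory; the Demo module, its symbol method, and the demo/... paths are made up for the example and are not BioRuby code. The sub-file contains no load-path code at all. The top-level file requires it by its library-relative path, and only the caller puts lib on the load path, once.

```ruby
require 'tmpdir'
require 'fileutils'

Dir.mktmpdir do |root|
  lib = File.join(root, 'lib')
  sub = File.join(lib, 'demo', 'util', 'restriction_enzyme')
  FileUtils.mkdir_p(sub)

  # Sub-file (hypothetical): defines its constants and nothing else.
  # Note that it does not touch $: at all.
  File.write(File.join(sub, 'cut_symbol.rb'), <<~RUBY)
    module Demo
      module CutSymbol
        def self.symbol; '^'; end
      end
    end
  RUBY

  # Top-level file (hypothetical): requires everything beneath it,
  # using paths relative to the lib directory.
  File.write(File.join(lib, 'demo', 'util', 'restriction_enzyme.rb'), <<~RUBY)
    require 'demo/util/restriction_enzyme/cut_symbol'
  RUBY

  # Only the caller (or `ruby -I lib`) adds lib to the load path.
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
  require 'demo/util/restriction_enzyme'

  puts Demo::CutSymbol.symbol
end
```

With this layout the sub-files can only be loaded through the top-level file or by a caller that already has lib on its load path, which I assume is the normal situation for an installed library.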


lib/bio/util/restriction_enzyme/integer.rb
--------------------------------------------------
class Integer #:nodoc:
  def negative?
    self < 0
  end
end
--------------------------------------------------

I would rather not modify Ruby's built-in classes in the BioRuby library.
Besides, this file contains only the above definition, and
the method does not seem worth adding.
You use this method only a few times, and

  if (a != nil and a.negative?) or (b != nil and b.negative?)

could simply be written as

  if (a != nil and a < 0) or (b != nil and b < 0)

What do you think?
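If the repeated condition is the concern, a small private helper would also work without reopening Integer. The following is only a sketch; the name any_negative? and its two-argument signature are hypothetical and not taken from your code.

```ruby
# Hypothetical helper: true if either argument is a negative number.
# nil arguments are treated as "not negative", matching the
# (a != nil and a < 0) or (b != nil and b < 0) condition.
def any_negative?(a, b)
  (!a.nil? && a < 0) || (!b.nil? && b < 0)
end

puts any_negative?(nil, -1)   # true
puts any_negative?(3, nil)    # false
puts any_negative?(nil, nil)  # false
```

This keeps the nil checks in one place, and Ruby's built-in classes stay untouched.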


lib/bio/util/restriction_enzyme/analysis_basic.rb
--------------------------------------------------
require 'pp'
--------------------------------------------------

Why do you need the 'pp' library here? (It does not seem to be used.)


lib/bio/util/restriction_enzyme/range/cut_range.rb
--------------------------------------------------
module Bio; end
class Bio::RestrictionEnzyme
class Range
class CutRange
end # CutRange
end # Range
end # Bio::RestrictionEnzyme
--------------------------------------------------

Why does this empty class need to be present?


lib/bio/util/restriction_enzyme/analysis_basic.rb
--------------------------------------------------
  # Creates an array of EnzymeActions based on the DNA sequence and supplied enz
ymes.
  #
  # ---
  # *Arguments*
  # * +sequence+: The string of DNA to match the enzyme recognition sites agains
t
  # * +args+:: The enzymes to use.
  # *Returns*:: +Array+ with the first element being an array of EnzymeAction ob
jects that +sometimes_cut+, and are subject to competition.  The second is an ar
ray of EnzymeAction objects that +always_cut+ and are not subject to competition
.
--------------------------------------------------

Could you wrap your RDoc comments at fewer than 80 columns wherever possible?
Or should I use a wider terminal...?


Also, you seem to use a different strategy for RDoc documentation in each file.
Which is the best practice?

lib/bio/util/restriction_enzyme.rb
--------------------------------------------------
module Bio #:nodoc:
--------------------------------------------------

or

lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair.rb
--------------------------------------------------
module Bio; end
--------------------------------------------------

or

lib/bio/util/restriction_enzyme/cut_symbol.rb
--------------------------------------------------
nil # to separate file-level rdoc from following statement # !> useless use of nil in void context
--------------------------------------------------

Besides, many of your modules have duplicated header lines
(module name, authors, copyright, etc.).
This is not the way documented in the README.DEV file
(included in the BioRuby distribution).
Why do you do so?

lib/bio/util/color_scheme.rb
lib/bio/util/contingency_table.rb
lib/bio/util/restriction_enzyme.rb
lib/bio/util/restriction_enzyme/analysis.rb
lib/bio/util/restriction_enzyme/analysis_basic.rb
lib/bio/util/restriction_enzyme/cut_symbol.rb
lib/bio/util/restriction_enzyme/double_stranded.rb
lib/bio/util/restriction_enzyme/double_stranded/aligned_strands.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_location_pair_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_locations.rb
lib/bio/util/restriction_enzyme/double_stranded/cut_locations_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/range/cut_range.rb
lib/bio/util/restriction_enzyme/range/cut_ranges.rb
lib/bio/util/restriction_enzyme/range/horizontal_cut_range.rb
lib/bio/util/restriction_enzyme/range/sequence_range.rb
lib/bio/util/restriction_enzyme/range/sequence_range/calculated_cuts.rb
lib/bio/util/restriction_enzyme/range/sequence_range/fragment.rb
lib/bio/util/restriction_enzyme/range/sequence_range/fragments.rb
lib/bio/util/restriction_enzyme/range/vertical_cut_range.rb
lib/bio/util/restriction_enzyme/single_strand.rb
lib/bio/util/restriction_enzyme/single_strand/cut_locations_in_enzyme_notation.rb
lib/bio/util/restriction_enzyme/single_strand_complement.rb
lib/bio/util/restriction_enzyme/string_formatting.rb
--------------------------------------------------
#
# bio/util/restrction_enzyme/cut_symbol.rb - Defines the symbol used to mark a cut in an enzyme sequence
#
# Author::    Trevor Wennblom  <mailto:trevor at corevx.com>
# Copyright:: Copyright (c) 2005-2007 Midwinter Laboratories, LLC (http://midwinterlabs.com)
# License::   Distributes under the same terms as Ruby
#
#  $Id: cut_symbol.rb,v 1.4 2007/01/01 05:07:04 trevor Exp $
#

(snip)

#
# bio/util/restrction_enzyme/cut_symbol.rb - Defines the symbol used to mark a cut in an enzyme sequence
#
# Author::    Trevor Wennblom  <mailto:trevor at corevx.com>
# Copyright:: Copyright (c) 2005-2007 Midwinter Laboratories, LLC (http://midwinterlabs.com)
# License::   Distributes under the same terms as Ruby
#
--------------------------------------------------


lib/bio/util/restriction_enzyme/analysis_basic.rb
--------------------------------------------------
class Bio::Sequence::NA
  # NOTE: move this into Bio::Sequence::NA
  def cut_with_enzyme(*args)
    Bio::RestrictionEnzyme::Analysis.cut(self, *args)
  end
  alias cut_with_enzymes cut_with_enzyme
end
--------------------------------------------------

When do you plan to move this into lib/bio/sequence.rb?


Sorry for so many questions, and thanks in advance.

Toshiaki



