[Bioperl-l] Re: Automatic generation of set and get methods

Thu, 14 Nov 2002 23:52:58 -0800 (PST)

--- Hilmar Lapp <hlapp@gnf.org> wrote:
> 
> 
> > -----Original Message-----
> > From: Lincoln Stein [mailto:lstein@cshl.org]
> > Sent: Thursday, November 14, 2002 12:12 PM
> > To: natg@shore.net
> > Cc: bioperl-l@bioperl.org; lstein@lsjs.org; jgrg@sanger.ac.uk
> > Subject: [Bioperl-l] Re: Automatic generation of set and get methods
> > 
> [...]
> > One problem with the code you quote from Bio::Index::Abstract 
> > is that it 
> > provides no way to unset an instance variable.  For example:
> > 
> > 	$object->foo(undef);
> > 
> > will *NOT* work, because the argument is undefined.  I 
> > consider this a 
> > misfeature.  We had a thread about this on bioperl-l mailing 
> > list a month or 
> > so ago and I'm surprised it isn't fixed by now.
> 
> Every developer is more than welcome to fix it, and fellow core people even
> more so.
> 
> My personal take on this is that I will not go into everyone else's code and
> change his/her getter/setters. For my own code I do employ both idioms. It's
> not that seldom that I don't want an attribute's value to be changeable to
> undef.

Perhaps we could enable generation of three types of accessors:

   * sub <field>       = combined set/get
   * sub get_<field>   = read-only accessor
   * sub set_<field>   = write-only accessor

A module author would need a way to declare which fields are read-only, or
declare that all fields in a module are read-only, in which case only the
get_<field> accessors would be available.

But your objection gets at the more general problem of not being able to
validate what value gets set if the setter is autogenerated. I present a
solution to this below.

> >  A better 
> > idiom is this:
> > 
> > 
> >          *$func = sub {
> > 	     my $self = shift;
> >              my $current = $self->{$field};
> >              $self->{$field} = shift if @_;
> >              return $current;
> >      }
> > 
> > This also has the advantage of preserving the semantics of 
> > simultaneous get-and-set, and is consistent with LWP and other widely used 
> > modules.  The get part returns the existing value of the field, and the set
> 
> > part sets the  new value.
> 
> The conclusion of said thread was that in Bioperl you should not rely on the
> return value of a set call. I'm not in favor of making returning the previous
> value a supported convention - I think it creates an unnecessary burden on
> code review.

Tim Bunce's suggestion was a little different:

    *$func = sub {
        my $self = shift;
        return $self->{$field} = shift if @_;
        return $self->{$field};
    }

So here you don't get the previous value, you get the new value. Returning the
previous value is more convenient for situations when you want to change a
value, do something, and then set it back to what it was. E.g.,

    $oldval = $object->attribute( $newval );
    $object->do_something();
    $object->attribute( $oldval );

But it's not much harder to do this:

    $oldval = $object->attribute();
    $object->attribute( $newval );
    $object->do_something();
    $object->attribute( $oldval );

The latter has the advantage of working regardless of what the setter returns. 

I agree with Hilmar that in Bioperl as it currently stands, you shouldn't rely
on the return value from a setter (unless you know what you're doing). If,
however, all modules used autogenerated accessors, you could then rely on the
set return value, whatever it is. This would make Bioperl modules more
predictable, which would be a good thing.

> > >
> > > This seems so elegant, I wonder why it's not used 
> > everywhere in BioPerl. 
> > > Is there some hidden problem?
> 
> My personal take on this is whether this is elegant or not is a matter of
> personal taste; to me for instance it's not. There is a risk of things
> becoming obfuscated to other programmers, but that's maybe my personal
> opinion only. I'm a strong proponent of explicit software APIs and code. I
> have to say that never in my programming life have I really experienced
> writing getter/setters to be the time-limiting factor. Overall design,
> writing the business logic, fixing the bugs, and writing the documentation
> take 98% of the whole time, not writing getter/setters. Writing those is
> quite boring, but it's not time consuming. YMMV though.
> 

I also like the explicitness of having all methods including accessors
available in the source. But I've also felt at times burdened by cutting and
pasting nearly identical code for every field. Here's my accounting of the
pluses and minuses to help decide the best way to go. Feel free to add &
comment:

Advantages of accessor autogeneration:

   * Frees developer from drudgery of writing routine accessor code
   * Reduces chance of bugs within accessor code (e.g., from cut-n-paste)
   * Guarantees predictability of accessors across all modules
   * Simple tests could be autogenerated
   * Documentation template could be autogenerated
   * Reduces amount of routine code within a module so you can focus on the 
     business logic.

Disadvantages of autogeneration of accessors:

   * Code is less explicit; can't see all methods when reading module source
   * Can't validate data when setting (but see below)
   * Runtime bugs within autogenerated code harder to track down
   * Performace hit (maybe, not sure about this)

So I'm leaning towards autogeneration. I think it would improve the reliability
and uniformity of Bioperl modules and allow developers to focus on the
"interesting" parts of the code. If we enable the flexibility of specifying
read/write vs. read-only accessors as I mentioned above, that would be great.

The lack of validation within autogenerated accessors is an important issue. It
gets at Hilmar's complaint of not always wanting to allow a field to take on a
value of undef. It could be formalized within the autogenerated setter using
something like this:

    *$func = sub {
        my $self = shift;
        my $current = $self->{$field};
        if(@_) {
            $self->{$field} = shift;
            if( $self->can("validate_$field") ) {
                $self->validate_$field; # throws exception if trouble
            }
        }
        return $current;
    }

Then you hand-code the sub validate_<field> methods to do all the checking you
desire. We could autogenerate the throw call on validation failure if we
specify that the validate method should return nothing on success and a string
on failure explaining the reason for failure:

            if( $self->can("validate_$field") ) {
                if( my $msg = $self->validate_$field ) {

                }
            }

=====
Steve Chervitz
sac@bioperl.org

__________________________________________________
Do you Yahoo!?
Yahoo! Web Hosting - Let the expert host your site
http://webhosting.yahoo.com