Reverse Complement utility, Bio::Alg, return value problem

Steve A. Chervitz sac@genome.stanford.edu
Thu, 7 Aug 1997 13:46:26 -0700 (PDT)


Georg wrote:

> SteveB wrote,
> > I think that we should determine, for ALL functions, whether they modify
> > existing objects or return new ones.
> 
> Or, whether they return the actual sequence..
> >>>>>>>>
> > $myseq->revcom($beg,$end,'inplace')
> This makes sense, but is pretty verbose.  Also, this would suggest that
> the return would always be the object itself, whereas you might want to be
> returned the actual sequence, or some status value.
> <<<<<<<<
> 
> Also, modification of the existing object can save so much space that we
> should have it, and I hope the way it's currently done in UnivAln.pm
> (i.e. setting the flag using a _method_ inplace() which does _not_ need 
> to be exported) is OK.
> 
> The open problem in my view is whether revcom should return a sequence or
> an object. Both have advantages and disadvantages, and using copy()
> you can simulate one behaviour by the other. (If revcom accesses the sequence
> but you need the object, create an identical copy and do revcom() inplace; 
> if revcom returns an object but you need the sequence, apply get_seq to the
> returned object. The former saves space, but the latter is more OO I guess).

I would favor returning an object. As Georg states, it is more intuitive 
and much less awkward from an OO point of view. This way it is always 
clear whether you are dealing with an object or a string:

$myseq->revcom($beg,$end)->get_seq();

vs.

$myseqcopy = $myseq->copy;
$myseqcopy->inplace(1);
$myseqcopy->revcom($beg,$end);
$myseqcopy->inplace(0);

Ugh!

Regarding the issue of methods that modify an existing object, 
I would argue that such methods should be flagged with a "set" prefix so   
it is absolutely clear what is being done: 

$myseq->set_revcom($beg,$end);  

would change the sequence object into its reverse complement. It could 
also return the altered object.
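
To make the distinction concrete, here is a minimal sketch of the two
methods side by side. MySeq is a hypothetical toy class, not the actual
Bio::Alg or UnivAln.pm code, and the sketch assumes that revcom($beg,$end)
operates on the 1-based, inclusive region $beg..$end and that the result
holds only that region; the real modules may behave differently.

```perl
use strict;
use warnings;

package MySeq;   # hypothetical class, for illustration only

sub new {
    my ($class, $seq) = @_;
    return bless { seq => $seq }, $class;
}

sub get_seq {
    my ($self) = @_;
    return $self->{seq};
}

# Internal helper: reverse complement of positions $beg..$end
# (assumed 1-based and inclusive; defaults to the whole sequence).
sub _revcom_string {
    my ($self, $beg, $end) = @_;
    $beg = 1                     unless defined $beg;
    $end = length($self->{seq})  unless defined $end;
    my $region = scalar reverse substr($self->{seq}, $beg - 1, $end - $beg + 1);
    $region =~ tr/ACGTacgt/TGCAtgca/;
    return $region;
}

# Non-modifying: returns a NEW object and leaves $self untouched.
sub revcom {
    my ($self, $beg, $end) = @_;
    return ref($self)->new($self->_revcom_string($beg, $end));
}

# Modifying: the "set" prefix signals that $self is changed.
# Returns $self so that calls can be chained.
sub set_revcom {
    my ($self, $beg, $end) = @_;
    $self->{seq} = $self->_revcom_string($beg, $end);
    return $self;
}

package main;

my $myseq = MySeq->new('AAGCTTGG');

my $rc = $myseq->revcom(1, 4);        # new object
print $rc->get_seq(), "\n";           # GCTT
print $myseq->get_seq(), "\n";        # AAGCTTGG (original unchanged)

$myseq->set_revcom(1, 4);             # alters $myseq itself
print $myseq->get_seq(), "\n";        # GCTT
```

With this split the caller never touches a mode flag: the method name
alone says whether the object is altered.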

The advantages I see would be:

1) One method call replaces three; set_revcom() would call inplace() for 
   you. 
2) Objects are less likely to be inadvertently altered (or not altered) 
   due to a misplaced or incorrect inplace() call. Requiring calls to 
   inplace(1) and inplace(0) forces the client to do the accounting and 
   thus can lead to a new class of bugs and maintenance headaches.

A disadvantage would be having two methods (set_revcom() and revcom()) 
instead of one, and you would need such a pair for every accessor. 
But this is more in line with OO design. The inplace() calls would still 
be useful when performing complex, multi-step manipulations.  

SteveC