[Bioperl-l] Re: issues with _rearrange

Aaron J. Mackey <amackey@virginia.edu>
Fri, 20 Sep 2002 10:53:32 -0400 (EDT)


There are two orthogonal issues at stake here:

1) Ability of the _rearrange method to handle multiple styles of
arguments; as Matthew and others have suggested, we could break this into
a few different methods, one specific to each style; this would accelerate
the _rearrange calls by essentially stripping out the code that each style
never uses.

2) The fact that the method (any method) gets called many, many times
incurs the penalty of Perl's relatively slow function calls (because of
all the setup and teardown Perl has to do for each call).
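
To put a rough number on issue 2, something like the following could be
run. It is only a sketch: the Demo class and its _slice method are
hypothetical stand-ins (not BioPerl code), and the only thing compared is
doing a hash slice behind a method call versus doing it inline. Results
will vary with the machine and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Demo;   # hypothetical class, for illustration only

sub new { return bless {}, shift }

# Stand-in for a named-argument helper reached through a method call.
sub _slice {
    my ($self, $order, %args) = @_;
    return @args{@$order};
}

package main;

my $obj  = Demo->new;
my @args = (a1 => 1, a2 => 2, a3 => 3);

# Same work, two ways: through a method call, or written inline.
sub via_method {
    my ($x, $y, $z) = $obj->_slice([qw(a1 a2 a3)], @args);
    return $x + $y + $z;
}

sub inline {
    my %h = @args;
    my ($x, $y, $z) = @h{qw(a1 a2 a3)};
    return $x + $y + $z;
}

# Run each for about one CPU second and print a comparison table.
cmpthese(-1, { method => \&via_method, inline => \&inline });
```

Both variants are wrapped in a sub so that the benchmark harness adds the
same overhead to each; the difference that remains is the extra method
dispatch and argument copying.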

Solutions proposed so far:

A) export _rearrange to packages so that the @ISA hierarchy isn't
traversed; my feeling (yet to be proved right or wrong) is that this won't
help, as it doesn't address either #1 or #2 above.

B) make separate _rearrange subs for each mode of usage; addresses #1 but
not problem #2.

C) do away with _rearrange altogether, insisting on only one form of
arguments and writing that code "inline"; addresses #2 but not #1, adding
a barrier to module authors.

D) rewrite _rearrange in C; this doesn't actually help us at all, since
the only thing that _rearrange does is elemental manipulation of scalars;
Perl already does this internally with compiled C functions (see mjd's
perl.com article on when not to rewrite Perl functions in C); again, I'm
happy to be proved wrong here.

E) (latest greatest idea!) - achieve the effect of both B) and C) by
writing an optimizer (only used if optimizer.pm is available) that
inspects the op-tree for calls to _rearrange, inspects the arguments, and
replaces the entire function call with an "inlined" set of ops that does
specifically what's called for (either return the flat array @param, or
the hash slice @param{@$order} after keys have been modified); this could
only happen of course with literal keys.
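
To make B, C and E concrete, here is a minimal sketch. The
rearrange_sketch sub is a simplified stand-in written for this example
(an assumption, not the real _rearrange), and the "optimized" lines are
written by hand to show what an inlined replacement would compute; no
optimizer.pm calls are involved.

```perl
use strict;
use warnings;

# Simplified stand-in for a _rearrange-style helper (an assumption for
# illustration; not BioPerl's actual code).
sub rearrange_sketch {
    my ($order, @param) = @_;

    # Positional style: first argument has no leading dash, so pass through.
    return @param unless @param && defined $param[0] && $param[0] =~ /^-/;

    # Named style: normalise each key (drop leading dash, upper-case it).
    my %param;
    while (@param) {
        my $key = shift @param;
        $key =~ s/^-//;
        $param{ uc $key } = shift @param;
    }
    return @param{ map { uc $_ } @$order };
}

# Generic call: all the style detection happens at run time.
my ($a1, $a2, $a3) = rearrange_sketch([qw(a1 a2 a3)], -a1 => 'x', -A3 => 'z');

# What an optimizer could substitute when the keys are literals:
# the keys are normalised once, ahead of time, leaving only a hash slice.
my %args = (A1 => 'x', A3 => 'z');
my ($b1, $b2, $b3) = @args{qw(A1 A2 A3)};
```

Both forms leave the first and third variables set and the second
undefined; the second form simply skips the sub call and the per-call key
munging, which is only possible because the keys are known at compile
time.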

I like E.  I like E a lot, but it's non-trivial; it slows compilation time
a little bit, but is a big win during our many runtime loops ... this
approach could be extended to other cases where we want to do something
"directly" when we can, but not break encapsulation.

The drawback of E is that it would only be available if you go through the
steps necessary to install optimizer.pm (which seem to include running at
least perl 5.7.2, or the stable 5.8.0 release).  But in an environment
where speed would really matter, this could be a tremendous win.

Thoughts?

-Aaron

On Fri, 20 Sep 2002, Matthew Pocock wrote:

> Could you auto-generate <foo>_pretty subs that just call
> shift-><foo>(_rearrange(@_)), and have <foo> bypass _rearrange totally?
> People who want performance would call <foo> with an args list, and
> people who want nice names would call <foo>_pretty with a hash. Perl
> makes this kind of thing quite easy.
>
> Matthew
>
> Steve Chervitz wrote:
> > --- Aaron J Mackey <ajm6q@virginia.edu> wrote:
> >
> >>On Thu, 19 Sep 2002, Aaron J Mackey wrote:
> >>
> >>
> >>>my ($self, @args) = @_;
> >>>my ($a1, $a2, $a3) = $self->_rearrange([qw(a1 a2 a3)], @args);
> >>>
> >>>becomes:
> >>>
> >>>my ($self, %args) = @_;
> >>>my ($a1, $a2, $a3) = @args{qw(a1 a2 a3)};
> >>
> >>Before anyone jumps down my throat, I'm aware of all the -a1, -A1, A1, a1
> >>options that _rearrange handles; things are never completely as simple as
> >>we first believe them to be ;)
> >>
> >
> >
> > Another thing to try is exporting _rearrange from RootI so that it can be used
> > as a plain function instead of an instance method, i.e., call it as _rearrange(
> > ... ) rather than $self->_rearrange( ... ). This should improve performance.
> >
> > I've always been concerned about all of the overhead from _rearrange calls, so
> > Aaron's suggestion seems reasonable to me. _rearrange is convenient, but maybe
> > it's a little *too* convenient. Allowing this much flexibility in specifying
> > method arguments is probably not necessary as many programmers and users are
> > accustomed to precisely defined arg lists.
> >
> > Typically, our method PODs state what the allowable arguments are without
> > stating that the hash keys are case-insensitive (and in some cases optional).
> > So I'd bet that most users don't know this flexibility exists and we're taking
> > a performance hit unnecessarily.
> >
> > Before deprecating _rearrange, I'd be interested to know how much the class
> > function strategy improves performance, and how much not calling it at all
> > further improves it.
> >
> >
> > =====
> > Steve Chervitz
> > sac@bioperl.org
> >
> >
>
>
>

-- 
 Aaron J Mackey
 Pearson Laboratory
 University of Virginia
 (434) 924-2821
 amackey@virginia.edu