[Fwd: Bioperl: manipulating long strings (genomes) in PERL]

Andrew Dalke dalke@bioreason.com
Tue, 30 Mar 1999 09:15:32 -0800


Matthew Pocock <mrp@sanger.ac.uk> said
> Or even
> 
> my $seq = join "", grep { chomp } <FH>;
> 
> which also avoids some reallocation issues

as compared to Ian Korf <ikorf@sapiens.wustl.edu>'s:
> my $seq;
> LOCAL_BLOCK: {
>         my @seq = <FH>;
>         chomp @seq;
>         $seq = join("", @seq);
> }

If you're paying that much attention to detail, I believe your
new proposal has more reallocation issues than the old one.  In
your newer statement, you have

  grep {chomp} <FH>

which returns a new list (it must: grep drops any element for
which chomp returns a false value, and it cannot know a priori
that chomp will always return a true value).
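
To make the point about chomp's return value concrete, here is a
small sketch. It assumes a Perl recent enough to open an in-memory
filehandle on a string reference, and the two-line input is made up
for illustration. chomp returns the number of characters it removed,
so a final line with no trailing newline yields 0 and grep drops it.

```perl
use strict;
use warnings;

# Hypothetical input: two sequence lines, the last one WITHOUT a newline.
my $data = "ACGT\nTTGA";

# Newer proposal: grep keeps only the lines for which chomp returned true.
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my $filtered = join "", grep { chomp } <$fh>;
close $fh;

# Older proposal: chomp the whole array, then join every element.
open($fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @lines = <$fh>;
close $fh;
chomp @lines;
my $explicit = join "", @lines;

print "grep/chomp: $filtered\n";   # prints "grep/chomp: ACGT"
print "chomp/join: $explicit\n";   # prints "chomp/join: ACGTTTGA"
```

With input whose every line ends in a newline the two results
agree; the difference only shows up on an unterminated last line.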

Hence, you are allocating two lists, one for <FH> in list
context, and another for the return of grep.

The only thing the new one gets you is that the list returned
from <FH> is implicit, rather than explicitly stored in @seq.
The list is constructed in either case; all it saves you is a
small allocation for the @seq variable itself.

Also, the new proposal calls for additional function calls,
since grep must evaluate the chomp block once for each element
of the <FH> list at the Perl level (unless that case is
specially optimized), while the older proposal has the whole
list chomped in a single call at the C level.
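
Whether the per-element calls matter in practice is easy to measure
rather than argue. The following is only a sketch using the standard
Benchmark module; the sub names (via_grep, via_chomp_list), the data
size, and the iteration count are all arbitrary choices for
illustration, and the numbers will depend on the Perl version and
the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 5000 lines of 60 bases, each newline-terminated.
my $line = ("ACGT" x 15) . "\n";
my $data = $line x 5000;

# Newer proposal: the chomp block runs once per line inside grep.
sub via_grep {
    my ($text_ref) = @_;
    open(my $fh, '<', $text_ref) or die "cannot open in-memory handle: $!";
    my $seq = join "", grep { chomp } <$fh>;
    return $seq;
}

# Older proposal: a single chomp call over the whole array.
sub via_chomp_list {
    my ($text_ref) = @_;
    open(my $fh, '<', $text_ref) or die "cannot open in-memory handle: $!";
    my @seq = <$fh>;
    chomp @seq;
    return join "", @seq;
}

# Run each version 200 times and print a comparison table.
cmpthese(200, {
    grep_chomp => sub { via_grep(\$data) },
    chomp_list => sub { via_chomp_list(\$data) },
});
```

For real genome-sized input, point the subs at an actual file
instead of the in-memory string before drawing any conclusions.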

There's also the downside that the new version is much less
maintainable, depending as it does on knowing the side effects
of several Perl constructs.
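
If maintainability is the main concern, one option is to spell out
each step. The sketch below is neither of the two versions discussed
above: it is a plain line-by-line loop, and read_sequence is a
hypothetical name. It builds no intermediate list, but it appends to
the string repeatedly, so it makes no claim about being faster.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle and return
# the concatenated sequence with line endings removed.
sub read_sequence {
    my ($fh) = @_;
    my $seq = "";
    while (my $line = <$fh>) {
        chomp $line;      # strip the trailing newline, if there is one
        $seq .= $line;    # append; an unterminated last line is kept
    }
    return $seq;
}

# Usage with made-up data held in an in-memory filehandle.
my $data = "ACGT\nTTGA";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my $seq = read_sequence($fh);
close $fh;
print "$seq\n";   # prints "ACGTTTGA"
```

Nothing here relies on the return value of chomp or on grep's
filtering, so a reader does not need to know either side effect.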

						Andrew
						dalke@bioreason.com