[Bioperl-l] Bio::SeqIO -- add an ugly but fast grep hack?

Jay Hannah jay at jays.net
Fri Sep 29 12:49:53 UTC 2006


On Sep 14, 2006, at 10:58 AM, Amir Karger wrote:
> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> {
>>     local $/ = "//\n";
>>     while (my $gb = <>) {
>>         print $gb if $gb =~ m/Staphylococcus\sepidermidis/im;
>>     }
>> }
>
> Perl Golf! (Untested, as all good Perl Golf should be.)
>
> perl -wne 'BEGIN {$/="//\n"} print if /Staphylococcus\sepidermidis/im' blah.gb > filtered.gb

Wow. You guys are amazing.

My version was a lot longer (Reverse Perl Golf!!):

    my @files = @{$self->{files}};
    foreach my $file (@files) {
       open(my $in, '<', $file) or die "Can't open $file: $!";
       my $locus;
       while (<$in>) {
          if (/^LOCUS/) {
             # A locus has begun.
             $locus = $_;
          } elsif (/^\/\//) {
             # A locus ends; print it if it matches.
             # (OUT is assumed to be opened elsewhere in the class.)
             $locus .= $_;
             if ($locus =~ /$args{grep}/s) {
                print OUT $locus;
             }
          } else {
             # A row inside a locus.
             $locus .= $_;
          }
       }
       close($in);
    }

I'm playing with an abstraction layer I'm calling "OpenLab". Here the  
grep() method does the work:

my $ol  = OpenLab->new();
# Load up just the "ATCC 12228" sequences from a directory...
my $ss1 = $ol->new_SequenceSet(name => "Organism1");
$ss1->load(files => "$data_dir/*");
$ss1->grep(
    grep    => "ATCC 12228",
    storage => $data_tmp
);

I'll be replacing the implementation inside my class with your wizardry.
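
For what it's worth, here is a rough sketch of how that replacement might
look, using the $/ trick quoted above to read one whole record at a time.
The sub name grep_records and its arguments are made up for illustration;
they are not part of OpenLab or BioPerl, and the sketch returns the
matching records instead of printing them to a file:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of OpenLab or BioPerl): return every
# GenBank record in the given files that matches $pattern.
sub grep_records {
    my ($pattern, @files) = @_;
    my @hits;
    for my $file (@files) {
        open(my $in, '<', $file) or die "Can't open $file: $!";
        # GenBank records end with "//\n", so use that as the input
        # record separator and read one whole record per iteration.
        local $/ = "//\n";
        while (my $record = <$in>) {
            push @hits, $record if $record =~ /$pattern/s;
        }
        close($in);
    }
    return @hits;
}
```

Something like grep_records("ATCC 12228", glob("$data_dir/*")) would then
hand back the matching records as a list, ready to print wherever they
need to go.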

Thanks!

j
12 years of Perl later, still learning new tricks.  :)




