[Bioperl-l] new_fast methods

Albert Vilella avilella at gmail.com
Thu Feb 26 20:45:07 UTC 2009


Yes, we've got some new_fasts sprinkled around in Ensembl. Never got
to switch to arrays instead of hashes though.
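
For illustration, here is a minimal sketch of the two ideas in this thread:
a "new_fast" constructor that skips argument normalization, and objects
stored as arrays instead of hashes. Everything in it is an assumption made
for the example: My::Feature, the F_START/F_END/F_STRAND slot constants and
the normalization loop are hypothetical and are not code from BioPerl or
Ensembl. The loop only approximates the kind of work a "_rearrange"-style
helper does.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only; not part of BioPerl or Ensembl.
package My::Feature;

# Slot indices for the array-based object (assumed layout).
use constant { F_START => 0, F_END => 1, F_STRAND => 2 };

# Flexible constructor: accepts named arguments in any order and any case,
# and normalizes them before storing. This per-call normalization is the
# overhead that a fast constructor avoids.
sub new {
    my ($class, %args) = @_;
    my %norm;
    for my $key (keys %args) {
        (my $name = lc $key) =~ s/^-//;    # "-START" becomes "start"
        $norm{$name} = $args{$key};
    }
    return bless [ @norm{qw(start end strand)} ], $class;
}

# Fast constructor: trusts the caller to pass values positionally
# (start, end, strand) and does no normalization or checking.
sub new_fast {
    my $class = shift;
    return bless [@_], $class;
}

# Accessors read fixed array slots instead of doing hash lookups.
sub start  { $_[0][F_START] }
sub end    { $_[0][F_END] }
sub strand { $_[0][F_STRAND] }

package main;

my $slow = My::Feature->new(-start => 10, -end => 20, -strand => 1);
my $fast = My::Feature->new_fast(10, 20, 1);

printf "%d-%d (%d)\n", $fast->start, $fast->end, $fast->strand;
```

Existing callers of "new" keep working unchanged, and only hot loops that
already know the argument order would need to switch to "new_fast".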

Should your slims be merged into the main branch at some point?

On Thu, Feb 26, 2009 at 1:52 PM, Jason Stajich <jason at bioperl.org> wrote:
> FYI - I wrote some lightweight feature objects - there is a branch for it
> (lightweight_feature_branch) - these had a pretty significant speedup.
>
>  A lot of the overhead is in sequence/feature/location creation, since so
> many objects are being created. Optimizing these features by using
> arrays instead of hashes for the data structure seemed to provide a pretty
> significant speedup as well.  Ensembl uses a new_fast as well, right?
> Bio::SeqFeature::Slim
>
> -jason
> On Feb 26, 2009, at 4:28 AM, Albert Vilella wrote:
>
>> Hi,
>>
>> I would like to ask the list for comments on the usefulness of
>> having "new_fast" methods in Bioperl.
>> If one does some profiling on Bioperl scripts that parse large
>> quantities of data, the "_rearrange" method stands out as a possible
>> easy point of optimization. There are parts of the code that call the
>> new method with explicit options. See the example from
>> Bio/Seq/SeqWithQuality.pm below.
>>
>> We should be able to create a "new_fast" method for these cases that
>> takes the ordering as given and doesn't call "_rearrange". This
>> wouldn't disrupt existing code that still calls "new".
>>
>> Comments?
>>
>> Bio/Seq/SeqWithQuality.pm
>>
>>  if (!$seq) {
>>     my $id;
>>     unless ($self->{supress_warnings} == 1) {
>>        $self->warn("You did not provide sequence information during the ".
>>          "construction of a Bio::Seq::SeqWithQuality object. Sequence ".
>>          "components for this object will be empty.");
>>     }
>>     if (!$alphabet) {
>>        $self->throw("If you want me to create a PrimarySeq object for your ".
>>          "empty sequence <boggle> you must specify a -alphabet to satisfy ".
>>          "the constructor requirements for a Bio::PrimarySeq object with no ".
>>          "sequence. Read the POD for it, luke.");
>>     }
>>     $self->{seq_ref} = Bio::PrimarySeq->new( -seq              =>  "",
>>                                              -accession_number =>  $acc,
>>                                              -primary_id       =>  $pid,
>>                                              -desc             =>  $desc,
>>                                              -display_id       =>  $id,
>>                                              -alphabet         =>  $alphabet );
>>  } elsif ($seq->isa('Bio::PrimarySeqI') || $seq->isa('Bio::SeqI')) {
>>     $self->{seq_ref} = $seq;
>>  } elsif (ref($seq)) {
>>     $self->throw("You passed a seq argument into a SeqWithQUality object and".
>>       " it was a reference ($seq) which did not inherit from Bio::SeqI or ".
>>       "Bio::PrimarySeqI. I don't know what to do with this!");
>>  } else {
>>     my $seqobj = Bio::PrimarySeq->new( -seq              => $seq,
>>                                        -accession_number => $acc,
>>                                        -primary_id       => $pid,
>>                                        -desc             => $desc,
>>                                        -display_id       => $id   );
>>     $self->{seq_ref} = $seqobj;
>>  }
>>  # Then import the quality scores
>>  if (!defined($qual)) {
>>     $self->{qual_ref} = Bio::Seq::PrimaryQual->new( -qual             => "",
>>                                                     -accession_number => $acc,
>>                                                     -primary_id       => $pid,
>>                                                     -desc             => $desc,
>>                                                     -display_id       => $id, );
>>  } elsif (ref($qual) eq "Bio::Seq::PrimaryQual") {
>>     $self->{qual_ref} = $qual;
>>  } else {
>>     my $qualobj = Bio::Seq::PrimaryQual->new( -qual             => $qual,
>>                                               -accession_number => $acc,
>>                                               -primary_id       => $pid,
>>                                               -desc             => $desc,
>>                                               -display_id       => $id,
>>                                               -trace_indices    => $trace_indices );
>>     $self->{qual_ref} = $qualobj;
>>  }
>
> Jason Stajich
> jason at bioperl.org