[Bioperl-l] SeqIO profiling

Aaron J Mackey Aaron J. Mackey" <amackey@virginia.edu
Thu, 12 Sep 2002 15:40:38 -0400 (EDT)


here's some more fuel for the fire; this is a profiling run from a
super-simple script that invokes a SeqIO::fasta on a 24K sequence protein
database, and does nothing else but call next_seq ... 24K calls to
next_seq, but 120K calls to _rearrange and BEGIN!!

Total Elapsed Time = 30.84184 Seconds
         User Time = 30.51116 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 23.0   7.040  6.980 119910   0.0001 0.0001  Bio::Root::RootI::_rearrange
 11.6   3.539  5.530 119914   0.0000 0.0000  Bio::Root::RootI::BEGIN
 9.96   3.039 30.811  23982   0.0001 0.0013  Bio::SeqIO::fasta::next_seq
 8.81   2.689 12.976  47969   0.0001 0.0003  Bio::PrimarySeq::BEGIN
 5.67   1.730  1.658 143892   0.0000 0.0000  Bio::Root::Root::verbose
 5.21   1.589 20.041  23981   0.0001 0.0008  Bio::Seq::new
 3.77   1.150  3.434  47962   0.0000 0.0001  Bio::PrimarySeq::seq
 3.02   0.920  5.134  47962   0.0000 0.0001  Bio::SeqIO::alphabet
 2.92   0.890  2.454  23981   0.0000 0.0001  Bio::Annotation::TypeManager::new
 2.88   0.880  4.514  23981   0.0000 0.0002  Bio::Annotation::Collection::new
 2.75   0.840  0.792  95924   0.0000 0.0000  Bio::PrimarySeq::alphabet
 2.72   0.830  0.850 119917   0.0000 0.0000  Bio::Root::Root::BEGIN
 2.65   0.810 21.299  23981   0.0000 0.0009  Bio::Seq::SeqFactory::create
 2.52   0.770  0.734  71943   0.0000 0.0000  Bio::Annotation::TypeManager::_add
                                             _type_map
 2.43   0.740  1.336  23981   0.0000 0.0001  Bio::PrimarySeq::_guess_alphabet

here's what the call trace looks like, for each $io->next_seq call:

Bio::SeqIO::fasta::next_seq
   Bio::Root::IO::_readline
      File::Temp::BEGIN
   Bio::SeqIO::alphabet
   Bio::SeqIO::sequence_factory
   Bio::Seq::SeqFactory::create
      Bio::Seq::SeqFactory::type
      Bio::Root::Root::verbose
      Bio::Seq::new
         Bio::Root::RootI::new
            Bio::Root::Root::BEGIN
            Bio::Root::Root::verbose
         Bio::PrimarySeq::BEGIN
            Bio::Root::RootI::new
               Bio::Root::Root::BEGIN
               Bio::Root::Root::verbose
            Bio::Root::RootI::_rearrange
            Bio::PrimarySeq::seq
               Bio::Root::RootI::_rearrange
               Bio::PrimarySeq::validate_seq
               Bio::PrimarySeq::alphabet
               Bio::PrimarySeq::_guess_alphabet
                  Bio::PrimarySeq::seq
                     Bio::Root::RootI::_rearrange
                  Bio::PrimarySeq::alphabet
            Bio::PrimarySeq::display_id
            Bio::PrimarySeq::primary_id
            Bio::PrimarySeq::desc
         Bio::Root::RootI::_rearrange
         Bio::Seq::primary_id
         Bio::Annotation::Collection::new
            Bio::Root::RootI::new
               Bio::Root::Root::BEGIN
               Bio::Root::Root::verbose
            Bio::Annotation::TypeManager::new
               Bio::Root::RootI::new
                  Bio::Root::Root::BEGIN
                  Bio::Root::Root::verbose
               Bio::Annotation::TypeManager::_add_type_map
               Bio::Annotation::TypeManager::_add_type_map
               Bio::Annotation::TypeManager::_add_type_map
            Bio::Annotation::Collection::_typemap
         Bio::Seq::annotation
         Bio::Seq::primary_seq
            UNIVERSAL::isa
            Bio::Seq::top_SeqFeatures
   Bio::Seq::alphabet
      Bio::Seq::primary_seq
      Bio::PrimarySeq::alphabet
   Bio::SeqIO::alphabet
      Bio::PrimarySeq::BEGIN
         Bio::Root::RootI::new
            Bio::Root::Root::BEGIN
            Bio::Root::Root::verbose
         Bio::Root::RootI::_rearrange
         Bio::PrimarySeq::alphabet
      Bio::Root::RootI::DESTROY
         Bio::Root::Root::_cleanup_methods
Bio::Seq::DESTROY
Bio::Root::RootI::DESTROY
   Bio::Root::Root::_cleanup_methods
Bio::Root::RootI::DESTROY
   Bio::Root::Root::_cleanup_methods
Bio::Root::RootI::DESTROY
   Bio::Root::Root::_cleanup_methods


There's a fair amount of handling in Bio::Root::Root BEGIN that
involves setting up Error usage; a $DONT_USE_ERROR in my test script
improves things a little (down to 25 seconds instead of 30), but doesn't
change much of the profile percentages.

-Aaron