[Bioperl-l] Memory Leak in Bio::SearchIO

Chris Fields cjfields at uiuc.edu
Tue May 16 21:15:10 UTC 2006


I mentioned two possibilities last time I posted: 1) that the BLAST file was
too large, or 2) that you are using an old version of bioperl in which
SearchIO is broken.  You seem to fit #2.

The issue is that NCBI does not consider text BLAST output sacrosanct and
routinely makes changes to it that break parsing.  Because of this,
SearchIO::blast needs to be updated constantly; that module alone normally
gets a few updates a year just to fix parsing issues.  And, BTW, bioperl-1.4
is about 2 years old now, and even the bioperl-1.5.1 SearchIO is broken when
it comes to the latest NCBI BLAST (2.2.14 now).  I strongly suggest updating
your local bioperl distribution to the latest bioperl-live (from CVS).

Take one of those 10000 reports, just one, and try parsing it on its own.  If
you see the same problem (a CPU spike and increasing memory usage) with a
single report, then the parser is the likely culprit and it may already be
fixed in CVS.
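
Something along these lines should do for the single-report test.  This is
only a rough sketch: it assumes the report has been saved to a plain text
file, and the script name and file name used below are placeholders, not
anything from your pipeline.  It walks every result, hit and HSP so the
whole report is actually parsed, then prints some counts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Path to ONE BLAST text report, given on the command line
# (the file name is whatever you saved the report as).
my $file = shift @ARGV or die "Usage: $0 single_report.blast\n";

my $searchio = Bio::SearchIO->new(-format => 'blast', -file => $file);

my ($n_results, $n_hits, $n_hsps) = (0, 0, 0);
my $start = time;

# Walk the full result -> hit -> HSP hierarchy so that the parser
# has to process the entire report, not just the header.
while (my $result = $searchio->next_result) {
    $n_results++;
    my $name = $result->query_name;
    $name = '(unknown)' unless defined $name;
    print "Query: $name\n";
    while (my $hit = $result->next_hit) {
        $n_hits++;
        while (my $hsp = $hit->next_hsp) {
            $n_hsps++;
        }
    }
}

printf "%d result(s), %d hit(s), %d HSP(s) in %d s\n",
    $n_results, $n_hits, $n_hsps, time - $start;
```

Run it as "perl parse_one.pl report.blast" with top open in another
terminal.  A report with only the top 5 hits should finish almost
immediately; if it hangs or memory keeps climbing on that one file, the
problem is most likely in the parsing and not in your own code.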

Chris

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Clarke, Wayne
> Sent: Tuesday, May 16, 2006 3:57 PM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Memory Leak in Bio::SearchIO
> 
> 
> With regard to the suggestions/comments made, thank you. However, I think
> I should clear a few things up. I am running bioperl v1.4, I am cycling
> through the blast reports which should not be of absurd size since they
> only contain the top 5 hits, and I am using top to track (although I
> realize fairly inaccurately) the memory usage. I have looked through the
> code for both AAFCBLAST and BEAST_UPDATE but do not believe the
> leak/problem to be contained within them since they are almost
> exclusively using method calls and those variables should be destroyed
> upon leaving the scope of the method. I have used Devel::Size to check
> the size of the variables $bdbi and $searchio and $connector and on each
> iteration these variables have the same size. Any other suggestions
> would be greatly appreciated as I have nearly gone insane trying to
> track this problem down.
> 
> Thanks, Wayne
> 
> 
> -----Original Message-----
> From: Torsten Seemann [mailto:torsten.seemann at infotech.monash.edu.au]
> Sent: Monday, May 15, 2006 6:19 PM
> To: Clarke, Wayne
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Memory Leak in Bio::SearchIO
> 
> > taking up a huge amount of RAM. For a single job of 10000 queries it
> > can consume as much as a couple hundred MB inside an hour. I realize
> 
> >  my $result = $connector->getQueryResult($query_id);
> >                 my $searchio = new Bio::SearchIO(-format => "blast",
> >                 while (my $o_blast = $searchio->next_result()) {
> >                         my $clone_id = $o_blast->query_name();
> >                         my $statement = $bdbi->form_push_SQL($o_blast, $clone_id, 5); }
> 
> Some comments:
> 
> Have you considered that whatever class/module $bdbi belongs to is
> causing the problem? ie. is it keeping a reference to $o_blast around?
> 
> Are you aware that Perl garbage collection does not necessarily return
> freed memory back to the OS? This may affect how you were measuring
> "memory usage".
> 
> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l



