[Bioperl-l] get_Stream_by_gi: Memory going up every call.

snaphit at planet.nl
Wed Oct 31 10:07:38 UTC 2007


I installed S/SE/SENDU/bioperl-1.5.2_102.tar.gz using cpan with force and notest.
Running the script now uses much less memory. It still grows a little with every call: for a run of 14000 GIs it climbs by about 20 MB (starting at 16 MB).

Before that, I used this BioPerl version (the only one I was able to install using PPM):
Bioinformatics Toolkit 1.5.2 RC2
Version:	1.5.1.9992
Released:	2006-10-02

I first replaced the modules you mentioned, but that didn't help. So before reporting the bug I decided to try installing the package with cpan. It gave some errors at first, so I used force and notest, and now it seems to work: I get the output I expect and a lot less memory usage.

So if you don't see anything on your Windows machine, I won't report the bug.

Thanks,
Jelle



-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Tue 10/30/2007 4:33 PM
To: snaphit at planet.nl
Cc: Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
 
It appears to be Win32-specific; I retrieved ~1000 Solanum lycopersicum
ESTs in this loop without appreciable memory growth on my MacBook; I'll
try WinXP later today.  If you can, also try downloading the
Bio::DB::GenBank, NCBIHelper, and WebDBSeqI modules from CVS and
replacing your local copies with those to see whether the problem
still persists.
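
If it helps, here is a quick way to confirm which copy of
Bio::DB::GenBank Perl actually loads, so you can check that the
replaced files are the ones being picked up (just a sketch):

use Bio::DB::GenBank;
# %INC maps each loaded module file to the path it was read from
print "Loaded from: $INC{'Bio/DB/GenBank.pm'}\n";
# a per-module $VERSION may not be set, hence the fallback
print "Version: ", (Bio::DB::GenBank->VERSION || 'unknown'), "\n";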

chris

On Oct 30, 2007, at 9:56 AM, <snaphit at planet.nl> wrote:
> I just made a test script that shows the problem.
> The inner while loop (the next_seq loop) causes the program to use
> more and more memory without releasing it.
>
> I will post the bug later today.
>
> Jelle
>
> code:
> use Bio::DB::GenBank;
> use strict;
> use warnings;
> my @arraylist = (157043286,157043285,157043189); # use a couple of hundred GIs to see the issue
> while (my @small_list = splice(@arraylist, 0, 100)) {
>         my $gb = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
>         my $stream_obj =  $gb->get_Stream_by_gi(\@small_list);
>         while (my $seq_obj = $stream_obj->next_seq) {
>             #this is what causes the problem...
>         }
> }
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Tue 10/30/2007 1:10 PM
> To: snaphit at planet.nl
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
>
> What happens if you create a new Bio::DB::GenBank instance each time
> within streamQuery() instead of using a cached instance?  Does the
> memory issue go away?
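>
> For example, something along these lines (untested sketch, reusing the
> streamQuery() shape from your earlier snippet):
>
> sub streamQuery {
>     my $self = shift;
>     # build a fresh GenBank handle on every call instead of reusing
>     # the one cached in $self->{gb} by new()
>     my $gb = Bio::DB::GenBank->new();
>     my $stream_obj = $gb->get_Stream_by_gi($self->{ids});
>     while (my $seq_obj = $stream_obj->next_seq) {
>         # process $seq_obj here
>     }
> }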
>
> I'll try to look into it when I can; if you could file a bug on this
> it would help track it:
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Oct 30, 2007, at 3:51 AM, <snaphit at planet.nl> wrote:
>
> > This doesn't seem to work; it still keeps using more memory. I had
> > already tried it once and it didn't seem to make a difference, but I
> > gave it another try. As I said, it doesn't solve the problem.
> >
> >
> > -----Original Message-----
> > From: Chris Fields [mailto:cjfields at uiuc.edu]
> > Sent: Mon 10/29/2007 5:06 PM
> > To: snaphit at planet.nl
> > Cc: Bioperl-l at lists.open-bio.org
> > Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
> >
> > It may be based on the mode (set by -retrievaltype) in which the
> > sequences are being retrieved and parsed.  The Bio::DB::WebDBSeqI
> > module has the documentation for this parameter.  If you are making
> > tons of calls to get_Seq*/get_Stream* methods it may lead to
> > substantial increases in memory until the child processes finish up
> > parsing each data stream.
> >
> > You can possibly add in a wait() in between sequence retrieval calls,
> > or try setting the Bio::DB::GenBank instance to 'tempfile' or
> > 'io_string' (the former always worked faster for me):
> >
> > $self->{gb} = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
> >
> > chris
> >
> > On Oct 26, 2007, at 7:42 AM, Jelle86 wrote:
> >
> > > OK, I stripped out a lot, and this is what causes the problem:
> > >
> > > use strict;
> > > use warnings;
> > > use Bio::DB::GenBank;
> > >
> > > sub new {
> > >     my $invocant = shift;
> > >     my $class = ref($invocant) || $invocant;
> > >     my $self = {@_};
> > >     # a single GenBank handle is created here and reused for every query
> > >     $self->{gb} = Bio::DB::GenBank->new();
> > >     bless $self, $class;
> > >     return $self;
> > > }
> > >
> > > sub streamQuery {
> > >     my $self = shift;
> > >     my $stream_obj = $self->{gb}->get_Stream_by_gi($self->{ids});
> > >     while (my $seq_obj = $stream_obj->next_seq) {
> > >         # empty on purpose; the memory growth shows up regardless
> > >     }
> > > }
> > >
> > > Both subs (new and streamQuery) are called several times, each time
> > > with a new accession list. Removing the while loop uses a bit less
> > > memory, but the memory usage still keeps going up.
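> > >
> > > For reference, the calling code has roughly this shape (simplified
> > > sketch; the package name and GI lists are placeholders):
> > >
> > > my @batches = ([157043286, 157043285], [157043189]);
> > > for my $ids (@batches) {
> > >     # hypothetical class wrapping the new()/streamQuery() subs above
> > >     my $query = MyQuery->new(ids => $ids);
> > >     $query->streamQuery();   # memory keeps growing after every batch
> > > }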
> > >
> >
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign







