[Bioperl-l] get_Stream_by_gi: Memory going up every call.
snaphit at planet.nl
Wed Oct 31 10:07:38 UTC 2007
I installed S/SE/SENDU/bioperl-1.5.2_102.tar.gz using cpan with force and notest.
Running the script now uses much less memory. It still grows a little with every call: roughly +20 MB over 14000 sequences, starting at about 16 MB.
Before this, I used the following BioPerl version (the only one I was able to install using PPM):
Bioinformatics Toolkit 1.5.2 RC2
Version: 1.5.1.9992
Released: 2006-10-02
I first replaced the modules you mentioned, but that didn't help. So before reporting the bug, I decided to try installing the package with cpan. It gave some errors at first, which is why I used force and notest. Now it seems to work: I get the output I expect and much lower memory usage.
So if you don't see anything on your Windows machine, I won't report the bug.
Thanks,
Jelle
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Tue 10/30/2007 4:33 PM
To: snaphit at planet.nl
Cc: Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
It appears to be Win32-specific; I retrieved ~1000 Solanum
lycopersicum ESTs in this loop without appreciable memory growth
on my MacBook. I'll try WinXP later today. If you can, also try
downloading the Bio::DB::GenBank/NCBIHelper/WebDBSeqI modules from
CVS and replacing your local copies with those to see whether the
problem still persists.
chris
On Oct 30, 2007, at 9:56 AM, <snaphit at planet.nl> wrote:
> I just made a test script which shows the problem.
> The second while loop will cause the program to use more and more
> memory without releasing it.
>
> I will post the bug later today.
>
> Jelle
>
> code:
> use strict;
> use warnings;
> use Bio::DB::GenBank;
>
> my @arraylist = (157043286, 157043285, 157043189); # use a couple of hundred GIs to see the issue
> while (my @small_list = splice(@arraylist, 0, 100)) {
>     my $gb = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
>     my $stream_obj = $gb->get_Stream_by_gi(\@small_list);
>     while (my $seq_obj = $stream_obj->next_seq) {
>         # this is what causes the problem...
>     }
> }
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Tue 10/30/2007 1:10 PM
> To: snaphit at planet.nl
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
>
> What happens if you create a new Bio::DB::GenBank instance each time
> within streamQuery() instead of using a cached instance? Does the
> memory issue go away?
>
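> A minimal, untested sketch of that change, reusing the streamQuery()/{ids}
> names from the code below (-retrievaltype is optional and is covered further
> down in the quoted messages):
>
> sub streamQuery {
>     my $self = shift;
>     # build a fresh Bio::DB::GenBank object per call instead of reusing $self->{gb}
>     my $gb = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
>     my $stream_obj = $gb->get_Stream_by_gi($self->{ids});
>     while (my $seq_obj = $stream_obj->next_seq) {
>         # process each sequence as before
>     }
> }
>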
> I'll try to look into it when I can; if you could file a bug on this
> it would help track it:
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Oct 30, 2007, at 3:51 AM, <snaphit at planet.nl> wrote:
>
> > This doesn't seem to work; it still keeps using more memory. I had
> > already tried it once and it didn't seem to make a difference, but I
> > gave it another try. As I said, it doesn't solve the problem.
> >
> >
> > -----Original Message-----
> > From: Chris Fields [mailto:cjfields at uiuc.edu]
> > Sent: Mon 10/29/2007 5:06 PM
> > To: snaphit at planet.nl
> > Cc: Bioperl-l at lists.open-bio.org
> > Subject: Re: [Bioperl-l] get_Stream_by_gi: Memory going up every call.
> >
> > It may be based on the mode (set by -retrievaltype) in which the
> > sequences are being retrieved and parsed. The Bio::DB::WebDBSeqI
> > module has the documentation for this parameter. If you are making
> > tons of calls to get_Seq*/get_Stream* methods it may lead to
> > substantial increases in memory until the child processes finish up
> > parsing each data stream.
> >
> > You can possibly add a wait() in between sequence retrieval calls,
> > or try setting the Bio::DB::GenBank instance to 'tempfile' or
> > 'io_string' (the former always worked faster for me):
> >
> > $self->{gb} = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
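> >
> > Or, as a small untested variation, the other retrieval mode mentioned
> > above just swaps the parameter value:
> >
> > $self->{gb} = Bio::DB::GenBank->new(-retrievaltype => 'io_string');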
> >
> > chris
> >
> > On Oct 26, 2007, at 7:42 AM, Jelle86 wrote:
> >
> > > OK, I stripped a lot, and this is what causes the problem:
> > >
> > > use Bio::DB::GenBank;
> > > sub new {
> > >     my $invocant = shift;
> > >     my $class = ref($invocant) || $invocant;
> > >     my $self = {@_};
> > >     $self->{gb} = Bio::DB::GenBank->new();
> > >     bless $self, $class;
> > >     return $self;
> > > }
> > >
> > > sub streamQuery {
> > >     my $self = shift;
> > >     my $stream_obj = $self->{gb}->get_Stream_by_gi($self->{ids});
> > >     while (my $seq_obj = $stream_obj->next_seq) {
> > >         # sequences are read but not otherwise processed
> > >     }
> > > }
> > >
> > > Both subs (new and streamQuery) are called several times with a new
> > > accession list. Removing the while loop uses a bit less memory, but
> > > the memory usage still goes up.
> > > --
> > > View this message in context: http://www.nabble.com/get_Stream_by_gi%3A-Memory-going-up-every-call.-tf4689188.html#a13426480
> > > Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign