[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file

Joel Martin j_martin at lbl.gov
Fri Oct 12 18:58:41 UTC 2007


Hello,
    Just a suggestion: is /usr/bin/perl a 64-bit perl?
Even our Sun machines with 72+ GB of memory are, for some reason,
distributed with a 32-bit perl. It can handle large files, but a 32-bit
process can address at most 4 GB, so it would probably hit out of memory
errors if it tried to read a file that size into memory.

% file /usr/bin/perl 
/usr/bin/perl:  ELF 32-bit MSB executable SPARC Version 1, dynamically
linked, stripped

Joel

On Fri, Oct 12, 2007 at 02:34:49PM -0400, Stefan Kirov wrote:
> Kevin Brown wrote:
> >> I downloaded the latest ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
> >> ASN_BINARY/Mammalia/Homo_sapiens.ags.gz and ran gene2xml on 
> >> it to generate Homo_sapiens.xml which is 5821420628 bytes.  I 
> >> cannot parse this file with Bio::ASN1::EntrezGene, even on a 
> >> machine with 256GB of memory.  I get a simple "Out of memory" 
> >> output even with the following code:
> >>
> >> #!/usr/bin/perl
> >> use strict;
> >> use Bio::ASN1::EntrezGene;
> >>    my $parser = Bio::ASN1::EntrezGene->new('file' => "Homo_sapiens.xml");
> >>    while(my $result = $parser->next_seq)
> >>    {
> >>    }
> >>     
> >
> > I think most systems have a per-process memory limit (either hardcoded
> > in the OS or configurable, depending on the OS), and IIRC most of the IO
> > handlers for BioPerl load the entire file contents into memory to process
> > them.  Some of the IO parsers have been changed recently (and a new one
> > added for blast) so that they pull into memory only as much as they need
> > to process the next result, rather than the whole file in one go.
> >   
> The file is approx. 6GB, so on a 256GB machine this is not going to
> create any problem. I think this might be a problem with deep, poorly
> controlled recursion.
> Stefan
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >   
> 


