[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file
Stefan Kirov
stefan.kirov at bms.com
Fri Oct 12 14:34:49 EDT 2007
Kevin Brown wrote:
>> I downloaded the latest
>> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_BINARY/Mammalia/Homo_sapiens.ags.gz
>> and ran gene2xml on it to generate Homo_sapiens.xml, which is
>> 5821420628 bytes. I
>> cannot parse this file with Bio::ASN1::EntrezGene, even on a
>> machine with 256GB of memory. I get a simple "Out of memory"
>> output even with the following code:
>>
>> #!/usr/bin/perl
>> use strict;
>> use Bio::ASN1::EntrezGene;
>> my $parser = Bio::ASN1::EntrezGene->new('file' =>
>> "Homo_sapiens.xml");
>> while(my $result = $parser->next_seq)
>> {
>> }
>>
>
> I think most systems have a per process memory limit (either hardcoded
> in the OS or configured depending on the OS) and IIRC most of the IO
> handlers for BioPerl load entire file contents into memory to process
> them. Some of the IO parsers have been changed recently (a new one
> added for blast) so that it only pulls into memory as much as it needs
> to process the next result rather than the whole file in one shebang.
>
The file is approx. 6GB, so on a 256GB machine its size alone is not
going to create any problem. I think this might be a deep, poorly
controlled recursion problem.
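
To rule out the per-process limit mentioned above before chasing the
recursion, something like the following could be run in the same shell
that launches the script. This is only a rough sketch: it assumes a
bash-like shell where "ulimit -v" and "ulimit -d" are supported, and
report_limits is a name made up here for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of BioPerl): print the per-process limits
# that can trigger "Out of memory" long before physical RAM is exhausted.
report_limits() {
    # "unlimited" means the shell imposes no cap of that kind.
    echo "virtual memory (kB): $(ulimit -v)"
    echo "data segment (kB): $(ulimit -d)"

    # A perl built with 4-byte pointers (32-bit) cannot address more than
    # about 4GB no matter how much memory the machine has. ptrsize='8'
    # indicates a 64-bit build. Skipped if perl is not on the PATH.
    if command -v perl >/dev/null 2>&1; then
        perl -V:ptrsize
    fi
}

report_limits
```

If both limits come back as "unlimited" and perl reports 8-byte
pointers, a per-process cap is unlikely to be the cause.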
Stefan