[Bioperl-l] Memory not sufficient when storing human chromosome 1 in BioSQL

Torsten Seemann torsten.seemann at infotech.monash.edu.au
Fri Jul 4 06:54:29 UTC 2008


>> (ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_01/hs_ref_chr1.gbk.gz) I
>> receive the error message 'Out of memory'. This takes about one hour. My

> Have you tried just loading the sequence into memory using Bio::SeqIO?  The
> problem may be the size of the file itself.

# gunzip -l hs_ref_chr1.gbk.gz
         compressed        uncompressed  ratio uncompressed_name
           95180772           296770940  67.9% hs_ref_chr1.gbk
# zgrep -c ^LOCUS  hs_ref_chr1.gbk.gz
49
# zgrep -c -i '^     [A-Z-]'  hs_ref_chr1.gbk.gz
25853

It's about 300 MB of ASCII in 49 records, with roughly 26,000 features.
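
The two zgrep counts above each decompress the file separately; a single
pass could collect both. A minimal sketch, assuming awk is available and
that feature keys always start after exactly five spaces (the same
assumption the zgrep pattern makes). count_gbk is a made-up helper name,
not part of any toolkit:

```shell
# Count records and feature lines in one pass over GenBank text on stdin.
# count_gbk is a hypothetical helper, named here only for illustration.
count_gbk() {
    awk '
        /^LOCUS/          { records++ }    # one LOCUS line opens each record
        /^     [A-Za-z-]/ { features++ }   # feature keys follow five spaces
        END { printf "%d records, %d features\n", records, features }
    '
}

# Usage, assuming the file is in the current directory:
#   gunzip -c hs_ref_chr1.gbk.gz | count_gbk
```

Reading from stdin keeps the helper independent of how the file is
decompressed.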

I'm not sure how much that would expand once everything is wrapped in
Bio::SeqFeature-like objects. A factor of 4, perhaps?
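
To put a number on that guess, here is the arithmetic as a small sketch.
The byte count comes from the gunzip -l output above; the factor of 4 is
only the guess, not a measurement, so treat the result as a rough order
of magnitude:

```shell
# Rough in-memory estimate: uncompressed size times an assumed expansion factor.
uncompressed_bytes=296770940   # "uncompressed" column of gunzip -l
factor=4                       # assumed object overhead, a guess, not measured
estimate_mb=$(( uncompressed_bytes * factor / 1000000 ))
echo "about ${estimate_mb} MB"   # prints "about 1187 MB"
```

That is roughly 1.2 GB, if the guess is anywhere near right.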

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Dept. Microbiology, Monash University


