[Bioperl-l] Out of memory errors running Bio::ASN1::EntrezGene against latest Homo_sapiens.ags file

Susan Wilson smwilson at hpc.unm.edu
Mon Oct 15 11:08:55 EDT 2007


Mingyi,

Thank you very much for your advice.  The text ASN file is 1/4 the
size of the (evil, evil) XML file, and parsing it ran just fine.  We
are still pursuing a 64-bit perl on our 256GB server, and I will let
you know how it works.

Thanks.
Susan



On Oct 12, 2007, at 1:06 PM, Mingyi Liu wrote:

> BTW, here's the syntax from one of my messages last year about how
> to convert the compressed binary ASN format NCBI provides to the
> text ASN format my module (or Stefan's SeqIO::entrezgene) expects
> (the -x switch does the trick, overriding the default option to
> produce XML output):
>
> # Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI
> my $parser = Bio::ASN1::EntrezGene->new(
>     'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
>
> Same syntax should be used when you're using SeqIO (thus  
> SeqIO::entrezgene).
>
> BTW, text ASN is both smaller and faster to parse than XML format.
>
> Best,
>
> Mingyi
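
A minimal sketch of the approach quoted above, for anyone who wants to
try it end to end.  The helper names gene2xml_pipe and
count_gene_records are hypothetical (they are not part of any module);
the gene2xml flags are copied verbatim from Mingyi's message, and the
next_seq loop follows the usual Bio::ASN1::EntrezGene usage.  It
assumes gene2xml is on the PATH and the module is installed:

```perl
use strict;
use warnings;

# Build the piped command string that is passed as the 'file' argument.
# The flags (-i, -c, -x, -b) are taken verbatim from the quoted message.
sub gene2xml_pipe {
    my ($ags_gz) = @_;
    return "gene2xml -i $ags_gz -c -x -b |";
}

# Walk the text ASN stream record by record and count the records.
sub count_gene_records {
    my ($source) = @_;
    require Bio::ASN1::EntrezGene;    # loaded lazily; must be installed
    my $parser = Bio::ASN1::EntrezGene->new('file' => $source);
    my $count  = 0;
    while (my $record = $parser->next_seq) {
        $count++;                     # replace with real per-record work
    }
    return $count;
}

# Usage (needs gene2xml and the downloaded file, so left commented out):
# print count_gene_records(gene2xml_pipe('Homo_sapiens.ags.gz')), "\n";
```

Keeping the command string in its own helper makes it easy to swap in
an already-converted text ASN file name instead of the pipe.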



