[Biopython] Parsing GB seq files with BioPython into BioSQL

Tue Mar 26 13:08:26 UTC 2013

Hi,

I am parsing GenBank files for microbial genomes and loading the
sequences and annotations into a BioSQL database.

The program I have is quite simple (the same as the example given online at
http://biopython.org/wiki/BioSQL#Loading_Sequences_into_a_database).
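
A minimal sketch of such a loader, following the pattern in the wiki
example, might look like the code below. The MySQL connection settings, the
sub-database name "microbial_genomes" and the file name "genome.gbk" are
placeholder assumptions for illustration, not details from the actual script.

```python
def load_records(server, db, records):
    """Load an iterator of SeqRecord objects into a BioSQL sub-database.

    `records` is consumed by db.load(); the commit makes the inserts
    permanent. Returns the number of records loaded.
    """
    count = db.load(records)
    server.commit()
    return count


def main():
    # Biopython imports are kept local so load_records() can be used
    # (and tested) without a Biopython installation.
    from Bio import SeqIO
    from BioSQL import BioSeqDatabase

    # Placeholder connection settings (assumptions): adjust to your server.
    server = BioSeqDatabase.open_database(
        driver="MySQLdb",
        user="root",
        passwd="",
        host="localhost",
        db="bioseqdb",
    )

    # Hypothetical sub-database name, chosen for illustration only.
    db = server.new_database(
        "microbial_genomes", description="Microbial genomes from GenBank"
    )

    # Hypothetical input file; SeqIO.parse yields one SeqRecord at a time.
    records = SeqIO.parse("genome.gbk", "genbank")
    count = load_records(server, db, records)
    print("Loaded %i records" % count)
```

Calling main() runs the load against a database that already has the BioSQL
schema installed. SeqIO.parse returns an iterator, but each SeqRecord it
yields still holds one full sequence together with all of its features.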

The issue is that each record is huge once it is loaded into memory. Some
genomes use up the entire 32 GB of RAM plus 32 GB of swap.

Does anyone have suggestions on how to make this process more efficient?

Thanks,
Shyam