[Biopython] Parsing GB seq files with BioPython into BioSQL

Shyam Saladi saladi at caltech.edu
Tue Mar 26 13:08:26 UTC 2013


Hi,

I am parsing GenBank files for microbial genomes and loading the
sequences and annotations into a BioSQL database.

The program I have is quite simple (the same as the one given online at
http://biopython.org/wiki/BioSQL#Loading_Sequences_into_a_database).
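
For reference, a loader of this kind might look roughly like the sketch
below. It is only an illustration, not the exact script in use: the function
name load_genbank, the file name, the namespace, and the connection settings
are all placeholders, and it assumes a BioSQL schema already exists in the
target database.

```python
def load_genbank(genbank_path, namespace, **connection):
    """Load every record of a GenBank file into a new BioSQL sub-database.

    Hypothetical helper for illustration; `connection` is passed straight
    to BioSeqDatabase.open_database (driver, user, passwd, host, db, ...).
    """
    # Imported lazily so that defining the function does not require
    # Biopython or a database driver to be installed.
    from Bio import SeqIO
    from BioSQL import BioSeqDatabase

    server = BioSeqDatabase.open_database(**connection)
    try:
        # Creates the namespace; this fails if it already exists.
        db = server.new_database(namespace)
        # SeqIO.parse yields SeqRecord objects one at a time, and
        # db.load consumes that iterator and returns the record count.
        count = db.load(SeqIO.parse(genbank_path, "genbank"))
        server.commit()
    finally:
        server.close()
    return count


# Example call (assumed MySQL settings, not taken from the original post):
# load_genbank("genome.gbk", "microbes", driver="MySQLdb",
#              user="root", passwd="", host="localhost", db="bioseqdb")
```

Note that although the parser is an iterator, each SeqRecord (the full
sequence plus all of its features) is still held in memory while it is
being loaded.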

The issue is that each record, once loaded into memory, is huge. Some genomes
use up the entire 32 GB of RAM plus 32 GB of swap.

Does anyone have suggestions on how to make this process more efficient?

Thanks,
Shyam


