[BioSQL-l] trouble cramming genbank

Hilmar Lapp hlapp@gnf.org
Tue, 29 Oct 2002 12:56:52 -0800


Which version of mysql and which version of biosql/bioperl-db are you using?

It sounds like re-organizing the indexes may be taking up much of the time (your seqfeature table should be about 5-10x bigger than bioentry, let alone seqfeature_location and the qualifier association tables ...). You could drop certain indexes and re-create them after your upload has finished. The consequences of doing this depend on the loading code: if it does look-ups, or if something relies on unique key (UK) failures being thrown, then you should at least keep the UKs.

I can tell you that with the main trunk version of bioperl-db you are going to need all UKs, but you may do fine without the other keys. (If you drop indexes that are being used for look-ups, the symptom you'll see is probably an even more dramatic slowdown.) With the main trunk version of biosql, you could also drop all the foreign key (FK) constraints and re-create them after loading has finished.
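
To make the drop-and-recreate idea concrete, here is a rough sketch. It is not the biosql schema and not what load_seqdatabase.pl does: the table demo_feature, its columns, and the index demo_feature_type_idx are made up for illustration, and it runs against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained. On MySQL the statements would be along the lines of ALTER TABLE ... DROP INDEX ... and CREATE INDEX ... ON ..., issued against whichever non-unique indexes your schema version actually has. The point is only the ordering: keep the UK so duplicate detection still works, drop the secondary index, load, then rebuild.

```python
import sqlite3


def make_db():
    """Create a hypothetical table with one unique key and one secondary index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_feature ("
        " id INTEGER PRIMARY KEY,"
        " accession TEXT NOT NULL,"
        " feature_rank INTEGER NOT NULL,"
        " type TEXT,"
        " UNIQUE (accession, feature_rank))"  # the UK: kept during loading
    )
    # Non-unique index used only for later queries: safe to drop while loading.
    conn.execute("CREATE INDEX demo_feature_type_idx ON demo_feature (type)")
    return conn


def bulk_load(conn, rows):
    """Load rows with the secondary index dropped, then rebuild it.

    Returns the number of rows skipped because they violated the unique key.
    """
    cur = conn.cursor()
    cur.execute("DROP INDEX IF EXISTS demo_feature_type_idx")
    skipped = 0
    for row in rows:
        try:
            cur.execute(
                "INSERT INTO demo_feature (accession, feature_rank, type)"
                " VALUES (?, ?, ?)",
                row,
            )
        except sqlite3.IntegrityError:
            # The UK is still in place, so duplicates are still detected.
            skipped += 1
    # Rebuild the secondary index once, after all rows are in.
    cur.execute("CREATE INDEX demo_feature_type_idx ON demo_feature (type)")
    conn.commit()
    return skipped


def index_names(conn):
    """List the indexes currently defined on the hypothetical table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master"
        " WHERE type = 'index' AND tbl_name = 'demo_feature'"
    )
    return sorted(r[0] for r in rows)


if __name__ == "__main__":
    conn = make_db()
    rows = [
        ("AB000001", 1, "CDS"),
        ("AB000001", 2, "gene"),
        ("AB000001", 1, "CDS"),  # duplicate of the first row
    ]
    print("skipped:", bulk_load(conn, rows))
    print("indexes:", index_names(conn))
```

Whether this actually helps depends on the caveats above: an index that the loader itself uses for look-ups should stay in place during the load.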

	-hilmar

> -----Original Message-----
> From: Josiah Altschuler [mailto:jaltschuler@CGR.Harvard.edu]
> Sent: Tuesday, October 29, 2002 9:42 AM
> To: biosql-l@open-bio.org
> Subject: [BioSQL-l] trouble cramming genbank
> 
> 
> Hi, I have a MySQL bioentry row count of around 14000000 now, and I am
> continuing to insert genbank entries using load_seqdatabase.pl.  However, it
> has slowed down drastically to insertion of only two or three entries/second.
> I was wondering if anyone has experienced this, and could give some advice
> on how to speed things up?
> 
> Thank you,
> Josiah
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l@open-bio.org
> http://open-bio.org/mailman/listinfo/biosql-l
>