[BioSQL-l] [Bioperl-l] postgres 8.3 - load_seqdatabase.pl / swissprot
Hilmar Lapp
hlapp at gmx.net
Sat Mar 22 20:01:51 UTC 2008
Forgot to respond to this:
On Mar 21, 2008, at 5:43 PM, Erik wrote:
> It took two hours to load 26504 records (7%) of uniprot_sprot.dat
> (is it expected to be so slow?)
When I last loaded those regularly it was a bit faster (~5 seqs/s),
but your rate is in a ballpark that wouldn't raise a red flag for me.
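As a rough sanity check on those numbers, here is a small sketch that
turns the figures quoted above (26504 records, two hours, 7%) into a
rate and a projected load time. The function names are made up for
illustration, and the total record count is only inferred from the
reported "7%", so treat the result as an estimate.

```python
def load_rate(records: int, seconds: float) -> float:
    """Average throughput in records per second."""
    return records / seconds


def hours_remaining(records: int, seconds: float, fraction_done: float) -> float:
    """Projected hours left, assuming the rate stays constant.

    The total number of records is inferred from fraction_done,
    so the projection is only as good as that percentage.
    """
    estimated_total = records / fraction_done
    left = estimated_total - records
    return left / load_rate(records, seconds) / 3600


# Figures as reported in the quoted message.
records_loaded = 26504
elapsed = 2 * 60 * 60   # two hours, in seconds
fraction = 0.07         # "7%" of uniprot_sprot.dat

rate = load_rate(records_loaded, elapsed)
remaining = hours_remaining(records_loaded, elapsed, fraction)

print(f"observed rate: {rate:.1f} seqs/s")
print(f"projected time remaining: {remaining:.1f} h")
```

With the figures quoted, this works out to roughly 3.7 seqs/s and
about 26 to 27 more hours at that rate, so somewhat slower than the
~5 seqs/s mentioned above but in the same range.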
BTW you can make it print statistics using the --logchunk N option,
where N is the number of seqs after which you want the current count
and the #recs/s printed.
You may get it to run faster if you tune the database (e.g., make
sure there is enough memory for index reorganization, put the
transaction log and the tablespace datafiles on separate disks, etc.).
Fiddling with the query optimizer probably has little effect, as
almost all queries are simple lookups or inserts.
All that said, the strength of load_seqdatabase.pl isn't speed. It
doesn't make use of any bulk upload optimizations, and therefore the
initial load of a very large database will take its time. The power
is more in subsequent updates where you can configure what you want
to happen, and during which the database is never in an inconsistent
state, so it can run in the background.
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================