[BioSQL-l] Genbank loading time

Richard Holland holland at eaglegenomics.com
Tue Jan 27 22:57:59 UTC 2009


It would depend on the toolkit you use. BioWarehouse is a complete API,
whereas BioSQL is just a schema and the way in which it is populated
(and therefore how long that takes) depends on your toolkit.

Currently I'm aware of loaders existing for BioJava, BioPerl, and
possibly also BioPython. However, each of them loads the same data in
subtly different ways, so they can't be directly compared in terms of
which one is faster than the other.
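
For anyone who wants to measure this for their own toolkit, here is a
minimal timing sketch in Python. It is only an illustration: the names
time_load and toy_sqlite_loader are made up for this example, and the
stand-in loader writes to a toy in-memory SQLite table, not to the BioSQL
schema. Swap in a call to your toolkit's real loader to get meaningful
numbers.

```python
import sqlite3
import time


def time_load(loader, records):
    """Run loader(records) and return (record_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = loader(records)
    elapsed = time.perf_counter() - start
    return count, elapsed


def toy_sqlite_loader(records):
    """Hypothetical stand-in loader.

    Inserts (accession, sequence) pairs into an in-memory SQLite table.
    This is NOT the BioSQL schema; replace it with your toolkit's loader.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (accession TEXT PRIMARY KEY, seq TEXT)")
    with conn:  # single transaction, committed when the block exits
        conn.executemany("INSERT INTO entry VALUES (?, ?)", records)
    count = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    # Synthetic records, purely for exercising the harness.
    fake_records = [(f"ACC{i:06d}", "ACGT" * 25) for i in range(10_000)]
    count, elapsed = time_load(toy_sqlite_loader, fake_records)
    print(f"{count} records in {elapsed:.3f} s")
```

Because each toolkit maps the same GenBank data onto the schema
differently, timings from a harness like this say how long one loader
takes on one setup rather than which toolkit is faster in general.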

I vaguely remember seeing some performance figures for the
BioJava/Genbank/BioSQL combination somewhere, but it's been a while! I'm
not sure where they were documented though - I certainly haven't got
them written down anywhere. Mark Schreiber might know as he definitely
did some testing of this - Mark, can you remember what the figures were
for BioJava?

As for BioPerl/BioPython/etc. I expect their respective project authors
will respond to this thread accordingly with the figures from their own
domains!

cheers,
Richard

gwu wrote:
> Hi Everyone,
> 
> I recently visited the BioWarehouse web site and the document shows
> loading the whole Genbank into their database takes the data loader 68
> hours for MySQL, and 27.5 hours for Oracle. So I wonder if there is a
> similar test done with BioSQL?
> 
> Gang Wu
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
> 

-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


