[Biopython-dev] Storing Bio.SeqIO.index() offsets in SQLite
Peter
biopython at maubp.freeserve.co.uk
Wed Jun 9 10:55:23 EDT 2010
On Wed, Jun 9, 2010 at 9:55 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>
> Having had a quick look, they are using SQLite3 in much the
> same way as I was initially. They create the index before loading
> (rather than after loading) and they use a single insert per
> offset (rather than using a batch in a transaction or the
> executemany method). I'm pretty sure from my experiments
> those changes would speed up screed's loading time a lot
> (probably in line with the speedup I achieved).
>
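For illustration only (this is not screed's or Biopython's actual code),
the two changes described above might look roughly like this with
Python's sqlite3 module. The table name, column names and helper
functions are all hypothetical; the point is just one executemany call
inside a single transaction, with the index built after loading:

```python
import sqlite3


def build_offset_index(records, db_path=":memory:"):
    """Store (record_id, file_offset) pairs in SQLite.

    Hypothetical helper: the schema and names are made up for this sketch.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE offset_data (record_id TEXT, file_offset INTEGER)"
    )
    # Batch all inserts into one transaction via executemany, instead of
    # issuing (and committing) one INSERT per offset.
    with con:
        con.executemany(
            "INSERT INTO offset_data (record_id, file_offset) VALUES (?, ?)",
            records,
        )
    # Build the index only after the bulk load, so SQLite does not have
    # to update it on every insert.
    con.execute(
        "CREATE UNIQUE INDEX record_id_index ON offset_data (record_id)"
    )
    con.commit()
    return con


def get_offset(con, record_id):
    """Look up the stored file offset for one record identifier."""
    row = con.execute(
        "SELECT file_offset FROM offset_data WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return row[0]
```

Here records can be any iterable of (identifier, offset) tuples, for
example a generator that scans a FASTQ file, so the offsets need not all
be held in memory first.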
Do you fancy trying this version of screed? It seems much
faster on medium-sized FASTQ files:
http://github.com/peterjc/screed/tree/sqlite-tweaks
I'm still running a few tests myself, but will pass this on to
the screed team unless I find some regressions.
Peter