[Biopython-dev] Storing Bio.SeqIO.index() offsets in SQLite
Brent Pedersen
bpederse at gmail.com
Wed Jun 9 15:56:27 UTC 2010
On Wed, Jun 9, 2010 at 7:55 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Wed, Jun 9, 2010 at 9:55 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>>
>> Having had a quick look, they are using SQLite3 in much the
>> same way as I was initially. They create the index before loading
>> (rather than after loading) and they use a single insert per
>> offset (rather than using a batch in a transaction or the
>> executemany method). I'm pretty sure from my experiments
>> those changes would speed up screed's loading time a lot
>> (probably in line with the speed up I achieved).
>>
>
> Do you fancy trying this version of screed? It seems much
> faster on medium sized FASTQ files:-
>
> http://github.com/peterjc/screed/tree/sqlite-tweaks
>
> I'm still running a few tests myself, but will pass this on to
> the screed team unless I find some regressions.
>
> Peter
>
Not too much difference:
screed
------
create: 666.381
search: 51.839
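
For reference, the two SQLite changes Peter describes above (batching the inserts with executemany inside one transaction, and creating the index only after the rows are loaded) might look roughly like the sketch below. This is only an illustration using Python's standard sqlite3 module. The table name, column names, and function names are made up for the example and are not screed's or Biopython's actual schema or code.

```python
import sqlite3


def build_offset_index(con, offsets):
    """Bulk-load (record_id, file_offset) pairs, then index them.

    Table and column names here are hypothetical, chosen for illustration.
    """
    con.execute(
        "CREATE TABLE offset_data (record_id TEXT, file_offset INTEGER)"
    )
    # Using the connection as a context manager commits once on exit,
    # so all the inserts share a single transaction instead of one each.
    with con:
        con.executemany(
            "INSERT INTO offset_data (record_id, file_offset) VALUES (?, ?)",
            offsets,
        )
    # Build the index after loading, so SQLite does not have to
    # update it on every individual insert.
    with con:
        con.execute(
            "CREATE UNIQUE INDEX record_id_index ON offset_data (record_id)"
        )


def lookup_offset(con, record_id):
    """Return the stored file offset for record_id, or None if absent."""
    row = con.execute(
        "SELECT file_offset FROM offset_data WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    build_offset_index(con, [("seq1", 0), ("seq2", 120), ("seq3", 257)])
    print(lookup_offset(con, "seq2"))
    con.close()
```

Whether this ordering helps in practice depends on the file size and the SQLite settings, so it is worth timing both variants on your own data.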