[Bioperl-l] Bio::Index::Fastq - Interface for indexing (multiple) fastq files failure

Wed Apr 7 18:09:11 UTC 2010

On Wed, Apr 7, 2010 at 6:56 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Peter wrote:
>> Hi Chris,
>>
>> Does Mark's SQLite_DBM already have an SQLite schema defined? I'd
>> ideally like us to agree something shared with other Bio* libraries (a new
>> OBDA standard using SQLite instead of BDB). I was thinking something
>> along these lines if we want to support an index for multiple files:
>>
>> * meta - table with string key/values (in particular to hold a schema version
>> number, plus perhaps the tool which built the index)
>>
>> * offsets - table with entry accessions, file number, file offset
>>
>> * files - table with filenames, file type (e.g. FASTA), datestamp
>> (so we can spot if the index is older than the file and needs to be
>> updated), perhaps other things like if the file is compressed (gzip,
>> bz2, ...).
>>
>> If some kind of shared SQLite index schema (whatever it looks like)
>> does seem like a good idea to you guys (BioPerl), should we move
>> this discussion over to open-bio-l at lists.open-bio.org?
>>
>> Regards,
>>
>> Peter
>
> I think this is a good idea for OBDA-based modules, but Bio::Index::*
> modules aren't OBDA-compliant (at least that I know of); it's a simple
> key-value hash pairing I believe.  Bio::Flat* are OBDA-compliant,
> though, so it's worth exploring this.
>
> chris

I didn't appreciate BioPerl had more than one indexing back end
(Bio::Index versus Bio::Flat).
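
To make the three-table proposal quoted above (meta, offsets, files) a bit
more concrete, here is a rough sketch using Python's standard sqlite3 module.
Every table name, column name and helper function in it is hypothetical and
chosen only for illustration; it is not an agreed OBDA schema, and it says
nothing about what Mark's SQLite_DBM actually does.

```python
import sqlite3

# Hypothetical schema: all names below are illustrative assumptions,
# not an agreed OBDA standard.
SCHEMA = """
CREATE TABLE meta (
    key   TEXT PRIMARY KEY,   -- e.g. 'schema_version', 'built_by'
    value TEXT
);
CREATE TABLE files (
    file_number INTEGER PRIMARY KEY,
    filename    TEXT NOT NULL,
    file_type   TEXT,         -- e.g. 'FASTA', 'FASTQ'
    mtime       REAL,         -- file datestamp recorded at indexing time
    compression TEXT          -- e.g. 'gzip', 'bz2', or NULL if uncompressed
);
CREATE TABLE offsets (
    accession   TEXT PRIMARY KEY,
    file_number INTEGER NOT NULL REFERENCES files(file_number),
    file_offset INTEGER NOT NULL
);
"""


def create_index(path=":memory:"):
    """Create an empty index database and record a schema version."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO meta (key, value) VALUES ('schema_version', '1')")
    conn.commit()
    return conn


def add_file(conn, filename, file_type, mtime, compression=None):
    """Register one indexed file and return its file number."""
    cur = conn.execute(
        "INSERT INTO files (filename, file_type, mtime, compression) "
        "VALUES (?, ?, ?, ?)",
        (filename, file_type, mtime, compression),
    )
    return cur.lastrowid


def add_entry(conn, accession, file_number, file_offset):
    """Record where one entry starts within a registered file."""
    conn.execute(
        "INSERT INTO offsets (accession, file_number, file_offset) "
        "VALUES (?, ?, ?)",
        (accession, file_number, file_offset),
    )


def lookup(conn, accession):
    """Return (filename, file_offset) for an accession, or None if absent."""
    return conn.execute(
        "SELECT f.filename, o.file_offset FROM offsets AS o "
        "JOIN files AS f ON f.file_number = o.file_number "
        "WHERE o.accession = ?",
        (accession,),
    ).fetchone()


def is_stale(conn, file_number, current_mtime):
    """True if the file was modified after its datestamp was recorded."""
    (indexed_mtime,) = conn.execute(
        "SELECT mtime FROM files WHERE file_number = ?", (file_number,)
    ).fetchone()
    return current_mtime > indexed_mtime
```

With something like this, a reader would call lookup() to get a filename and
byte offset, seek to that offset, and parse a single record. The is_stale()
check is one possible way to spot an index that is older than its file.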

Peter