[Biojava-dev] flatfile indexing woes
Matthew Pocock
matthew_pocock@yahoo.co.uk
Wed, 18 Sep 2002 23:49:32 +0100
Hi Keith,
I'll take a look at this. The original indexing mechanism (in seq.io)
was only intended to work with sequence files. The OBDA indexing system
is meant to work for any record-based index file, but we never really put
the smarts in for storing the format type, since there is no agreed-upon
way to identify format types across projects. Anyway, I will check
whether the OBDA indexer is working for EMBL and Swiss-Prot files and
send you a progress report.
Matthew
Keith James wrote:
> I want to create SequenceDBs from the whole of EMBL and the whole of
> SWALL. I've been looking at the OBDA BioJava indexing
> implementations which came out of the hackathon.
>
> I could do with some help getting this working. If I run
> demos/indexing/CreateSPIndex on the current version of SWALL, it
> completes with no errors. However, I get the following index:
>
> config.dat:
> index flat/1
> primary_namespace ID
> secondary_namespaces AC
>
> id_AC.index:
> 21
>
> key_ID.dat:
> 46
>
> and that's it. I imagine this is not correct.
>
> I've just spent 6 hours with a printout of the OBDA spec documenting
> the code, so I'm passing on this for now. There seem to be two (or
> more?) indexing frameworks, and we also have two IndexStore interfaces,
> each with different methods.
>
> I saw a mention of a SleepyCat implementation. Does this exist for
> biojava?
>
> Keith
>
--
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk