[Biojava-dev] flatfile indexing woes

Matthew Pocock matthew_pocock@yahoo.co.uk
Wed, 18 Sep 2002 23:49:32 +0100


Hi Keith,

I'll take a look at this. The original indexing mechanism (in seq.io) 
was only intended to work with sequence files. The OBDA indexing system 
is meant to work for any record-based index file, but we never really put 
the smarts in for storing the format type, since there is no agreed-upon 
way to identify format types across projects. Anyway, I will check whether 
the OBDA indexer is working for EMBL and SP files and send you a progress 
report.

Matthew

Keith James wrote:
> I want to create SequenceDBs from the whole of EMBL and the whole of
> SWALL. I've been looking at the OBDA biojava indexing
> implementations which came out of the hackathon.
> 
> I could do with some help getting this working. If I run
> demos/indexing/CreateSPIndex on the current version of swall it
> completes with no errors. However, I get the following index:
> 
> config.dat:
> index   flat/1
> primary_namespace       ID
> secondary_namespaces    AC
> 
> id_AC.index:
> 21
> 
> key_ID.dat:
> 46
> 
> and that's it. I imagine this is not correct.
> 
> I've just spent 6 hours with a printout of the OBDA spec documenting
> the code, so I'm passing on this for now. There seem to be two (or
> more?) indexing frameworks and we also have two IndexStore interfaces,
> each with different methods.
> 
> I saw a mention of a SleepyCat implementation. Does this exist for
> biojava?
> 
> Keith
> 


-- 
BioJava Consulting LTD - Support and training for BioJava
http://www.biojava.co.uk