[BioPython] Re: BLAST result persistence

Andrew Dalke dalke@acm.org
Sat, 2 Jun 2001 21:55:09 -0600


ISHIKAWA, Masahiro <misi@nias.affrc.go.jp>:
>However, in my case, an output file probably contains
>results for a large number of queries, and it will be
>more than several hundred megabytes in size.
  ...
>Maybe the whole output plain file cannot be loaded into
>memory at a time.
>Thus I need some trick to achieve efficient random
>access to individual result stored on a disk.

One thing you might want to look at is ZODB, which is
part of the Zope package.  (There's a version called
StandaloneZODB which is in beta and only available from
Digital Creations' CVS.)

ZODB is a persistent store for Python data structures.
It acts as if all the data is in memory but underneath
it really stores things on disk, with a memory cache so
performance doesn't suffer.

The only requirement for ZODB is that all the data be
pickleable, and that you do a few special things to
tell it when objects of some data types (such as lists
and dictionaries) have been changed in place.
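
To make those two requirements concrete, here is a rough
sketch.  It is NOT ZODB: it uses the standard library's
shelve module as a stand-in, only because shelve has the
same two constraints (values must be pickleable, and the
store has to be told when a fetched object was modified).
The record layout and the key names are made up for the
example.

```python
import os
import shelve
import tempfile

# Hypothetical stand-in for parsed BLAST results: plain,
# pickleable dicts keyed by query name.
results = {
    "query1": {"hits": ["hitA", "hitB"]},
    "query2": {"hits": ["hitC"]},
}

path = os.path.join(tempfile.mkdtemp(), "blast_results")

# Store each result under its own key; every value is
# pickled separately.
with shelve.open(path) as db:
    for name, record in results.items():
        db[name] = record

# Later: fetch a single record by key without loading
# the others.
with shelve.open(path) as db:
    record = db["query2"]
    # An in-place change to the fetched object is not
    # written back automatically ...
    record["hits"].append("hitD")
    # ... so reassign it to tell the store it changed.
    db["query2"] = record

# Reopen to confirm the change reached the disk.
with shelve.open(path) as db:
    stored_hits = db["query2"]["hits"]
    stored_keys = sorted(db.keys())
```

In ZODB the corresponding step for a mutable attribute is,
as far as I know, setting _p_changed on the persistent
object that holds it; ZODB also adds transactions and the
memory cache, which shelve does not have.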

I've been using it for one project and it has been very
useful and very solid - I've never had the database
crash on me.

The biggest problem is the lack of documentation.
Andrew Kuchling has written up a developer's document,
and his site is a good place for all things ZODB, so
if ZODB sounds appropriate you might want to look at
  http://amk.ca/zodb/

                    Andrew
                    dalke@acm.org