[BioPython] versioning, distribution and webdav

Andrew Dalke dalke@acm.org
Tue, 15 May 2001 04:17:30 -0600


Ewan:
>Lots of us have been thinking about lots of ways of solving this.

I figured, but there's too much going on even in just this part
of the world.  (Don't know whether to put a smiley or a frowny.)

>I wrote a CORBA caching scheme around BioCorba and Bioperl-db, which means
>for each sequence you bring it local only once.

Yep, saw mention of that this evening while scanning the bioperl
archives for this month.  I didn't catch whether it passes the full
record as text, the semantic portion of the record, or just the
subset of the record that bioperl understands.  I'm interested in the
first - passing all information, including the syntax - while I know
bioperl concentrates more on the semantic meaning than on the
presentation.
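
To make the distinction concrete, here's a minimal sketch in Python of
the "bring it local only once" idea applied to the full record text,
syntax and all.  The fetch_raw_record() function is a made-up stand-in
for whatever transport (CORBA, HTTP, ...) actually talks to the
primary server:

    import os

    CACHE_DIR = "record_cache"

    def fetch_raw_record(accession):
        # Hypothetical stand-in for the transport that retrieves the
        # complete flat-file entry, as text, from the primary server.
        raise NotImplementedError

    def get_record(accession):
        """Return the full record text, fetching at most once."""
        path = os.path.join(CACHE_DIR, accession)
        if os.path.exists(path):            # cache hit - no network
            with open(path) as f:
                return f.read()
        text = fetch_raw_record(accession)  # cache miss - fetch once
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, "w") as f:
            f.write(text)
        return text

A scheme that keeps only the parsed, semantic portion would store a
data structure here instead of the raw text, and anything the parser
didn't understand would be lost.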

>However CORBA *will not scale* for shipping EMBL/Swissprot around in a
>nice "only send updates" mode.

That too.

> FreeNet stuff (I'm looking at this as is lincoln stein)

I'm an old-timer - I still think of freenets as free shell accounts.  :)

From what little I understood, I thought they were having scalability
problems because of the fully peer-to-peer model.  That sharing model
isn't needed for this task, since there is only one (or a small
number) of primary servers and the databases don't need a distributed
search mechanism.  For Lincoln's distributed annotation & search
project it is more appropriate.  I think these are different but
related tasks.

>I'd love to know more about WebDAV - is there a spec/good starting page
>somewhere?

www.webdav.org

Mind you, I haven't evaluated any of the code or its usefulness.  It
came to mind because of a mention in Greg Stein's Advogato entry
(http://www.advogato.org/person/gstein/) about having Subversion - a
CVS-like system built on top of WebDAV - handle the 49 GB(!) CVS
repository.  Meaning it's supposed to be able to handle data sets
that large, at least in theory.
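
For a taste of what WebDAV adds to plain HTTP: it defines new methods
such as PROPFIND for querying resource properties.  Here's an untested
sketch (the server name and path are made up) using Python's standard
library:

    import http.client

    conn = http.client.HTTPConnection("dav.example.org")
    body = ('<?xml version="1.0" encoding="utf-8"?>'
            '<propfind xmlns="DAV:"><allprop/></propfind>')
    conn.request("PROPFIND", "/repository/", body=body,
                 headers={"Depth": "1", "Content-Type": "text/xml"})
    resp = conn.getresponse()
    print(resp.status, resp.reason)  # expect "207 Multi-Status"
    print(resp.read().decode())

Since it's all just HTTP underneath, it should pass through proxies
and firewalls the way the CORBA approaches won't.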

                    Andrew
                    dalke@acm.org