[Biopython-dev] Questions about code contributions

Rob Knight rob at spot.colorado.edu
Thu Feb 20 11:12:24 EST 2003


We are setting up a fairly large database project here at CU Boulder (5-10
developers over the next 3 years), and have settled on Python, PostgreSQL
and Zope as our primary development tools. The main goal of the project is
to automate comparative analyses across many disparate taxa, and to
incorporate multiple types of expression and structural data in a
phylogenetic context. Unfortunately, this focus on phylogeny and
expression means that the existing BioSQL project doesn't really meet our
needs.

I am currently deciding whether we should use the Biopython code base. As
noted at the root of the API documentation, the existing code needs to be
cleaned up extensively. Also, there seems to have been very little
activity towards handling phylogeny, mass spec data, and RNA structure,
which are three areas that are critical to us. On the other hand, there
are many useful modules in the Biopython code, such as the GA and Graphics
modules and the parsing framework, that we would like to use and extend.
However, a lot of the code examples in the Biopython Tutorial/Cookbook
seem not to work or are very brittle due to bugs in the underlying code
(these may have been fixed in the CVS version -- I haven't had time to
check).

If we do use Biopython, we would definitely be interested in returning all
our contributions to the community. However, it would only be worth the
time it would take for us to do this (as opposed to starting fresh) if
it's possible for us to reorganize the code significantly.

So, the main questions I have are:

1. What is the process, if any, for suggesting and/or making large-scale
   changes that are not compatible with existing code (e.g. changing the
   module structure, changing inheritance patterns, introducing and using
   new top-level abstract data types)? How much support would there be for
   doing a significant reorganization for, say, a 2.0 release in 2004?
   From Jeff Chang's message to the list earlier today, I get the
   impression that this is already in progress, at least for the parsers.

2. To what extent is the current code base compatible with Jython? Is
   there any general interest in using Biopython with Jython? (This is
   important to us, since we have a Java framework for distributing
   tasks across a cluster, and we may also want to integrate with
   the Mesquite phylogeny package later on.)

3. How many developers are currently actively working on Biopython? (In
   other words, how much can we expect it will benefit us to participate
   in Biopython rather than just writing things on our own?)

I would appreciate any thoughts you have. I'd certainly like to
contribute to the Biopython project, but need to check that it will make
sense for us to integrate our efforts.


Rob Knight

MCD Biology
University of Colorado, Boulder
