[Biopython-dev] PEP8 lower case module names?

Michiel de Hoon mjldehoon at yahoo.com
Fri Nov 2 16:01:35 UTC 2012

Hi everybody,

--- On Thu, 11/1/12, Eric Talevich <eric.talevich at gmail.com> wrote:
> 1. If we're going to change the API substantially, we might
> as well "do it right". Besides our PEP8 non-compliance, there
> are some dark, dusty corners of Biopython that we ought to clean
> up while we're at it -- reorganize the little historical fiefdoms
> into a coherent structure. We'd call it Biopython 2.


> 2. Observing BioPerl and BioRuby, it could make sense to
> split the distribution into multiple, with a sequence- and
> data-oriented "biopython-core" package and separate packages
> for, say, 3D structures ("biopython-struct") and perhaps other
> existing components that have ready maintainers and which the
> "core" of Biopython doesn't rely
> on. I don't think we need to fragment the code base much,
> primarily just extract PDB, SCOP and the other parts that
> depend on NumPy.

This goes against the "coherent structure" in point 1. What is the advantage of splitting the distribution according to whether a module needs NumPy or not? I don't see an advantage to the user, and I don't see an advantage to the developers either. I already feel that we need to install too many packages to get going with Python in bioinformatics: Python itself, NumPy, Matplotlib and its dependencies, Pysam, Cython (needed to compile Pysam), ez_setup, perhaps SciPy, and Biopython. I find this hard to explain to people new to bioinformatics or new to Python. So I would prefer to keep one distribution.

We can be more lenient in terms of dependencies, especially those that are only needed at run time rather than at compile time.
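
To illustrate what a dependency that is only needed at run time could look like, here is a minimal sketch. The helper names optional_import and mean are hypothetical and are not part of Biopython; the idea is just that NumPy is looked up when a function is called, so installing the package never requires it.

```python
import importlib


def optional_import(module_name):
    """Return the named module, or None if it is not installed (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def mean(values):
    """Average a sequence, using NumPy when available and pure Python otherwise."""
    values = list(values)
    numpy = optional_import("numpy")  # looked up at call time, not at install time
    if numpy is not None:
        return float(numpy.mean(values))
    return sum(values) / len(values)
```

With this pattern a missing NumPy only costs speed in the functions that could use it, instead of preventing installation of the whole distribution.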

> 4. Naming: "bio" is clean but might cause problems on
> Windows? (I wouldn't know, nyah); "bio2" is nearly as clean;
> "biopy" follows the numpy/scipy convention.

Any problems on Windows will only occur during a transition period, so I wouldn't worry about that too much. Perhaps we should check if there would be any problems; if they are severe, we could check for an existing Biopython installation in setup.py.
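
A rough sketch of such a check, assuming the Windows concern is a case-insensitive file system on which the old "Bio" and a new "bio" directory would collide. The function names find_existing_package and would_collide are made up for illustration; only standard library calls are used, and how setup.py should react (warn, abort, or remove the old copy) is left open.

```python
import importlib.util
import os


def find_existing_package(name="Bio"):
    """Return the directory of an importable package, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.abspath(list(spec.submodule_search_locations)[0])


def would_collide(existing_dir, new_name="bio"):
    """True if new_name differs from the existing directory name only by case."""
    old_name = os.path.basename(existing_dir)
    return old_name != new_name and old_name.lower() == new_name.lower()
```

In setup.py one could call find_existing_package("Bio") and, if would_collide() is true for the result, print a warning before installing.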

"bio2" would stay with us forever (well, at least until "bio3") and is just plain ugly, especially to new users who are not aware of the transition. Then there is the issue that "bio2" would not be for Python 2 but for Python 3.

The "py" is needed in numpy and scipy because otherwise it would be "num" and "sci", which is too short. On the other hand, "bio" is used as a prefix in lots of words, and can stand on its own. Therefore, hurray for "bio".

> 5. Porting: I, personally, would keep using the old Biopython for
> everything that's meant to run on Python 2, which is, currently,
> everything. Biopython2 running on Python 3 would give me an
> excuse to start using Python 3 for new code. Keeping these 
> separate would be more difficult if the lowercasing were done
> under the same "Bio" namespace.

Yes, that makes sense.

