[Biopython-dev] PEP8 lower case module names?

Eric Talevich eric.talevich at gmail.com
Fri Nov 2 02:47:56 UTC 2012

On Thu, Nov 1, 2012 at 2:46 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Thu, Nov 1, 2012 at 6:10 PM, Eric Talevich <eric.talevich at gmail.com>
> wrote:
> > 2. Observing BioPerl and BioRuby, it could make sense to split the
> > distribution into multiple, with a sequence- and data-oriented
> > "biopython-core" package and separate packages for, say, 3D structures
> > ("biopython-struct") and perhaps other existing components that have
> > ready maintainers and which the "core" of Biopython doesn't rely on.
> > I don't think we need to fragment the code base much, primarily just
> > extract PDB, SCOP and the other parts that depend on NumPy. On GitHub,
> > these repositories would still be under the biopython organization name.
>
> A clearer divide would be good - something we have at some level
> already, along the lines of with and without NumPy. However, given
> the still unclear future for Python packaging I'm not quite so sure
> if we can/should go all the way to separate packages. Perhaps I
> am being unduly worried by the concerns in the NumPy/SciPy
> community? After all, we have no Fortran code!

My own use of packaging features and setuptools in particular is pretty
primitive, so I'm not sure what the risks are.

Having a separate repository for structure-related code would make it much
easier for me and João to hack on a Bio.PDB successor, I think. It would
also be nice to have a dependency-free "core" and then a bit more
flexibility in using dependencies for add-on packages -- there are a lot of
good existing libraries for structural biology, for instance, and since
performance is so important there we might even want to start using Cython
for some of that code. Then there's Lenna's pure-Python mmCIF parser which
depends on PLY.

> > 5. Porting: I, personally, would keep using the old Biopython for
> > everything that's meant to run on Python 2, which is, currently,
> > everything. Biopython2 running on Python 3 would give me an excuse to
> > start using Python 3 for new code. Keeping these separate would be more
> > difficult if the lowercasing were done under the same "Bio" namespace.
> >
> > Thoughts?
>
> As noted above, I'm on board with planning a Biopython 2 requiring Python 3
> or later. I would regard this as effectively forking from the current code
> base, porting individual modules on a case-by-case basis (doing a final
> 2to3 conversion manually as part of this). The code could be shared as a
> series of 'alpha' level releases for early testing - assuming we want to
> make some releases, particularly for Windows where fewer potential testers
> would have all the compilers set up to follow the repository.

Sounds good to me.

> However, if we do that, we would still support Biopython 1.xx under
> Python 3 as well (via 2to3 as we are now, currently 'beta' level support)
> for some time in parallel (although likely not getting major new features -
> just bug fixes and, if required, updates for format changes).

Sure. I'm assuming it will be some time before we have a Biopython2 we're
happy with, sorting out the module organization, dusting off old code,
dealing with module-specific dependencies and so on, and I'm OK with that.

> Is there enough enthusiasm now to start planning what we'd change for
> a (potentially Python 3 only) Biopython 2 yet?
>
> Peter

Maybe a good time to create the initial fork would be after we've merged
the latest GSoC work and any feasible long-running branches. The
Bio.PDB-related GSoC work, on the other hand, seems to be held up
specifically because we're afraid to muck with the existing sub-package too
much with unstable new code, and I can imagine it would be easier to land
it in a new namespace.

