[Biopython-dev] PEP8 lower case module names?

Eric Talevich eric.talevich at gmail.com
Mon Oct 22 17:53:55 EDT 2012


On Mon, Oct 22, 2012 at 1:08 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Fri, Sep 28, 2012 at 11:50 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> > On Thu, Sep 20, 2012 at 10:08 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >> On Sun, Sep 16, 2012 at 1:34 PM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
> >>>>
> >>>> I guess we need to have a little hack with the 2to3 library and
> >>>> try defining our own custom fixer for the imports...
> >>>
> >>> I've made a start at this - the easy part seems to work :)
> >>>
> >>> https://github.com/peterjc/biopython/commits/py3lower
> >>>
> >>> ...
> >
> > The code to do this lower case name mangling remains
> > quite a spaghetti-like mess in do2to3.py, but it now works
> > enough to pass the test suite (with some but not all 3rd
> > party dependencies installed) under Linux and my Mac
> > OS X machine (where like Windows I have a case
> > insensitive file system).
> >
> > ...
> >
> > So this idea to adopt PEP8 lower case module names
> > as part of supporting Python 3 appears to be technically
> > viable.
>
> Has anyone else tried this branch yet? Has the lower case
> module names under Python 3 idea grown on anyone?
> I think it makes sense in terms of a long term vision - I do
> expect to be primarily working under Python 3 within a
> couple of years.
>
> It occurs to me we can make a partial step in this direction
> with moving to a directory for Bio.Seq, since this could be
> Bio.seq instead. For example, we talked about something
> like this:
>
> Bio.Seq -> Bio.seq
> Bio.SeqRecord -> Bio.seq.record
> Bio.SeqFeature -> Bio.seq.feature
> Bio.SeqUtils -> Bio.seq.utils
> Bio.SearchIO -> Bio.seq.search
>
> I'm not 100% sure where the Bio.SeqIO top level functions
> would belong; either directly under Bio.seq or under
> Bio.seq.record might work.
>


Personally, I've used the variable name "seq" an awful lot, so I'm wary of
using "seq" as a module name. However, reasonable coding style could make
this easy to avoid if we have a "seq" module containing all of Seq,
SeqRecord and SeqFeature (maybe even Alphabet), and "sequtil" containing
standalone functions.

Result:

# Everything you need to build a new sequence record, but not much else
from Bio.seq import Seq, SeqRecord, SeqFeature

# Working with sequence strings
from Bio import sequtil
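
To illustrate the shadowing worry, here is a minimal sketch. The "seq"
module below is a hypothetical stand-in built with types.ModuleType, and
"complement" is a made-up helper; neither is part of Biopython.

```python
import types

# Hypothetical stand-in for a lower-case "seq" module (not a real Biopython module).
seq = types.ModuleType("seq")
seq.complement = lambda s: s.translate(str.maketrans("ACGT", "TGCA"))


def shadowed(seq):
    # The parameter "seq" hides the module of the same name inside this function,
    # so here "seq" is just the string that was passed in.
    return hasattr(seq, "complement")


def unshadowed(sequence):
    # With a different parameter name, the module stays reachable.
    return seq.complement(sequence)


print(shadowed("ACGT"))    # the str argument has no "complement" attribute
print(unshadowed("ACGT"))  # calls the module-level helper
```

Importing only the classes (as in "from Bio.seq import Seq") sidesteps this,
since the name "seq" is then never bound to the module in user code.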

It also seems reasonable to treat molecular sequences as the implied core
object type at the top-level namespace. From that viewpoint, Bio.Search
would mean sequence search, as everything else is typically tucked away in
a sub-module like PDB (pdb?), Motif (motif), or Phylo (phylo); then it's
also fine to keep seqio and alignio directly under the Bio namespace.

(Given a clean slate, I'd prefer "from Bio import Seq, SeqRecord, SeqFeature",
but since those are already module names it would be brutal to make that
transition now.)



> We can have imports set up so that all the classes etc
> are only defined once, e.g. Bio/seq/__init__.py could
> initially just contain 'from Bio.Seq import *' and so on.
>
>
Sounds cool. We'll need to watch out for the PDB module, where classes and
modules have identical names, and the class names are imported to shadow
the module names at import time.
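
A rough sketch of both points, using a throwaway package written to a temp
directory. The package name "demo_bio" and its contents are hypothetical
stand-ins that only mimic the layout being discussed; this is not how
Biopython itself is organised on disk.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

# Build a throwaway package; "demo_bio" is a hypothetical stand-in for Bio.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_bio"
(pkg / "seq").mkdir(parents=True)
(pkg / "PDB").mkdir()

(pkg / "__init__.py").write_text("")
# Legacy module: the class is defined here, and only here.
(pkg / "Seq.py").write_text("class Seq(str):\n    pass\n")
# New lower-case package: re-exports the legacy names.
(pkg / "seq" / "__init__.py").write_text("from demo_bio.Seq import *\n")
# PDB-style layout: a class with the same name as its module,
# imported into the package namespace at import time.
(pkg / "PDB" / "Atom.py").write_text("class Atom:\n    pass\n")
(pkg / "PDB" / "__init__.py").write_text("from demo_bio.PDB.Atom import Atom\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

old = importlib.import_module("demo_bio.Seq")
new = importlib.import_module("demo_bio.seq")
# Same class object under both names, so it is defined only once.
print(new.Seq is old.Seq)

pdb_pkg = importlib.import_module("demo_bio.PDB")
# The package attribute "Atom" is now the class, not the submodule...
print(inspect.isclass(pdb_pkg.Atom))
# ...although the submodule is still registered in sys.modules.
print(inspect.ismodule(sys.modules["demo_bio.PDB.Atom"]))
```

So a star import from a package laid out like this would pick up the
classes rather than the modules, which is the case to keep an eye on when
adding lower-case aliases.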

-Eric

