[Biopython-dev] PEP8 lower case module names?

Peter Cock p.j.a.cock at googlemail.com
Mon Oct 22 22:59:21 UTC 2012


On Mon, Oct 22, 2012 at 10:53 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>
> Personally, I've used the variable name "seq" an awful lot, so I'm wary of
> using "seq" as a module name. However, reasonable coding style could make
> this easy to avoid if we have a "seq" module containing all of Seq,
> SeqRecord and SeqFeature (maybe even Alphabet), and "sequtil" containing
> standalone functions.
>
> Result:
>
> # Everything you need to build a new sequence record, but not much else
> from Bio.seq import Seq, SeqRecord, SeqFeature

I'd been picturing:

from Bio.seq import Seq
from Bio.seq.record import SeqRecord
from Bio.seq.feature import SeqFeature

but you're right, those three classes could all be exposed at the level
of Bio.seq (while still having SeqRecord defined in the file
Bio/seq/record.py and SeqFeature etc. in Bio/seq/feature.py) for
convenience.
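
As a rough sketch of that layout (the package name "demo_bio" and the
empty class bodies are made up for illustration, this is not an actual
Biopython tree), the package's __init__.py can re-export classes that
are defined in its submodules:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk; "demo_bio" is a hypothetical stand-in.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_bio" / "seq"
pkg.mkdir(parents=True)
(root / "demo_bio" / "__init__.py").write_text("")
(pkg / "record.py").write_text("class SeqRecord:\n    pass\n")
(pkg / "feature.py").write_text("class SeqFeature:\n    pass\n")
# seq/__init__.py defines Seq and re-exports the other two classes.
(pkg / "__init__.py").write_text(
    "class Seq:\n    pass\n\n"
    "from .record import SeqRecord\n"
    "from .feature import SeqFeature\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Both import styles work and refer to the same class object.
from demo_bio.seq import Seq, SeqRecord, SeqFeature
from demo_bio.seq.record import SeqRecord as SeqRecordDirect

same_class = SeqRecord is SeqRecordDirect
```

Either spelling gives the same class, so the short form is purely a
convenience on top of the per-file layout.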

> # Working with sequence strings
> from Bio import sequtil

If you mean strings rather than Seq objects, the functions currently in
Bio.SeqUtils should mostly work on either Seq objects or strings. It is
more of an odds-and-ends module than one deliberately focused on
sequences as strings.

> It also seems reasonable to treat molecular sequences as the implied core
> object type at the top-level namespace. From that viewpoint, Bio.Search
> would mean sequence search, as everything else is typically tucked away in a
> sub-module like PDB (pdb?), Motif (motif), or Phylo (phylo); then it's also
> fine to keep seqio and alignio directly under the Bio namespace.

Having sequence stuff collected under Bio.Seq or Bio.seq (or bio.seq
if we go with the lower case plan for Python 3) seems more organised.
It also keeps the import times down for people not working with
sequences (e.g. a script using clustering or PDB files).

> (Given a clean slate, I'd prefer "from Bio import Seq, SeqRecord, SeqFeature", but
> since those are already module names it would be brutal to make that
> transition now.)

That isn't a good plan anyway in terms of polluting the namespace
and loading things into memory for anyone not working with sequences.
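
To illustrate the loading cost (the packages "demo_nested" and
"demo_flat" below are hypothetical, built in a temporary directory just
for the comparison): if the top-level __init__.py imports the sequence
classes, then every user pays for them, even one who only wants the PDB
code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())


def make_package(name, top_init):
    """Create a throwaway package with 'seq' and 'pdb' subpackages."""
    base = root / name
    for sub in ("seq", "pdb"):
        (base / sub).mkdir(parents=True)
    (base / "__init__.py").write_text(top_init)
    (base / "seq" / "__init__.py").write_text("class Seq:\n    pass\n")
    (base / "pdb" / "__init__.py").write_text("")


# Nested style: the top level imports nothing.
make_package("demo_nested", "")
# Flat style: the top level exposes Seq directly.
make_package("demo_flat", "from .seq import Seq\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A user who only wants the PDB code.
import demo_nested.pdb
import demo_flat.pdb

# Was the sequence code loaded as a side effect?
nested_loaded_seq = "demo_nested.seq" in sys.modules
flat_loaded_seq = "demo_flat.seq" in sys.modules
```

Here nested_loaded_seq comes out False and flat_loaded_seq comes out
True, which is the memory and import time concern in miniature.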

>> We can have imports set up so that all the classes etc
>> are only defined once, e.g. Bio/seq/__init__.py could
>> initially just contain 'from Bio.Seq import *' and so on.
>>
>
> Sounds cool. We'll need to watch out for the PDB module, where classes and
> modules have identical names, and the class names are imported to shadow the
> module names at import time.
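
For the record, a minimal sketch of that shim idea, using made-up
package names ("demo_legacy" standing in for the current mixed case
modules and "demo_lower" for the new lower case ones) so it runs
without touching a real install:

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
old = root / "demo_legacy"
new = root / "demo_lower" / "seq"
old.mkdir()
new.mkdir(parents=True)

# The existing module, where the classes are actually defined.
(old / "__init__.py").write_text("")
(old / "Seq.py").write_text(
    "class Seq:\n    pass\n\nclass MutableSeq:\n    pass\n"
)
# The new lower case package is initially just a one-line shim.
(root / "demo_lower" / "__init__.py").write_text("")
(new / "__init__.py").write_text("from demo_legacy.Seq import *\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_legacy.Seq import Seq as OldSeq
from demo_lower.seq import Seq as NewSeq

# Only one class object exists, reachable under both names.
defined_once = OldSeq is NewSeq
```

Since both names point at one class, isinstance checks keep working no
matter which import path a script used. Note that "import *" skips
names starting with an underscore unless the old module lists them in
__all__.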

The shadowing was one of the gotchas in the auto-conversion
of all the module names to lower case - but solvable. Adopting
lower case module names has the bonus of fixing this in the long
term.
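
A small sketch of the shadowing and why lower case names avoid it (the
packages "demo_pdb_old" and "demo_pdb_new" are hypothetical, and only
mimic the pattern rather than copying Bio.PDB itself):

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())


def make_package(name, module_name):
    """Package whose __init__ imports class Atom from one submodule."""
    base = root / name
    base.mkdir()
    (base / (module_name + ".py")).write_text("class Atom:\n    pass\n")
    (base / "__init__.py").write_text(
        "from .%s import Atom\n" % module_name
    )


make_package("demo_pdb_old", "Atom")  # module and class share a name
make_package("demo_pdb_new", "atom")  # lower case module name

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pdb_old
import demo_pdb_new

# Old style: the attribute "Atom" on the package is the class, so the
# module of the same name is only reachable through sys.modules.
old_attr_is_class = inspect.isclass(demo_pdb_old.Atom)
old_module_hidden = inspect.ismodule(sys.modules["demo_pdb_old.Atom"])

# New style: module and class no longer compete for one name.
new_module_visible = inspect.ismodule(demo_pdb_new.atom)
new_class_visible = inspect.isclass(demo_pdb_new.Atom)
```

With the lower case module name both the module and the class are
visible as attributes of the package, which is the long term fix
referred to above.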

Peter


