[Biopython-dev] Ideas for Biopython 2.0

Tue Jun 20 07:58:46 UTC 2017

Dear Patrick, Peter, Tiago, and everyone,

Good to see this discussion being brought up again :) (and thank you
Patrick for writing a thorough proposal).

On the big picture, I wholeheartedly agree that Biopython 2.0 should adopt
modern Python best practices, even if that requires a close-to-full rewrite.
The Python scientific computing ecosystem has progressed far beyond where it
was when Biopython was first written. Many scientific Python libraries are
now so mature and widely used that we should at least consider interop with
them. This can be done, for example, by making our data structures
compatible with them (e.g. using numpy arrays) or by making our plotting
functions (in Bio.Graphics, for example) compatible with them (e.g. using
matplotlib or bokeh).

I don't have any strong preference for the new namespace (biopy.* or
biopython.* is fine). But depending on how we structure the new modules, we
may or may not need to share the namespace with the current Bio package,
right? If we opt for modularization à la BioGems, probably only the core
module is suitable for inclusion in the current Bio package. The rest would
have their own repositories.

I did play around with a distributed package setup (
https://github.com/bow/poc_biopy). There are two alternatives that I
considered there. The first one, `poc_hook`, uses an import hook so that any
non-core `biopy_*` package can be imported as `biopy.ext.*`. The second one,
`poc_pkgutil`, simply requires any non-core `biopy_*` package to put its
code inside `biopy.ext`. This was from about 3 years ago, however, so there
may be better ways of doing this now.
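
To make the import hook idea concrete, here is a minimal sketch. It is not
the code from the repository above, only an illustration of the general
technique under some assumptions: the prefixes `biopy.ext.` and `biopy_`
follow the naming described in this message, while the class name
`ExtAliasFinder` and the `biopy_align` package are hypothetical, and the
packages are faked with empty module objects so that the example runs
without installing anything.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import types

EXT_PREFIX = "biopy.ext."  # virtual namespace seen by users (assumed name)
PKG_PREFIX = "biopy_"      # naming convention for non-core packages (assumed)


class ExtAliasFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve ``biopy.ext.<name>`` by importing the top-level ``biopy_<name>``."""

    def find_spec(self, fullname, path=None, target=None):
        short = fullname[len(EXT_PREFIX):]
        if not fullname.startswith(EXT_PREFIX) or "." in short:
            return None  # not ours; let the other finders handle it
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default (empty) placeholder module

    def exec_module(self, module):
        short = module.__name__[len(EXT_PREFIX):]
        real = importlib.import_module(PKG_PREFIX + short)
        # Swap the placeholder for the real package; the import system
        # returns whatever sits in sys.modules once exec_module finishes.
        sys.modules[module.__name__] = real


sys.meta_path.append(ExtAliasFinder())

# Stand-ins for installed packages (hypothetical): a core "biopy" package
# with an empty "biopy.ext" subpackage, and a separately distributed
# "biopy_align" package.
core = types.ModuleType("biopy")
core.__path__ = []
ext = types.ModuleType("biopy.ext")
ext.__path__ = []
core.ext = ext
plugin = types.ModuleType("biopy_align")
plugin.VERSION = "0.1"
sys.modules.update({"biopy": core, "biopy.ext": ext, "biopy_align": plugin})

import biopy.ext.align  # resolved by the hook to biopy_align

print(biopy.ext.align.VERSION)
```

With a hook like this, non-core packages stay independently installable
under their own top-level names, and the core package only needs to ship an
empty `biopy.ext` subpackage plus the finder.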

But as Peter said, this is probably the point on which a consensus is
hardest to build. In addition to that, we would need to do the work of
porting existing packages to the new structure.

Now seems like a good time to attempt to do this, though :).

Cheers,
Bow

On Mon, Jun 19, 2017 at 10:43 PM Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> Yep - that's what I meant :)
>
> i.e. "biopython" (or "biopy" or ...) as a folder name meaning we'd
> have for example "biopython/sequences/__init__.py" which can
> be imported as "from biopython import sequences" etc.
>
> The NumPy/SciPy like usage pattern for importing had crossed my
> mind too - although if we try to minimise the top level automatic
> imports I think that is less useful?
>
> (By that I mean that for example people doing clustering would not
> want the overhead of lots of sequence code being imported by default)
>
> Peter
>
> On Mon, Jun 19, 2017 at 3:47 PM, Tiago Antão <tiagoantao at gmail.com> wrote:
> > Or even
> > biopython.*
> > and have a recommendation of
> > import biopython as bp
> > Tiago
> >
> > On 19 June 2017 at 08:45, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >>
> >> I am generally in agreement with your comments Tiago.
> >>
> >> Note as per my reply to Patrick, we can't use "bio" (lower case)
> >> as this would be the same directory on disk as "Bio" (title case) on
> >> Windows and most Macs which use a case-insensitive file system.
> >> Thus suggestions like "biopy" and "biopython" instead.
> >>
> >> Peter
> >>
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython-dev