[Biopython-dev] Ideas for Biopython 2.0

Tiago Antão tiagoantao at gmail.com
Tue Jun 20 15:15:39 UTC 2017

Hi all,

First, my apologies: I was writing from an address not on the mailing
list, so you were not getting my comments (you can find them quoted in the
emails that replied to me). This should now be sorted.

Second, I think it would be worthwhile to consider a module architecture,
and Bow's proposal could be a good starting point for a discussion.

Third, I suggest we start a wiki page on ideas about this.

Below are some comments on Patrick's last email:

> 2. I would prefer to put Biopython 2.0 into a separate package (rather than
> put the subpackage "v2" in the existing package). this way a user has the
> choice to only install Biopython 2.0 without 1.x. If a user wants to use
> both packages, he can install both packages and use them interchangeably,
> since 1.x uses the package name "Bio." and 2.0 uses "biopy"/"biopython".

Separate package indeed.

> 3. Although we plan to drop Python 2.x support not before 2020, we could
> drop it now for Biopython 2.0. Python 2.x user could use Biopython 1.x. I
> think dropping Python 2.x support could be a great thing for Biopython 2.0
> development, since we do not have to care and test for compatibility.

+1, I think it would make no sense to support Python 2.

> 5. Regarding the documentation, I agree that the API is more important
> than writing a tutorial.

Also, I believe that the best format for the tutorial would be Jupyter
notebooks.

> 6. My proposed import structure won't import anything into the top level
> namespace. For example in case of my structure subpackage
> (padix-key/biopython) you need for example to type "from Bio.structure
> import AtomArray, superimpose" to get the tools to superimpose two
> structures. I think this is more convenient than "from
> Bio.structure.superimpose import superimpose" and "from Bio.structure.atoms
> import AtomArray".

There might be some residual stuff on the top level: say, some exceptions
(or at least an abstract base exception), plus anything else that is shared
by all modules.
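
A minimal sketch of that idea, with entirely hypothetical names (none of
this is an existing or agreed Biopython API): one abstract exception sits
at the top level, each subpackage derives its own errors from it, and a
caller can catch everything from the library with a single except clause.

```python
# All names below are hypothetical illustrations, not a real Biopython API.

class BiopythonError(Exception):
    """Abstract root exception; would live in the top-level package."""


class StructureError(BiopythonError):
    """Would be defined inside a 'structure' subpackage."""


class SequenceError(BiopythonError):
    """Would be defined inside a 'sequence' subpackage."""


def superimpose(fixed, mobile):
    """Toy stand-in for a subpackage function that validates its input."""
    if len(fixed) != len(mobile):
        raise StructureError("atom arrays differ in length")
    return mobile


def safe_call(func, *args):
    """Catch any library error through the shared top-level base class."""
    try:
        return func(*args)
    except BiopythonError as exc:
        return f"{type(exc).__name__}: {exc}"
```

Calling safe_call(superimpose, [1, 2], [1]) reports the StructureError
without the caller having to import anything from the subpackage.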

> 7. I'd still prefer to put modules for biological data files in the
> subpackage "files" (as proposed in the PDF), since this organisation allows
> for putting the base class "File" into a fitting place and it is further
> possible to add modules for files that are not represented by any other
> subpackage (sequence, structure, etc). Therefore I think this approach has
> a better potential for extensibility.

The File class could go on the top level. For extensibility, I think we need
to have file classes inside their own modules: remember that we are talking
about an extensible architecture (a la biogems), and external authors might
have their own file formats. We do not want non-core modules to be monkey
patching biopython.files (or any other core part of the namespace).
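
One possible way to get this, sketched under my own assumptions (the class
names, the "fmt" keyword and the registry are all hypothetical, not part of
any proposal so far): a top-level File base class records subclasses as
they are defined, so a core module and an external package register their
formats the same way and nobody has to write into a shared "files"
namespace.

```python
import io


class File:
    """Hypothetical base class that would sit at the top level."""

    registry = {}  # maps a format name to the File subclass handling it

    def __init_subclass__(cls, fmt=None, **kwargs):
        # Runs when a subclass is defined, wherever that subclass lives.
        super().__init_subclass__(**kwargs)
        if fmt is not None:
            File.registry[fmt] = cls

    def read(self, handle):
        raise NotImplementedError


class FastaFile(File, fmt="fasta"):
    """Would live inside a core 'sequence' subpackage."""

    def read(self, handle):
        records = {}
        name = None
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
        return records


class MyLabFile(File, fmt="mylab"):
    """Would live in an external, non-core package."""

    def read(self, handle):
        # Toy tab-separated format, one record per line.
        return [line.rstrip("\n").split("\t") for line in handle]


def open_as(fmt, handle):
    """Look up the registered class for a format and parse the handle."""
    return File.registry[fmt]().read(handle)
```

The external MyLabFile never touches the core namespace; defining the
subclass is enough to make open_as("mylab", handle) work.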

> 9. I personally do not use "pandas" and therefore I do not know if it is
> necessary to add the package to the dependencies, but if this finds
> consensus it is fine for me.

I am talking more about acceptable core dependencies. A package on that list
does not need to be a dependency of the original release of Biopython 2.0.
It is just a list of packages that would be acceptable if someone wanted to
use them in a future core module.
So if you prepare a core module with a new dependency (say pandas), you know
that the dependency would not be an obstacle to accepting the module as
core, whereas your_obscure_package as a dependency would not be accepted.
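
To make the idea concrete, here is a rough sketch of how such a list could
be checked for a proposed core module. The contents of the set are
placeholders taken from this thread, not an agreed list, and the function
name is hypothetical.

```python
# Placeholder allowlist; the real list would have to be agreed on first.
ACCEPTED_CORE_DEPENDENCIES = {"pandas", "requests"}


def unaccepted_dependencies(declared):
    """Return the declared dependencies that are not on the allowlist."""
    names = {name.strip().lower() for name in declared}
    return sorted(names - ACCEPTED_CORE_DEPENDENCIES)
```

A module declaring ["pandas", "your_obscure_package"] would be flagged only
for your_obscure_package.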

> 10. I suggest another dependency: The "requests" package provides utility
> for http requests, which can be useful for usage of online applications as
> well as database requests.

If this idea is accepted, we would have to work on a list of packages that
are OK. These would be highly mature, widely used packages in the Python
world. And maybe other stuff like pysam or pyvcf (highly mature in our
