[Biopython-dev] Ideas for Biopython 2.0

Patrick Kunzmann padix.kleber at gmail.com
Tue Jun 20 13:13:13 UTC 2017


Hello folks,

thank you for your enthusiasm about the Biopython 2.0 idea. Unfortunately I
won't be present at the pre-BOSC Codefest to discuss this topic in person.

1. I would prefer the package name "biopython" since this is the 
official project name. But I would refrain from recommending "import 
biopython as bp" since this would require modules directly in the 
package top level, which I am against due to the danger of crowding 
modules in the top package. I also think it's better to put Biopython
2.0 into a fresh repo, since git cannot properly track such a drastic
reorganisation. Of course, the license and
contributor file would be the same.

2. I would prefer to put Biopython 2.0 into a separate package (rather
than putting a subpackage "v2" in the existing package). This way a user
has the choice to install only Biopython 2.0 without 1.x. Users who
want both can install both packages and use them
side by side, since 1.x uses the package name "Bio" and 2.0 uses
"biopy"/"biopython".

3. Although we do not plan to drop Python 2.x support before 2020, we
could drop it now for Biopython 2.0. Python 2.x users could use Biopython
1.x. I think dropping Python 2.x support could be a great thing for
Biopython 2.0 development, since we would not have to care about or test
for compatibility.

4. I agree with you that we should drop Jython support.

5. Regarding the documentation, I agree that the API is more important 
than writing a tutorial.

6. My proposed import structure won't import anything into the top-level
namespace. For example, with my structure subpackage
(padix-key/biopython) you would type "from Bio.structure
import AtomArray, superimpose" to get the tools to superimpose two
structures. I think this is more convenient than "from
Bio.structure.superimpose import superimpose" and "from
Bio.structure.atoms import AtomArray".

7. I'd still prefer to put modules for biological data files in the 
subpackage "files" (as proposed in the PDF), since this organisation 
allows for putting the base class "File" into a fitting place and it is 
further possible to add modules for files that are not represented by 
any other subpackage (sequence, structure, etc). Therefore I think this 
approach has a better potential for extensibility.

8. I welcome the idea of extending the Biopython core with extensions, 
but I also think that the Biopython core should be quite comprehensive, 
i.e. including all modules that fit into the proposed new Biopython 
package structure. Biopython extensions could be an excellent idea to 
include/port existing packages with incompatible licenses.

9. I personally do not use "pandas" and therefore I do not know if it is
necessary to add the package to the dependencies, but if this finds
consensus it is fine with me.

10. I suggest another dependency: The "requests" package provides
utilities for HTTP requests, which can be useful for accessing online
applications as well as for database requests.
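
To illustrate the import structure from point 6, here is a minimal sketch.
It assumes a stand-in package called "demo_bio" (a hypothetical name, not
the real package) that is written to a temporary directory so the example
runs on its own. The class and function bodies are placeholders, not the
code from my fork. The subpackage's __init__.py re-exports the public names
of its modules, while the top-level __init__.py stays empty:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, created in a temporary directory.
# "demo_bio" is a stand-in name chosen only for this sketch.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_bio" / "structure"
pkg.mkdir(parents=True)

# Top-level package: deliberately empty, nothing is imported here.
(root / "demo_bio" / "__init__.py").write_text("")

# Submodules hold the implementations (placeholders in this sketch).
(pkg / "atoms.py").write_text(
    "class AtomArray:\n"
    "    def __init__(self, coords):\n"
    "        self.coords = list(coords)\n"
)
(pkg / "superimpose.py").write_text(
    "def superimpose(fixed, mobile):\n"
    "    # Placeholder: a real version would fit mobile onto fixed.\n"
    "    return mobile\n"
)

# The subpackage __init__ re-exports the public names of its modules.
(pkg / "__init__.py").write_text(
    "from .atoms import AtomArray\n"
    "from .superimpose import superimpose\n"
    "__all__ = ['AtomArray', 'superimpose']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# One import from the subpackage instead of one import per module.
from demo_bio.structure import AtomArray, superimpose
import demo_bio
```

With this layout the user-facing names live one level below the top
package, so the top-level namespace stays uncrowded while the individual
module paths remain importable for those who prefer them.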

Best regards,
Patrick

