[Biopython-dev] Dependency policy; was PEP8 lower case module names?

Peter Cock p.j.a.cock at googlemail.com
Sun Nov 4 20:49:58 UTC 2012


On Sunday, November 4, 2012, Eric Talevich wrote:

> On Sun, Nov 4, 2012 at 9:01 AM, Peter Cock <p.j.a.cock at googlemail.com>
> wrote:
>
>> Retitling thread
>>
>> On Sun, Nov 4, 2012 at 1:09 PM, Tiago Antão <tiagoantao at gmail.com>
>> wrote:
>> > Hi,
>> >
>> >
>> > On Fri, Nov 2, 2012 at 4:01 PM, Michiel de Hoon <mjldehoon at yahoo.com>
>> > wrote:
>> >>
>> >> Already I feel that we need to install too many packages to get going
>> >> with Python in bioinformatics (Python itself, NumPy, Matplotlib and its
>> >> dependencies, Pysam, Cython (needed to compile Pysam), ezsetup, perhaps
>> >> SciPy, Biopython). I find this hard to explain to people new to
>> >> bioinformatics or new to Python. So I would prefer to keep one
>> >> distribution.
>> >>
>> >> We can be more lenient in terms of dependencies, especially those that
>> >> don't occur at compile time.
>> >>
>> >
>> > One of the things that I always found lacking with biopython is a clear,
>> > consistent policy on dependencies:
>>
>> It would be good to have something written down, just as we
>> did with the deprecation policy.
>>
>
> Should we start a page for this on the wiki?
>
>
The wiki is online again now :)

Maybe agree a draft by email first?


>> > Depending on the mood of the day it could be either good/bad
>> > to add a library dependency. As an example, this ended up
>> > with there being a dependency on reportlab, but not on scipy.
>>
>> The ReportLab dependency is a 'run time only' dependency and
>> has been in Biopython for a very long time. You'd have to remind
>> me if there was any compile time issue with scipy, but my
>> recollection was we were loath to add a dependency on scipy
>> (which is quite a complex library to install if not using a package)
>> for just one or two functions - however you were planning something
>> more substantial in the PopGen code which would justify it (using
>> lots of statistics).
>>
>> > Whatever the policy, I think that it should be consistent all across.
>> > Preferably simple to both users and developers.
>> >
>> > A few ideas on policy:
>> >
>> > 1. I totally agree with the idea of being as lenient as possible with
>> > dependencies (as you say, especially with those that do not occur at
>> > compile time).
>> > 2. Biopython belongs to a certain software ecology. I think it would make
>> > sense to see adding dependencies on well established Python libraries
>> > as natural.
>> > 3. (1+2) If a developer wants to add a dependency on a package, that should
>> > not be a major problem (as long as the package is long maintained, well
>> > known and stable). Users should only have to deal with the dependency if
>> > they need the functionality that depends on that package.
>> >
>> > Python being a dynamic language, there does not have to be a burden on
>> > users/developers if a remote part of Biopython depends on something more
>> > exotic (which most users/developers will never see/install in any case).
>> > Again by "exotic" I mean well known libraries with a track record of years
>> > of stability.
>>
>> That all sounds reasonable. It is compile time dependencies that I am
>> most wary of.
>>
>
> Pure-Python dependencies seem less scary -- a package like PLY should work
> on any Python, PyPy, Jython, and Google App Engine. Unfortunately, the
> dependencies that are most tempting are the ones with essential C
> extensions (numpy, scipy, matplotlib).
>

But (for example) matplotlib wouldn't be a build time dependency
for us.
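
To illustrate what a 'run time only' dependency can look like, here is a minimal sketch of a deferred import. The helper `require_optional`, the error class, and `plot_values` are hypothetical names made up for this example; they are not Biopython code, and Biopython may handle this differently.

```python
import importlib


class MissingOptionalDependencyError(ImportError):
    """Raised when an optional run-time dependency is not installed."""


def require_optional(module_name, feature):
    """Import module_name on demand, or fail with a helpful message.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise MissingOptionalDependencyError(
            "Install %s if you want to use %s." % (module_name, feature)
        )


def plot_values(values):
    # The import runs only when this function is called, so building,
    # installing, or importing the package never requires matplotlib.
    plt = require_optional("matplotlib.pyplot", "plotting")
    plt.plot(values)
    plt.show()
```

With this pattern, a user without matplotlib only sees an error when calling `plot_values`, and the message says which package to install.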



>> However, from an end user perspective having installed Biopython and
>> then trying a script from a colleague and only then finding 101 optional
>> run time dependencies are also needed would be annoying.
>>
>> For Linux packages like Debian there is a 'recommends' field for this kind
>> of soft dependency. Where do we stand with declaring dependencies in
>> setup.py so that if using a package manager like pip this is less painful?
>>
>> In fact, how many 'soft' dependencies like this do we already have?
>> Just from a quick look at the README file many are not mentioned
>> under the current 'System Requirements' text (e.g. NetworkX).
>>
>
> I just used "git grep import Bio/" to find out. The only egregious
> undocumented dependencies are the ones I added in Phylo for graphics:
> networkx and matplotlib/pylab.
>

Could you add those to the README file then?
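
As a rough alternative to grepping for "import", the imported top-level module names can be collected by parsing the source with the standard library `ast` module. This is only a sketch; the function name `imported_top_level_modules` and the sample source are made up for illustration, and imports hidden inside strings or done via `__import__` would be missed.

```python
import ast


def imported_top_level_modules(source):
    """Return the set of top-level module names imported by source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import (from . import x); skip those.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


example = """
import os
import matplotlib.pyplot as plt
from networkx import Graph
from . import _utils
"""
print(sorted(imported_top_level_modules(example)))
# prints ['matplotlib', 'networkx', 'os']
```

Running this over each file under Bio/ and subtracting the standard library names would leave the third party modules that the README should mention.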


> Other *possible* dependencies are sqlite3 in the case of Jython
> (Bio.SeqIO._index) and ordereddict for Pythons earlier than 2.7 (Bio._py3k).
>
> Should we add these to the "install_recommends" list in setup.py?
>

No, they are in the standard lib on CPython, except in the case
of OrderedDict on older Pythons where we bundle a backport
anyway.
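
For reference, the usual shape of such a fallback is a try/except around the import. This is a generic sketch, not a copy of what Bio._py3k does; the second branch assumes the stand-alone "ordereddict" backport package is installed, whereas a bundled copy would be imported from inside the package instead.

```python
try:
    # In the standard library on Python 2.7 and later.
    from collections import OrderedDict
except ImportError:
    # Assumption: the stand-alone "ordereddict" backport is installed.
    from ordereddict import OrderedDict

d = OrderedDict()
d["b"] = 1
d["a"] = 2
print(list(d.keys()))
# prints ['b', 'a'] (insertion order is kept)
```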

Jython has an open bug on including the sqlite3 module,
which might be worth mentioning under a new Jython
specific section of the README.

Peter



