[Biopython-dev] Python 2 and 3 migration thoughts

Peter Cock p.j.a.cock at googlemail.com
Fri Sep 6 15:44:44 UTC 2013


On Thu, May 30, 2013 at 2:33 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Splitting off from this thread:
> http://lists.open-bio.org/pipermail/biopython/2013-May/008601.html
>
> On Thu, May 30, 2013 at 2:13 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Thank you for all the comments so far, don't stop yet :)
>>
>> On Thu, May 30, 2013 at 1:51 PM, Wibowo Arindrarto
>> <w.arindrarto at gmail.com> wrote:
>>> Hi everyone,
>>>
>>> I'm leaning towards insisting on Python >=3.3 support (I'm running
>>> 3.3.2). I suppose that even if Python3.3 is not available on a machine
>>> or through the default package manager, it's always installable on its
>>> own. If that's not the case, I imagine Python2.x is most likely
>>> present in these machines (so Biopython can still be used).
>>
>> True.
>>
>> So far everyone who has replied (including some off list) has said
>> they are using Python 3.3, which is encouraging. Thank you for
>> the comments so far.
>>
>> It looks like we can forget about Python 3.1, and just need to
>> decide if it is worth including Python 3.2.5 in the short term.
>>
>>> On a related note, do we have a defined timeline on when we
>>> would drop support for Python2.x? Are there any plans to have
>>> our codebase written in Python3.x instead of Python2.x?
>>
>> Nothing concrete planned, no. I'll reply in more detail on the
>> biopython-dev list as I do have some thoughts about this.
>
> Good question Bow,
>
> I think people will still be using Python 2 a year or two from
> now, so we must support both for some time.
>
> Biopython 1.62 (next week perhaps?)
> - Final release with Python 2.5 support
> - Official support for Python 2.5, 2.6, 2.7 and 3.3
> - Possibly official support for Python 3.2.5+ as well?
>
> (Exactly which versions of Python 3 we'll include to be
> decided, see the other thread for that discussion.)
>
> Short term we will continue with developing using Python 2
> syntax and running 2to3 for Python 3. As far as I know,
> the reverse process with 3to2 is not well established. If
> anyone wants to investigate that would be useful as
> another option. However, dropping Python 2.5 support
> makes things more flexible...
>
> Medium term I believe it would be possible to have a single
> code base which is both valid Python 2 and 3 at the same
> time. This may require us to target 2.7 and 3.3+ only - we'll
> have to try it and see if Python 2.6 will hold us back.
>
> I've actually done this with backports.lzma, a small but
> non-trivial module with Python and C code:
>
> https://pypi.python.org/pypi/backports.lzma/
> https://github.com/peterjc/backports.lzma
>
> Python 3.3 reintroduces some features designed to make
> this more straightforward, like unicode literals (missing in
> the early versions of Python 3). This is why I'd like to drop
> Python 3.2 as soon as possible.
>
> What I was thinking is we can start migrating modules on a
> case by case basis from "Python 2 syntax" to "Dual syntax"
> one by one, with a white-list in the do2to3.py script. That
> way over time fewer and fewer modules need to be converted
> via 2to3, and "python3 setup.py install" will get faster,
> until eventually we can stop using 2to3 at all.
>
> This conversion could consider the code and doctests
> separately. However, by using print(example) we can
> hopefully get most of the doctests and Tutorial examples
> to work under both Python 2 and 3 at the same time.
>
> That's my current thinking anyway - and I think the fact
> that it would be a gradual migration from writing Python 2
> specific code to writing dual 2/3 code makes it low risk
> (as long as we're continuing to run regular testing).
>
> Regards,
>
> Peter

This branch is trying out marking individual Python files
as dual coding (Python 2 and Python 3) or as Python 2
only, requiring conversion via 2to3 for use on Python 3:

https://github.com/peterjc/biopython/tree/tag2to3

Currently each file is tagged with one of two special hash
comment lines, expected near the start of the file itself
(rather than being listed within the do2to3.py script). The
exact text of the marker isn't critical - perhaps these need
full stops?

# This file targets both Python 2 and Python 3 at the same time
# TODO - Targets Python 2 only (use 2to3 to run under Python 3)
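
As a rough illustration of how a conversion script could act on
these tags, here is a minimal sketch. The function name needs_2to3
and the max_lines cut-off are hypothetical, chosen for this example
only; this is not necessarily how do2to3.py on the branch does it.

```python
DUAL_TAG = "# This file targets both Python 2 and Python 3 at the same time"
PY2_TAG = "# TODO - Targets Python 2 only (use 2to3 to run under Python 3)"


def needs_2to3(lines, max_lines=20):
    """Return True for a Python 2 only file, False for a dual coded file.

    Only the first max_lines lines are checked (an assumed limit for
    "near the start of the file"). Raises ValueError if no tag is found.
    """
    for index, line in enumerate(lines):
        if index >= max_lines:
            break
        text = line.strip()
        if text == DUAL_TAG:
            return False
        if text == PY2_TAG:
            return True
    raise ValueError("No Python 2/3 tag found near the start of the file")
```

Any iterable of lines will do, so an open file handle could be
passed in directly. Raising an error for untagged files is one
possible choice; treating them as Python 2 only would be another.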

The first main issue so far has been print statements,
where we will either need to use the __future__ import or
restrict ourselves to simple single argument calls - I have
been using the latter. This should not be a big problem in the
main code, and we ought to update the print-and-compare
unit tests anyway.
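
A small sketch of the single argument style follows. The describe
function and the example values are made up for illustration.

```python
def describe(name, sequence):
    # Build the whole message first, so that print receives exactly
    # one argument.
    return "%s has length %i" % (name, len(sequence))


# On Python 2 this is the print statement applied to a parenthesised
# expression; on Python 3 it is a call to the print function. Either
# way the same text is written.
print(describe("example", "ACGTACGTACGT"))

# By contrast, print("a", "b") would show the tuple ('a', 'b') on
# Python 2 unless the module starts with
# "from __future__ import print_function".
```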

The next common issue is import statements, for
example StringIO (another bytes versus unicode issue).
That can be handled via Bio._py3k in some cases.
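
For reference, the generic pattern for an import that moved is a
try/except like the sketch below. This is the general idiom only,
and is not meant to show what Bio._py3k actually contains.

```python
try:
    from StringIO import StringIO  # Python 2 location
except ImportError:
    from io import StringIO  # Python 3 location (text only)

# A native string literal works with either class; this is where the
# bytes versus unicode question comes in for real file contents.
handle = StringIO(">example\nACGT\n")
lines = handle.read().splitlines()
```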

A third major class of issues in the unit tests so
far is iterators versus lists, for example dictionary
methods and the map function's return value. These
can be tackled on a case by case basis I think - often
adding the occasional list(...), or using sorted(x)
instead of x.sort(), is enough.
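
A short sketch of both fixes, using made up example data:

```python
counts = {"A": 3, "C": 1, "G": 2}

# keys() gives a list on Python 2 but a view on Python 3, so calling
# .sort() on the result only works on Python 2. The sorted() built-in
# accepts any iterable and works on both.
letters = sorted(counts.keys())

# map() gives a list on Python 2 but an iterator on Python 3;
# wrapping it in list() gives the same list on both.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))
```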

There are also quite a few instances of 'basestring'
which might be handled via _py3k?
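
One possible shape for this is sketched below. The names
string_types and is_text are hypothetical, and nothing here is
taken from the current Bio._py3k module.

```python
try:
    string_types = basestring  # Python 2: covers str and unicode
except NameError:
    string_types = str  # Python 3: basestring no longer exists


def is_text(value):
    # Replaces isinstance(value, basestring) in dual coded modules.
    return isinstance(value, string_types)
```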

As of right now, on this branch there are only 8 files under
Tests which require conversion via 2to3:

Tests/common_BioSQL.py
Tests/seq_tests_common.py
Tests/test_NCBI_qblast.py
Tests/test_SCOP_Cla.py
Tests/test_seq.py
Tests/test_SeqIO.py
Tests/test_SeqIO_index.py
Tests/test_Uniprot.py

Having, I hope, demonstrated that this will work, I'd like
some feedback before applying this (or a modified version
of it) to the master branch.

Any thoughts? Thanks,

Peter


