[Biopython-dev] Python 2 and 3 migration thoughts

Peter Cock p.j.a.cock at googlemail.com
Sun Sep 29 23:22:52 UTC 2013


On Sun, Sep 8, 2013 at 9:52 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sat, Sep 7, 2013 at 8:17 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>>> >
>>> > # This file targets both Python 2 and Python 3 at the same time
>>> > # TODO - Targets Python 2 only (use 2to3 to run under Python 3)
>>> >
>>>
>>> Does the special comment line seem like a good solution?
>>> On the plus side, it tracks any changes with the file being
>>> updated (which wouldn't happen with a list in the do2to3.py
>>> file).
>>
>> Hi Peter,
>>
>> This looks like a good way to move forward overall. Regarding the special
>> comment lines -- since these are only used in do2to3.py, would it be
>> cleaner to keep a hard-coded list of filenames in do2to3.py and leave the
>> modules and scripts alone? Are there any characteristics that would make it
>> difficult to determine whether a given module or script is Py3-compliant?
>
> Hi Eric,
>
> There are import time problems which are easy to spot - in particular
> SyntaxError is a good clue. However, many of the issues are only
> really found at run time (e.g. different method names). This means
> that the tests (which I started with) are actually the easiest to check.
>
> Right now I don't have a feel for what fraction of the main Bio/* and
> BioSQL/* files can be made dual-coding, and that would have an
> influence on how best to tag things needing 2to3 or not. I'm happy
> to continue this on branches for a while longer and find out.

Assuming my methodology isn't flawed, we're about halfway
in terms of getting every file in Biopython to be dual Python 2
and Python 3 code:

262 no change, 290 need fixers
Troublesome ones at 52.5%

This is based on there being a difference between the pre-
and post-2to3 conversion (discounting removing future imports).
This is an overestimate, as the 2to3 script often makes
unnecessary changes.
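
As a rough illustration of that comparison (a minimal sketch only; the
function names are hypothetical and not taken from do2to3.py), a file
can be counted as needing fixers when its source differs before and
after 2to3 once any "from __future__ import" lines are ignored:

```python
def strip_future_imports(source):
    """Return the source lines, dropping 'from __future__ import ...' lines."""
    return [
        line
        for line in source.splitlines()
        if not line.strip().startswith("from __future__ import")
    ]


def needs_2to3(before, after):
    """True if the conversion changed anything other than __future__ imports."""
    return strip_future_imports(before) != strip_future_imports(after)


def summarise(pairs):
    """Count files from a list of (before, after) source strings.

    Returns (unchanged, changed, percent_changed).
    """
    changed = sum(1 for before, after in pairs if needs_2to3(before, after))
    unchanged = len(pairs) - changed
    percent = 100.0 * changed / len(pairs) if pairs else 0.0
    return unchanged, changed, percent
```

With the counts quoted above, 290 changed files out of 262 + 290 = 552
gives 100 * 290 / 552, which rounds to the 52.5% reported.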

This is after applying a *lot* of little changes to our codebase,
things like removing unneeded use of my_dict.keys() which
the 2to3 fixers are overcautious in wrapping as list(my_dict.keys()).
I would like to do a beta before the next release.
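
To illustrate the kind of small change meant here (a hypothetical
example, not code from Biopython), a membership test or a loop does not
need my_dict.keys() at all. Using the dictionary directly behaves the
same on Python 2 and Python 3, so 2to3 has nothing to wrap:

```python
def has_key_old_style(my_dict, key):
    # Unneeded keys() call; this is the sort of line that a cautious
    # fixer may want to wrap as list(my_dict.keys()).
    return key in my_dict.keys()


def has_key_dual(my_dict, key):
    # Membership tests work on the dictionary itself in Python 2 and 3.
    return key in my_dict


def total_length_dual(my_dict):
    # Iterating over a dictionary yields its keys in Python 2 and 3.
    total = 0
    for key in my_dict:
        total += len(my_dict[key])
    return total
```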

> I do like the idea of a special #TODO comment line where 2to3
> is still needed - it is symbolic of where I want the code base to go ;)

That's what is going on in this revised branch - if the special
#TODO comment is there, then 2to3 is used, otherwise we
assume the file is already OK to use under Python 3:

https://github.com/peterjc/biopython/tree/mark2to3

This is now quicker to install under Python 3, but there is
plenty of scope for speed optimisation (e.g. requiring that the
magic marker is in the first (say) 20 lines of the file, and
expanding the magic marker to list the specific 2to3 fixers
required and running just those).
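
A minimal sketch of how those two optimisations might look, under
stated assumptions: the marker prefix is taken from the comment line
quoted earlier in this thread, but the "fixers:" suffix and the function
name find_marker are hypothetical and are not what the mark2to3 branch
does today.

```python
# Prefix of the special comment line quoted earlier in this thread.
MARKER = "# TODO - Targets Python 2 only"
# The "(say) 20 lines" limit suggested above.
MAX_LINES = 20


def find_marker(lines, max_lines=MAX_LINES):
    """Look for the marker in the first max_lines lines.

    Returns (needs_2to3, fixers). The fixer list is read from an
    optional, hypothetical "fixers: a, b" suffix on the marker line;
    an empty list would mean "run all the fixers".
    """
    for number, line in enumerate(lines):
        if number >= max_lines:
            break  # stop early rather than scanning the whole file
        if line.strip().startswith(MARKER):
            _, _, tail = line.partition("fixers:")
            fixers = [name.strip() for name in tail.split(",") if name.strip()]
            return True, fixers
    return False, []
```

Passing an open file handle as lines means only the first few lines
are read from disk, since the loop stops at max_lines.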

Regards,

Peter


