[Biopython-dev] Optimization of PDBParser and friends

Peter Cock p.j.a.cock at googlemail.com
Tue Sep 4 05:56:55 UTC 2012

On Mon, Sep 3, 2012 at 11:07 PM, João Rodrigues <anaryin at gmail.com> wrote:

> One big change I would propose is to eliminate the duality
> child_list/child_dict. I think that keeping child_dict and generating
> child_list from sorted dict keys would be good enough. OrderedDict also
> looks appropriate, but it's Py2.7+.. Still need to look into this, but by
> looking at all those "append" methods in the profiling it hints at a nice
> speed up, and also at much cleaner code.

Where there are back-ports of OrderedDict and other useful
classes like namedtuple, we could probably include these as
part of our Python 2/3 compatibility code, i.e. in Bio.PDB use:

from Bio._py3k import OrderedDict

(Until we drop older versions of Python which don't come with
this). In Bio._py3k we would have something like this:

# Prefer the system OrderedDict (Python 2.7 and 3.x), then
# the backport from PyPI, then our own bundled implementation
try:
    from collections import OrderedDict
except ImportError:
    try:
        # Whatever http://pypi.python.org/pypi/ordereddict uses:
        from xxx import OrderedDict
    except ImportError:
        # Import local bundled implementation, e.g.
        from _ordereddict import OrderedDict

See http://code.activestate.com/recipes/576693-ordered-dictionary-for-py24/
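
As a rough illustration of the child_list/child_dict idea quoted above,
a container could keep only an ordered child_dict and build child_list
on demand. This is just a sketch: the class name EntitySketch and its
add() method are made up for the example and are not the actual Bio.PDB
code, and it assumes collections.OrderedDict is available (Python 2.7+
or 3.x).

```python
from collections import OrderedDict  # assumption: Python 2.7+ or 3.x


class EntitySketch(object):
    """Hypothetical container, not the real Bio.PDB.Entity class."""

    def __init__(self, id):
        self.id = id
        # Single mapping of child id -> child, kept in insertion order.
        self.child_dict = OrderedDict()

    def add(self, child):
        # Reject duplicate ids so the mapping stays unambiguous.
        if child.id in self.child_dict:
            raise ValueError("%r defined twice" % (child.id,))
        self.child_dict[child.id] = child

    @property
    def child_list(self):
        # Derived on demand rather than maintained as a second structure.
        return list(self.child_dict.values())


chain = EntitySketch("A")
for res_id in (10, 11, 12):
    chain.add(EntitySketch(res_id))
```

With this layout there is only one structure to update when adding a
child. Whether rebuilding the list on each access is fast enough would
need profiling.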

Are there any objections to this plan?


