[Biopython-dev] Bio.PDB on Python 3

Eric Talevich eric.talevich at gmail.com
Tue Oct 19 22:01:20 EDT 2010


On Mon, Aug 16, 2010 at 9:47 AM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> Hi all,
>
> A while back I installed NumPy from their svn under Python 3, so that I
> could test more of Biopython. I hadn't really looked at Bio.PDB until
> recently because test_PDB.py depended on Bio.KDTree which needs
> some C code to be compiled (which we haven't tried yet).
>
[...]
>
> This has revealed there are at least two issues with Bio.PDB to be
> addressed (see below).
>
[...]
>
> ======================================================================
> ERROR: test_ExposureCN (__main__.Exposure)
> HSExposureCN.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "test_PDB.py", line 612, in setUp
>    structure=PDBParser(PERMISSIVE=True).get_structure('X', pdb_filename)
>  File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 64, in get_structure
>    self._parse(file.readlines())
>  File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 84, in _parse
>    self.trailer=self._parse_coordinates(coords_trailer)
>  File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/PDBParser.py",
> line 200, in _parse_coordinates
>    fullname, serial_number, element)
>  File "/home/xxx/lib/python3.1/site-packages/Bio/PDB/StructureBuilder.py",
> line 185, in init_atom
>    duplicate_atom=residue[name]
> TypeError: 'DisorderedResidue' object is not subscriptable
>

These errors occur when parsing Tests/PDB/a_structure.pdb under
permissive mode. In this structure, residue 3 is disordered, and that
triggers some exciting things.

The bug seems to be related to this method of DisorderedEntityWrapper
in Bio/PDB/Entity.py:

    def __getattr__(self, method):
        "Forward the method call to the selected child."
        if not hasattr(self, 'selected_child'):
            # Avoid problems with pickling
            # Unpickling goes into infinite loop!
            raise AttributeError
        return getattr(self.selected_child, method)


When running the test script, once we reach lines 185-186 in
StructureBuilder.py:

        if residue.has_id(name):
            duplicate_atom=residue[name]

it gets magical. The method 'has_id' is not defined on the
DisorderedResidue class. If residue is an instance of
DisorderedResidue (a subclass of DisorderedEntityWrapper) rather than
Residue (a subclass of Entity), then accessing residue.has_id on that
object calls __getattr__, which in turn returns
residue.selected_child.has_id, so the call ends up as
residue.selected_child.has_id(name).
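
The forwarding described above can be reproduced with a minimal sketch.
ToyResidue and ToyWrapper are hypothetical stand-ins written for this
illustration; they are not the Bio.PDB classes, and the guard inside
__getattr__ is simplified compared to the real one.

```python
class ToyResidue:
    """Hypothetical stand-in for Residue (not the Bio.PDB class)."""

    def __init__(self, atoms):
        self.atoms = dict(atoms)

    def has_id(self, name):
        return name in self.atoms


class ToyWrapper:
    """Hypothetical stand-in for DisorderedEntityWrapper."""

    def __init__(self, child):
        self.selected_child = child

    def __getattr__(self, method):
        # Python only calls __getattr__ when normal lookup has failed.
        if method == "selected_child":
            raise AttributeError(method)
        return getattr(self.selected_child, method)


wrapper = ToyWrapper(ToyResidue({"CA": "alpha carbon"}))
found = wrapper.has_id("CA")    # forwarded to ToyResidue.has_id
missing = wrapper.has_id("CB")  # forwarded as well
```

Because ToyWrapper defines no has_id of its own, both calls land on the
selected child.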

The next line raises a TypeError in Python 3, but not in Python 2 --
residue[name] seems to find the appropriate __getitem__ implementation
in Python 2 only.

My hypothesis is that Python 2 treats this magic-method call to
residue.__getitem__ as an ordinary attribute access, allowing
DisorderedEntityWrapper.__getattr__ to forward it to the selected
child, some Residue instance, which does implement __getitem__. In
Python 3, the residue[name] syntax seems to look up __getitem__
differently, without going through __getattr__, so the forwarding
never happens and the TypeError is raised. (I could be wrong about
all of this.)
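
A small experiment is consistent with this hypothesis on Python 3. The
classes below are hypothetical toys, not Bio.PDB code. The sketch relies
on the documented Python 3 behaviour that implicit special-method lookup
(such as obj[key]) goes through the type and bypasses __getattr__, while
an explicit obj.__getitem__ access is still an ordinary attribute lookup.
Whether the Python 2 difference comes from DisorderedEntityWrapper being
an old-style class there is an assumption, not something checked here.

```python
class Child:
    """Hypothetical child that really implements __getitem__."""

    def __getitem__(self, key):
        return "atom %s" % key


class Wrapper:
    """Hypothetical wrapper that forwards only via __getattr__."""

    def __init__(self, child):
        self.selected_child = child

    def __getattr__(self, name):
        if name == "selected_child":
            raise AttributeError(name)
        return getattr(self.selected_child, name)


w = Wrapper(Child())

# Explicit attribute access falls back to __getattr__ and is forwarded.
explicit = w.__getitem__("CA")

# Subscript syntax looks for __getitem__ on type(w) and skips __getattr__.
try:
    w["CA"]
    subscript_error = None
except TypeError as exc:
    subscript_error = str(exc)
```

On Python 3 the explicit call succeeds, while the subscript raises the
same kind of "not subscriptable" TypeError seen in the traceback.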

So here's what I'm doing:
 - In DisorderedEntityWrapper, implement __getitem__(self, id) such
that self.selected_child[id] is returned instead. This fixes most of
the errors but produces/uncovers three new ones. These new errors also
seem to indicate that magic methods on DisorderedEntityWrapper aren't
being handled through __getattr__ in Python 3.
 - Fix the new errors.
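
The first step might look roughly like the sketch below. This is not
the actual patch: Child and FixedWrapper are hypothetical toy classes,
and only __getitem__ is handled.

```python
class Child:
    """Hypothetical stand-in for the selected Residue."""

    def __init__(self, atoms):
        self._atoms = dict(atoms)

    def __getitem__(self, id):
        return self._atoms[id]


class FixedWrapper:
    """Hypothetical wrapper with an explicit __getitem__."""

    def __init__(self, child):
        self.selected_child = child

    def __getattr__(self, method):
        # Ordinary methods are still forwarded as before.
        if method == "selected_child":
            raise AttributeError(method)
        return getattr(self.selected_child, method)

    def __getitem__(self, id):
        # Defined on the class itself, so wrapper[id] works on Python 3.
        return self.selected_child[id]


wrapper = FixedWrapper(Child({"CA": "alpha carbon"}))
atom = wrapper["CA"]
```

Other special methods that callers use implicitly (for example __len__
or __contains__) would presumably need the same explicit treatment; which
ones Bio.PDB actually relies on is not established here.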


Once I get it working, I'll post the patch here before pushing it upstream.

Best,
Eric


