[BioPython] protein-ligand interactions

Peter Cock p.j.a.cock at googlemail.com
Thu Mar 19 13:31:30 UTC 2009


On Thu, Mar 19, 2009 at 12:55 PM, mitlox <mitlox at op.pl> wrote:
> I wrote this code:
> ------------------------------------------------------------------------------------------------
> import Bio.PDB
> import numpy
>
> pdb_code = "1E8W"
> pdb_filename = "1E8W.pdb" #not the full cage!

That comment was about the fact that the PDB file 1XI4 only contains
part of the full clathrin cage.

> structure = Bio.PDB.PDBParser().get_structure(pdb_code, pdb_filename)
> backBoneAtomNames = "N","CA","C","0", "CB"
> ...
> ------------------------------------------------------------------------------------------------
> to identified the backbone, but unfortunately it does not work.
>
> Maybe exist already to identified backbone in Biopython?

I don't understand what you were trying to do. Have you read the
Bio.PDB documentation about the hierarchy of structures, models,
chains, residues and atoms?
http://biopython.org/DIST/docs/cookbook/biopdb_faq.pdf

This is how I would solve the original question, finding the distance
between the C-alpha carbon to the closest atom is the ligand:

import Bio.PDB
import numpy

pdb_code = "1E8W"
pdb_filename = "1E8W.pdb"

structure = Bio.PDB.PDBParser().get_structure(pdb_code, pdb_filename)
model = structure[0]
chainA = model["A"]

def residue_dist_to_ligand(protein_residue, ligand_residue) :
    """Returns distance from the protein C-alpha to the closest ligand atom."""
    distances = []
    for atom in ligand_residue :
        diff_vector  = protein_residue["CA"].coord - atom.coord
        distances.append(numpy.sqrt(numpy.sum(diff_vector * diff_vector)))
    return min(distances)

#From looking at the PDB file, ligand is last residue in chain A, named QUE
ligand_res = chainA.child_list[-1]
assert ligand_res.resname == "QUE"
for protein_res in chainA.child_list[:-1] :
    dist = residue_dist_to_ligand(protein_res, ligand_res)
    if dist < 5.0 :
        print protein_res.resname, protein_res.id[1], dist

This gives the following output:

ILE 881 3.64203
VAL 882 3.58559
ALA 885 4.62673
THR 886 4.95211
ILE 963 4.64252
ASP 964 3.08788

If you wanted to, it should be simple change this to find the closest
distance between any part of each residue to any part of the ligand,
which should I expect give some distances less than 3A.

Peter



More information about the Biopython mailing list