[Biopython] Superimposer troubles

Willis, Jordan R jordan.r.willis at Vanderbilt.Edu
Tue Apr 2 04:40:36 UTC 2013


Hello List,


I'm having trouble working through some issues with the Superimposer for all-atom superpositions. We work on protein design, and our final PDB files often differ from our input in atom count and sometimes in composition. I'm a big fan of the Superimposer, so we have implemented it like this:

from Bio.PDB import PDBParser, Superimposer

p = PDBParser()
native_pdb = p.get_structure("input", "input.pdb")
designed_pdb = p.get_structure("output", "output.pdb")


native_ca_atoms = []
native_all_atoms = []
designed_ca_atoms = []
designed_all_atoms = []
for native_residue, designed_residue in zip(native_pdb.get_residues(), designed_pdb.get_residues()):
    native_ca_atoms.append(native_residue['CA'])
    designed_ca_atoms.append(designed_residue['CA'])
    # zip() pairs atoms by position and stops at the shorter residue
    for native_atom, designed_atom in zip(native_residue.get_list(), designed_residue.get_list()):
        native_all_atoms.append(native_atom)
        designed_all_atoms.append(designed_atom)


superpose_ca = Superimposer()
superpose_all = Superimposer()

superpose_ca.set_atoms(native_ca_atoms, designed_ca_atoms)
superpose_ca.apply(designed_pdb.get_atoms())
ca_rms = my_spiffy_rms_calculator(native_ca_atoms, designed_ca_atoms)


superpose_all.set_atoms(native_all_atoms, designed_all_atoms)
superpose_all.apply(designed_pdb.get_atoms())
all_rms = my_spiffy_rms_calculator(native_all_atoms, designed_all_atoms)


For the CA atoms it's not really a big deal, since everything we design has a CA atom. However, when we go to all atoms, the designed residue and the native residue can be different, which leads to a different number of atoms. I didn't realize it, but the zip function was truncating the two lists to the length of the shorter one without necessarily matching up the atoms. It would just hack off part of the longer list! As a result, the Superimposer never failed, because it always received two lists of equal length. Is the Superimposer smart enough to minimize the RMSD no matter how, or in what order, the lists are input? For instance, if I put the same arginine atoms backwards in one list and forwards in the other, would it still give a 0.0 RMSD?
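
To make the zip problem concrete, here is a minimal sketch using plain lists of atom names instead of real Bio.PDB objects. The names native_names, designed_names and shared_atom_names are made up for illustration; the atom names are the standard PDB names for arginine and leucine.

```python
# Atom names for an arginine (native) and a leucine (designed) at the same position.
native_names = ["N", "CA", "C", "O", "CB", "CG", "CD", "NE", "CZ", "NH1", "NH2"]
designed_names = ["N", "CA", "C", "O", "CB", "CG", "CD1", "CD2"]

# zip() pairs by position and stops at the shorter list, so the last three
# arginine atoms are dropped and CD/NE get paired with CD1/CD2.
zipped = list(zip(native_names, designed_names))


def shared_atom_names(native, designed):
    """Return the atom names present in both residues, in native order."""
    designed_set = set(designed)
    return [name for name in native if name in designed_set]


# Pairing by name keeps only atoms that really correspond to each other.
shared = shared_atom_names(native_names, designed_names)

print(len(zipped), zipped[6], zipped[7])
print(shared)
```

With real residues the same idea would presumably look up each native atom's name in the designed residue (for example via atom.get_id() and designed_residue[name]) and skip atoms that are missing, but I have not tested that here.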

Thank you for your feedback,
Jordan

PS. Does the superimposer.rms method give back the RMSD of whatever atoms you put into it? Or is it always the CA atoms?
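
To illustrate the ordering question, here is a small sketch of an RMSD that pairs coordinates by list position, with no fitting step. It is not Biopython's implementation; paired_rmsd and the coordinates are made up for illustration. A rigid-body fit can rotate and translate one set as a whole, but it cannot re-pair atoms, so the pairing implied by list order still matters.

```python
import math


def paired_rmsd(coords_a, coords_b):
    """RMSD over coordinates paired by list position (no superposition)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists differ in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates standing in for four atoms of one residue.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (3.0, 1.5, 1.0)]

same_order = paired_rmsd(coords, coords)            # identical pairing
reversed_order = paired_rmsd(coords, coords[::-1])  # same atoms, mispaired

print(same_order, reversed_order)
```

The same atoms in the same order give exactly zero, while the reversed list gives a non-zero value even though no atom has moved.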



