[Biopython-dev] [Bug 3166] New: Bio.PDB.DSSP fails to work on PDBs with HETATM
bugzilla-daemon at portal.open-bio.org
bugzilla-daemon at portal.open-bio.org
Wed Jan 5 17:33:13 UTC 2011
http://bugzilla.open-bio.org/show_bug.cgi?id=3166
Summary: Bio.PDB.DSSP fails to work on PDBs with HETATM
Product: Biopython
Version: 1.50
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: macrozhu+biopy at gmail.com
Hi,
I am current using BioPython 1.50. It seems Bio.PDB.DSSP fails if the input PDB
file contains HETATM. For example, for PDB entry 3jui, the DSSP.__init__()
function breaks with an exception:
KeyError: (' ', 547, ' ')
This is because residue 547 has id ('H_MSE', 547, ' '). But the function
Bio.PDB.DSSP.make_dssp_dict() never fill the het field in residue id when
parsing DSSP output. See line 135 in DSSP.py:
res_id=(" ", resseq, icode)
As a matter of fact, there is no way to figure out the het value from DSSP
output. Therefore, to address this issue, I suggest to revise the function
DSSP.__init__() so that it looks like below (revised lines marked with comments
# REVISED):
class DSSP(AbstractResiduePropertyMap):
"""
Run DSSP on a pdb file, and provide a handle to the
DSSP secondary structure and accessibility.
Note that DSSP can only handle one model.
Example:
>>> p=PDBParser()
>>> structure=parser.get_structure("1fat.pdb")
>>> model=structure[0]
>>> dssp=DSSP(model, "1fat.pdb")
>>> # print dssp data for a residue
>>> secondary_structure, accessibility=dssp[(chain_id, res_id)]
"""
def __init__(self, model, pdb_file, dssp="dssp"):
"""
@param model: the first model of the structure
@type model: L{Model}
@param pdb_file: a PDB file
@type pdb_file: string
@param dssp: the dssp executable (ie. the argument to os.system)
@type dssp: string
"""
# create DSSP dictionary
dssp_dict, dssp_keys=dssp_dict_from_pdb_file(pdb_file, dssp)
dssp_map={}
dssp_list=[]
# Now create a dictionary that maps Residue objects to
# secondary structure and accessibility, and a list of
# (residue, (secondary structure, accessibility)) tuples
for key in dssp_keys:
chain_id, res_id=key
chain=model[chain_id]
####################
### REVISED
####################
# in DSSP, HET field is not considered
# thus HETATM records may cause unnecessary exceptions
# e.g. 3jui.
try:
res=chain[res_id]
except KeyError:
found = False
# try again with all HETATM
# consider resseq + icode
res_seq_icode = ('%s%s' % (res_id[1],res_id[2])).strip()
for r in chain:
if r.id[0] != ' ':
r_seq_icode = ('%s%s' % (r.id[1],r.id[2])).strip()
if r_seq_icode == res_seq_icode:
res = r
found = True
break
if not found:
raise KeyError(res_id)
####################
### REVISED FINISHES
####################
aa, ss, acc=dssp_dict[key]
res.xtra["SS_DSSP"]=ss
res.xtra["EXP_DSSP_ASA"]=acc
# relative accessibility
resname=res.get_resname()
try:
rel_acc=acc/MAX_ACC[resname]
if rel_acc>1.0:
rel_acc=1.0
except KeyError:
rel_acc = 'NA'
res.xtra["EXP_DSSP_RASA"]=rel_acc
# Verify if AA in DSSP == AA in Structure
# Something went wrong if this is not true!
resname=to_one_letter_code[resname]
if resname=="C":
# DSSP renames C in C-bridges to a,b,c,d,...
# - we rename it back to 'C'
if _dssp_cys.match(aa):
aa='C'
####################
### REVISED
####################
if not (resname==aa or (res.id[0] != ' ' and aa=='X')):
####################
### REVISED FINISHES
####################
raise PDBException("Structure/DSSP mismatch at "+str(res))
dssp_map[key]=((res, ss, acc, rel_acc))
dssp_list.append((res, ss, acc, rel_acc))
AbstractResiduePropertyMap.__init__(self, dssp_map, dssp_keys,
dssp_list)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
More information about the Biopython-dev
mailing list