[Biopython-dev] [Bug 3166] New: Bio.PDB.DSSP fails to work on PDBs with HETATM

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Jan 5 17:33:13 UTC 2011


http://bugzilla.open-bio.org/show_bug.cgi?id=3166

           Summary: Bio.PDB.DSSP fails to work on PDBs with HETATM
           Product: Biopython
           Version: 1.50
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: macrozhu+biopy at gmail.com


Hi, 

I am current using BioPython 1.50. It seems Bio.PDB.DSSP fails if the input PDB
file contains HETATM. For example, for PDB entry 3jui, the DSSP.__init__()
function breaks with an exception:

KeyError: (' ', 547, ' ')

This is because residue 547 has id ('H_MSE', 547, ' '). But the function
Bio.PDB.DSSP.make_dssp_dict() never fill the het field in residue id when
parsing DSSP output. See line 135 in DSSP.py:

res_id=(" ", resseq, icode)

As a matter of fact, there is no way to figure out the het value from DSSP
output. Therefore, to address this issue, I suggest to revise the function
DSSP.__init__() so that it looks like below (revised lines marked with comments
# REVISED):


class DSSP(AbstractResiduePropertyMap):
    """
    Run DSSP on a pdb file, and provide a handle to the 
    DSSP secondary structure and accessibility.

    Note that DSSP can only handle one model.

    Example:
        >>> p=PDBParser()
        >>> structure=parser.get_structure("1fat.pdb")
        >>> model=structure[0]
        >>> dssp=DSSP(model, "1fat.pdb")
        >>> # print dssp data for a residue
        >>> secondary_structure, accessibility=dssp[(chain_id, res_id)]
    """
    def __init__(self, model, pdb_file, dssp="dssp"):
        """
        @param model: the first model of the structure
        @type model: L{Model}

        @param pdb_file: a PDB file
        @type pdb_file: string

        @param dssp: the dssp executable (ie. the argument to os.system)
        @type dssp: string
        """
        # create DSSP dictionary
        dssp_dict, dssp_keys=dssp_dict_from_pdb_file(pdb_file, dssp)
        dssp_map={}
        dssp_list=[]

        # Now create a dictionary that maps Residue objects to 
        # secondary structure and accessibility, and a list of 
        # (residue, (secondary structure, accessibility)) tuples
        for key in dssp_keys:
            chain_id, res_id=key
            chain=model[chain_id]

            ####################
            ### REVISED
            ####################
            # in DSSP, HET field is not considered
            # thus HETATM records may cause unnecessary exceptions
            # e.g. 3jui.
            try:
                res=chain[res_id]
            except KeyError:
                found = False
                # try again with all HETATM
                # consider resseq + icode
                res_seq_icode = ('%s%s' % (res_id[1],res_id[2])).strip()
                for r in chain:
                    if r.id[0] != ' ':
                        r_seq_icode = ('%s%s' % (r.id[1],r.id[2])).strip()
                        if r_seq_icode == res_seq_icode:
                            res = r
                            found = True
                            break

                if not found:
                    raise KeyError(res_id)

            ####################
            ### REVISED FINISHES
            ####################

            aa, ss, acc=dssp_dict[key]
            res.xtra["SS_DSSP"]=ss
            res.xtra["EXP_DSSP_ASA"]=acc
            # relative accessibility
            resname=res.get_resname()
            try:
                rel_acc=acc/MAX_ACC[resname]
                if rel_acc>1.0:
                    rel_acc=1.0
            except KeyError:
                rel_acc = 'NA' 
            res.xtra["EXP_DSSP_RASA"]=rel_acc
            # Verify if AA in DSSP == AA in Structure
            # Something went wrong if this is not true!
            resname=to_one_letter_code[resname]
            if resname=="C":
                # DSSP renames C in C-bridges to a,b,c,d,...
                # - we rename it back to 'C'
                if _dssp_cys.match(aa):
                    aa='C'

            ####################
            ### REVISED
            ####################
            if not (resname==aa or (res.id[0] != ' ' and aa=='X')):
            ####################
            ### REVISED FINISHES
            ####################
                raise PDBException("Structure/DSSP mismatch at "+str(res))
            dssp_map[key]=((res, ss, acc, rel_acc))
            dssp_list.append((res, ss, acc, rel_acc))
        AbstractResiduePropertyMap.__init__(self, dssp_map, dssp_keys,
dssp_list)


-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.



More information about the Biopython-dev mailing list