[Biopython-dev] [Bug 2626] New: Bio.PDB mmCIFParser parse exceptions
bugzilla-daemon at portal.open-bio.org
Thu Oct 23 23:03:09 EDT 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2626
Summary: Bio.PDB mmCIFParser parse exceptions
Product: Biopython
Version: 1.48
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Other
AssignedTo: biopython-dev at biopython.org
ReportedBy: cjoldfield at gmail.com
I recently ran MMCIFParser over all of the PDB's mmCIF files and found that a
large number of them failed to parse correctly (a short script demonstrating
this is at the end). Of ~50k mmCIF files, 3891 failed to parse and another 1980
were missing fields in the mmCIF dictionary.
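For anyone wanting to repeat the survey, the tallying can be sketched roughly as follows. This is a minimal sketch, not the script used to produce the numbers above: the function name `survey` and its `parse_to_dict` parameter are hypothetical, and the parser is passed in as a callable so the counting logic does not depend on any one Biopython version.

```python
import collections


def survey(paths, parse_to_dict, required_key="_entry.id"):
    """Tally parse outcomes over many mmCIF files.

    parse_to_dict is any callable that maps a file path to a dict-like
    object (hypothetical parameter; supply the parser under test).
    """
    tally = collections.Counter()
    failures = []
    for path in paths:
        try:
            mmcif_dict = parse_to_dict(path)
        except Exception as exc:
            # Any error raised by the parser counts as a failed parse.
            tally["parse_failed"] += 1
            failures.append((path, repr(exc)))
            continue
        try:
            # Plain item lookup, as in the check script at the end.
            mmcif_dict[required_key]
        except KeyError:
            tally["missing_field"] += 1
            failures.append((path, "missing " + required_key))
        else:
            tally["ok"] += 1
    return tally, failures
```

With Biopython installed, passing the `MMCIF2Dict` class from `Bio.PDB.MMCIF2Dict` as `parse_to_dict` should reproduce the dictionary half of the check; the files would need to be gunzipped first if the parser in use only accepts plain file names.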
A few examples of files that failed to parse:
http://www.rcsb.org/pdb/files/1alw.cif.gz
http://www.rcsb.org/pdb/files/1det.cif.gz
http://www.rcsb.org/pdb/files/1tmy.cif.gz
A few with missing fields:
http://www.rcsb.org/pdb/files/1mfl.cif.gz
http://www.rcsb.org/pdb/files/1tfj.cif.gz
http://www.rcsb.org/pdb/files/1zn8.cif.gz
The problem seems to be that an error in one mmCIF table, such as an extra
field, propagates through the rest of the parse.
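To illustrate the kind of propagation I suspect, here is a toy example. It is NOT Biopython's parser, only a deliberately naive loop reader that chops a flat token stream into rows by column count; the function name `naive_loop_rows` and the data values are made up for illustration. A single stray value in the first row shifts every later row.

```python
def naive_loop_rows(column_names, tokens):
    """Toy loop_ reader: cut a flat token list into rows by column count.

    Illustration only; this is an assumption about how a misalignment
    could arise, not the actual Biopython implementation.
    """
    width = len(column_names)
    rows = []
    for start in range(0, len(tokens), width):
        rows.append(dict(zip(column_names, tokens[start:start + width])))
    return rows


columns = ["_atom_site.id", "_atom_site.type_symbol", "_atom_site.Cartn_x"]

# Three well-formed rows (made-up values).
clean = ["1", "N", "10.0", "2", "C", "11.5", "3", "O", "12.1"]
# Same data, but the first row carries one stray extra value.
dirty = ["1", "N", "10.0", "EXTRA", "2", "C", "11.5", "3", "O", "12.1"]

clean_rows = naive_loop_rows(columns, clean)
dirty_rows = naive_loop_rows(columns, dirty)

# Every row after the stray value is misaligned, and a partial row appears.
for row in dirty_rows:
    print(row)
```

A reader that validated the token count against the column count per table could at least confine the damage to the table that contains the error.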
Environment: x86_64 Gentoo Linux 2008, Biopython installed from source.
__CODE__
import sys
from Bio.PDB.MMCIFParser import MMCIFParser
from Bio.PDB.MMCIF2Dict import MMCIF2Dict

if len(sys.argv) != 2:
    print("usage: mmCifParseCheck.py <structFile>")
    sys.exit(0)

structFile = sys.argv[1]
resultString = ""

# Parse to a structure object and count non-hetero residues.
numRes = 0
parser = MMCIFParser()
try:
    structure = parser.get_structure('test', structFile)
    for model in structure:
        for chain in model:
            for residue in chain:
                # Hetero residues have an "H_" prefix in the first id field.
                if residue.id[0][:2] != "H_":
                    numRes += 1
except Exception:
    resultString += "parse to structure object failed\n"
else:
    resultString += "parse to structure object succeeded\n"

# Parse the whole mmCIF file to a dict.
try:
    mmcif_dict = MMCIF2Dict(structFile)
except Exception:
    resultString += "parse to dict failed\n"
else:
    resultString += "parse to dict succeeded\n"

# Look up a required entry. If the dict parse failed above, mmcif_dict is
# undefined and the resulting NameError is reported as a failed lookup.
try:
    entry_id = mmcif_dict['_entry.id']
except Exception:
    resultString += "key lookup failed\n"
else:
    resultString += "key lookup succeeded\n"

print(resultString)
print("number of non-het residues " + str(numRes))