[Biopython-dev] mmCIF parser added to Bio.PDB

Thomas Hamelryck thamelry at vub.ac.be
Fri Oct 10 08:14:30 EDT 2003

Hi everybody,

Due to popular demand (by Cath Lawrence :-), I've added mmCIF support
to Bio.PDB. In short, mmCIF is a file format used to describe crystal
structures. The mmCIF format solves many problems associated with
the older PDB format (or at least that's what I'm told :-).


>>> from Bio.PDB.MMCIFParser import MMCIFParser
>>> parser = MMCIFParser()
>>> structure = parser.get_structure("test", "1FAT.cif")
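
Once parsed, the structure can be walked level by level. The following is only a sketch: count_entities is a hypothetical helper (not part of Bio.PDB), and it assumes a Biopython version in which each level of the hierarchy (structure, model, chain, residue) is iterable over its children.

```python
def count_entities(structure):
    """Count models, chains, residues and atoms by nested iteration.

    Assumption: every level of the hierarchy is iterable over its
    children (structure -> model -> chain -> residue -> atom).
    """
    counts = {"models": 0, "chains": 0, "residues": 0, "atoms": 0}
    for model in structure:
        counts["models"] += 1
        for chain in model:
            counts["chains"] += 1
            for residue in chain:
                counts["residues"] += 1
                for atom in residue:
                    counts["atoms"] += 1
    return counts
```

Under that assumption, calling count_entities(structure) on the object returned by get_structure gives a quick sanity check that the file was read in full.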

In addition, there is also MMCIF2Dict, which makes the contents of an
mmCIF file available as a Python dictionary (with the data tags as keys),
so you can easily address all data in the mmCIF file.


>>> from Bio.PDB.MMCIF2Dict import MMCIF2Dict
>>> d = MMCIF2Dict("1FAT.cif")

>>> print(d["_database_PDB_matrix.entry_id"])

>>> print(d["_struct_site.id"])
['CAA', 'MNA', 'CAB', 'MNB', 'CAC', 'MNC', 'CAD', 'MND']

>>> d["_computing.structure_solution"]
"'X-PLOR 3.1'"
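
As the last example shows, a value can come back with its mmCIF quotes still attached, and some tags hold a list while others hold a single string. A small sketch of how one might normalise that; clean_value is a hypothetical helper (not part of Bio.PDB), and the dictionary below is a plain stand-in built from the values shown above rather than a real MMCIF2Dict object.

```python
def clean_value(value):
    """Strip one pair of matching outer quotes from a string,
    or from every string in a list of strings."""
    if isinstance(value, list):
        return [clean_value(v) for v in value]
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


# Stand-in for an MMCIF2Dict result, using the values shown above.
d = {
    "_struct_site.id": ['CAA', 'MNA', 'CAB', 'MNB', 'CAC', 'MNC', 'CAD', 'MND'],
    "_computing.structure_solution": "'X-PLOR 3.1'",
}

print(clean_value(d["_computing.structure_solution"]))  # X-PLOR 3.1
print(len(clean_value(d["_struct_site.id"])))           # 8
```

The helper leaves unquoted strings untouched, so it should be safe to apply to every value in the dictionary.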

The modules use C/Lex code to parse the file, so parsing is reasonably fast.
Note that compilation requires a C compiler and GNU Lex (i.e. Flex). There is
no support for writing mmCIF files, and I'm not planning to work on that
either. I'd be interested to hear about possible bugs, requested features
etc., but it should work reasonably well as is.


Thomas Hamelryck
Vrije Universiteit Brussel (VUB)
