[Biopython-dev] flex, setup.py and Bio.PDB.mmCIF (Bug 2619)

Michiel de Hoon mjldehoon at yahoo.com
Fri Mar 15 15:22:30 UTC 2013

Hi João,

--- On Fri, 3/15/13, João Rodrigues <anaryin at gmail.com> wrote:
> What we perhaps should have is the parsers, whatever they are, populating the same type of object in the end (PDBParser and mmCIFParser).

I think that there are two options:

1) PDBParser and mmCIFParser both produce Structure objects, with any additional information found in mmCIF files stored as additional attributes of Structure objects (and the same thing for PDB files);

2) We make a module mmCIF with a function mmCIF.read that reads an mmCIF file and stores the information in an mmCIF.Record object that is optimized for storing mmCIF information. The mmCIFParser uses mmCIF.read, and pulls out the necessary information from the mmCIF.Record object to create a Structure object (which is free of mmCIF-specific stuff). Users can make Structure objects if that is all they need, or use mmCIF.read if they want all the information in an mmCIF file.
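
To make option (2) concrete, here is a minimal Python sketch of the layering. Everything in it is hypothetical: Record, read, structure_from_record and the simplified Structure and Atom classes are illustrative names, not existing Bio.PDB code, and the reader handles only a tiny subset of mmCIF syntax (no multi-line text fields, no unbalanced quotes).

```python
import io
import shlex
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical mmCIF.Record: keeps everything found in the file."""
    name: str = ""
    items: dict = field(default_factory=dict)  # single "tag value" pairs
    loops: list = field(default_factory=list)  # each loop is a list of row dicts


def read(handle):
    """Hypothetical mmCIF.read: toy reader for data_, loop_ and tag lines."""
    record = Record()
    loop_tags = None  # None while we are outside a loop_ block
    loop_rows = []

    def close_loop():
        nonlocal loop_tags, loop_rows
        if loop_tags:
            rows = [dict(zip(loop_tags, row)) for row in loop_rows]
            record.loops.append(rows)
        loop_tags, loop_rows = None, []

    for raw in handle:
        line = raw.strip()
        if not line or line.startswith("#"):
            close_loop()  # blank line or '#' ends the current category
        elif line.startswith("data_"):
            record.name = line[len("data_"):]
        elif line == "loop_":
            close_loop()
            loop_tags = []
        elif line.startswith("_"):
            if loop_tags is not None and not loop_rows:
                loop_tags.append(line)  # column header inside a loop
            else:
                close_loop()
                tag, value = shlex.split(line)  # shlex handles quoted values
                record.items[tag] = value
        elif loop_tags is not None:
            loop_rows.append(shlex.split(line))  # one data row of the loop
    close_loop()
    return record


@dataclass
class Atom:
    name: str
    coord: tuple


@dataclass
class Structure:
    """Simplified stand-in for a Structure: no mmCIF-specific fields."""
    id: str
    atoms: list


def structure_from_record(record):
    """What an mmCIFParser could do: pull only coordinates out of a Record."""
    atoms = []
    for loop in record.loops:
        for row in loop:
            if "_atom_site.Cartn_x" in row:
                coord = tuple(float(row["_atom_site.Cartn_" + axis])
                              for axis in "xyz")
                atoms.append(Atom(row["_atom_site.label_atom_id"], coord))
    return Structure(record.name, atoms)


SAMPLE = """\
data_1ABC
_struct.title 'Test protein'
loop_
_atom_site.label_atom_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
N 1.0 2.0 3.0
CA 2.5 2.0 3.0
#
"""

record = read(io.StringIO(SAMPLE))            # full mmCIF content
structure = structure_from_record(record)     # format-independent view
```

In this sketch a user who only needs coordinates works with structure, while a user who wants items such as _struct.title goes to record directly.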

Currently the situation is closer to (2), with MMCIF2Dict playing the role of mmCIF.read, but I don't much like the way MMCIF2Dict stores information.

Since I am not a power user of Bio.PDB, other people may have more insight into whether (1) or (2) (or something completely different) is best.

> Is this the current status of the mmCIF?

I just replaced the flex-dependent part of mmCIF by pure Python code, but I didn't change the functionality or usage of the mmCIF code. So the current status is still the same as described in the documentation.

Speaking of which, we have a Biopython Structural Bioinformatics FAQ (i.e. how to use the Bio.PDB module) on the Biopython website with additional information on Bio.PDB, including some information on things that are not in the main Biopython Tutorial. Perhaps this is a good time to integrate this FAQ into the main documentation?

