[Biopython-dev] Questions on StructureBuilder, MMCIFParser, and MMCIFlex

Peter biopython at maubp.freeserve.co.uk
Sun Nov 1 16:28:50 EST 2009


On Sun, Nov 1, 2009 at 7:50 PM, Paul B <tallpaulinjax at yahoo.com> wrote:
>
> Hi,
>
> I'm a computer science guy trying to figure out some chemistry logic
> to support my thesis, so bear with me! :-) To sum it up, I'm not sure
> MMCIFParser is handling ATOM and MODEL records correctly
> because of this code in MMCIFParser:
>             if fieldname=="HETATM":
>                 hetatm_flag="H"
>             else:
>                 hetatm_flag=" "
> This causes ATOM (and potentially MODEL) records to die as seen
> in the exception below (I think!)

I'll answer that below.

> My questions are:
> 1. Am I correct that the current code is insufficient?
> 2. What additional logic beyond just recognizing whether it's a
> HETATM, ATOM or MODEL record needs to be added?
>
> Thanks!
>
> Paul
>
>
> Background:
> I understand MMCIFlex.py et cetera is commented out in the
> Windows setup.py package due to difficulties compiling it.

It is commented out (on all platforms) because we don't know
how to get setup.py to detect if flex and the relevant headers
are installed, which we would need to compile the code. I'm
not sure how this would work on Windows with an installer
(i.e. what is a run time dependency versus compile time).

> So I re-wrote MMCIFlex strictly in Python to emulate what

Now that would be very handy (IMO), if you can get it working.
Have you benchmarked it against the flex code?
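
For a rough comparison, something like the sketch below might do.
It times how long it takes to exhaust a tokenizer over one file.
The names time_tokenizer and whitespace_tokens are made up for
this example, and whitespace_tokens is only a stand-in: it is not
a CIF tokenizer and is not part of Biopython. You would pass in a
wrapper that yields tokens from your own code instead.

```python
import time


def whitespace_tokens(path):
    # Stand-in tokenizer (an assumption for this sketch): yields
    # whitespace-separated tokens and ignores CIF quoting rules.
    with open(path) as handle:
        for line in handle:
            for token in line.split():
                yield token


def time_tokenizer(make_tokens, path, repeats=3):
    # make_tokens is any callable taking a filename and returning
    # an iterable of tokens. Returns (best_seconds, token_count).
    best = None
    count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        count = 0
        for _token in make_tokens(path):
            count += 1
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best, count
```

Running it over the same file with each tokenizer and comparing
the best times would give a first idea of the difference.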

> I THINK the original MMCIFlex did. My version processes
> a .cif file one line at a time (readline()) then passes tokens
> back to MMCIF2Dict at each call to get_token(). That
> seems to work fine for unit testing of my MMCIFlex and
> MMCIF2Dict which I had to slightly re-write (to ensure it
> handled how I passed SEMICOLONS line back etc).
>
> However when I try and use this with MMCIFParser
> against the 2beg .cif file which has no HETATM records
> and, as I understand the definition, no disordered atoms
> I get:
>
> ...
>
> Basically what I think MIGHT be happening is MMCIFParser
> is currently only handling HETATM records; when some other
> kind of record comes in (ATOM, MODEL) it is treated
> incorrectly. See below.
>
> ...

Have you been able to test the flex code? If not, could you
give me a tiny script using the 2beg cif file which should
work? If that works, then the problem is in your flex
replacement code.
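
Coming back to the branch you quoted from MMCIFParser: as far as
I can tell, taken on its own it gives a blank flag to anything
that is not HETATM, so an ATOM row passes straight through it.
Here is the same test pulled out into a stand-alone function.
The name hetatm_flag_for is made up for this example and is not
part of Bio.PDB.

```python
def hetatm_flag_for(fieldname):
    # Same comparison as the quoted MMCIFParser branch: "H" for
    # HETATM rows, a single space for everything else.
    if fieldname == "HETATM":
        return "H"
    return " "


for name in ("HETATM", "ATOM"):
    print(repr(name), "->", repr(hetatm_flag_for(name)))
```

If the exception really does come from the parser, I would
expect it to be raised somewhere after this point, not by the
branch itself.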

Peter


