[Biopython-dev] PDB tidy script

Thomas Hamelryck thamelry at binf.ku.dk
Fri Apr 3 13:31:05 UTC 2009


Hi everybody,

> I haven't been on this list long enough to know -- is Thomas still
> supporting the PDB module?


Yes and no. First, I've been pretty busy with establishing a group here in
Copenhagen, but it looks like I will have time for Bio.PDB again in the
future. For example, there's a set of classes dealing with RNA structure
coming up; I just have to submit it.
Second, I have no interest in doing anything beyond 3D stuff. I am not going
to implement header parsing for example. I know many people have donated
code, but in general this code is very messy and ad-hoc.
The PDB parser is pretty lean, fast and quite stable now - IMO parsing the
header should be the responsibility of a helper class, in order not to
overload the 3D code with a lot of stuff that most people will not use.
Also, the header info is for most purposes quite useless, especially in PDB
files. In fact, it makes no sense to parse the PDB header: if you need
header info, use the mmCIF files.


> If so, would he give his blessing to some more
> invasive changes to the PDB module, such as unifying PDBParser and
> parse_pdb_header? That separation has always seemed curiously vestigial to
> me.


You could provide a uniform interface, but please keep the 3D data
processing and the header processing in separate classes! The Structure
object has functionality to be 'annotated', so you could transfer data from
the header to the Structure object easily.
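To make the suggested split concrete, here is a minimal sketch of a uniform entry point that still keeps header parsing and 3D parsing in separate classes. Everything in it is a hypothetical stand-in rather than Bio.PDB code: `HeaderParser`, `CoordinateParser`, `Structure`, `parse_with_header` and the sample records are invented for illustration. Only the idea of an annotation dictionary named `xtra` mirrors what Bio.PDB entities carry.

```python
class Structure:
    """Stand-in for a structure object that can be annotated."""

    def __init__(self):
        self.atoms = []   # list of (x, y, z) tuples
        self.xtra = {}    # free-form annotations


class HeaderParser:
    """Hypothetical helper: reads header records only, never coordinates."""

    def parse(self, lines):
        header = {}
        for line in lines:
            if line.startswith("HEADER"):
                # Fixed columns of the PDB HEADER record.
                header["classification"] = line[10:50].strip()
                header["deposition_date"] = line[50:59].strip()
                header["id_code"] = line[62:66].strip()
        return header


class CoordinateParser:
    """Stand-in for the 3D parser: reads ATOM records, ignores the header."""

    def parse(self, lines):
        structure = Structure()
        for line in lines:
            if line.startswith("ATOM"):
                xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
                structure.atoms.append(xyz)
        return structure


def parse_with_header(lines):
    """Uniform interface: run both parsers, then annotate the structure."""
    lines = list(lines)
    structure = CoordinateParser().parse(lines)
    structure.xtra["header"] = HeaderParser().parse(lines)
    return structure


# Made-up sample records, built column by column so the offsets line up.
sample = [
    "HEADER    " + "HYDROLASE".ljust(40) + "03-APR-09" + "   " + "1ABC",
    "ATOM  %5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f"
    % (1, " CA", "", "ALA", "A", 1, "", 1.0, 2.0, 3.0),
]
structure = parse_with_header(sample)
```

Callers that only need coordinates can keep using the coordinate parser directly and never pay for the header code; the wrapper is the only place where the two meet.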


> If you look back over the history, there initially was no header parsing,
> it was a contribution from Kristian Rother, and I would agree, it is rather
> disjoint from the rest of the code.  One thing I personally wanted last
> time I was working with PDB files was to have secondary structure
> information (from the alpha helix and beta sheet lines in the header)
> mapped onto the residue objects automatically.


This is a good example of why header parsing is something of a red herring.
You really want to recompute that using some decent program like DSSP or
PSEA, or even an internal Bio.PDB procedure. But it's fine of course if you
want to add this!
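For anyone who does want the header-based mapping, a rough sketch of the idea follows. It assumes the fixed-column HELIX and SHEET layout of the PDB format, ignores insertion codes, and uses hypothetical names throughout (`secondary_structure_ranges`, `annotate`, the "H"/"E"/"-" codes, and the made-up sample records). It is not Bio.PDB code; in Bio.PDB the resulting code would presumably be stored in each residue's `xtra` dictionary.

```python
def secondary_structure_ranges(lines):
    """Collect (chain, first_resseq, last_resseq, code) from header records."""
    ranges = []
    for line in lines:
        if line.startswith("HELIX "):
            # HELIX: chain in column 20, residue numbers in 22-25 and 34-37.
            ranges.append((line[19], int(line[21:25]), int(line[33:37]), "H"))
        elif line.startswith("SHEET "):
            # SHEET: chain in column 22, residue numbers in 23-26 and 34-37.
            ranges.append((line[21], int(line[22:26]), int(line[33:37]), "E"))
    return ranges


def annotate(residue_ids, ranges):
    """Map each (chain, resseq) to 'H', 'E', or '-' when nothing covers it."""
    assignment = {}
    for chain, resseq in residue_ids:
        code = "-"
        for r_chain, first, last, r_code in ranges:
            if chain == r_chain and first <= resseq <= last:
                code = r_code
                break
        assignment[(chain, resseq)] = code
    return assignment


# Made-up records, built with format strings so the columns line up.
helix = "HELIX  %3d %3s %3s %1s %4d%1s %3s %1s %4d%1s%2d" % (
    1, "H1", "ASP", "A", 10, "", "LEU", "A", 12, "", 1)
sheet = "SHEET  %3d %3s%2d %3s %1s%4d%1s %3s %1s%4d%1s%2d" % (
    1, "A", 2, "VAL", "A", 20, "", "ILE", "A", 21, "", 0)

residue_ids = [("A", n) for n in range(9, 22)]
assignment = annotate(residue_ids, secondary_structure_ranges([helix, sheet]))
```

The result is only as good as the depositor's HELIX and SHEET records, which is the reason given above for preferring a recomputation with DSSP or PSEA.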

> I would suggest you try and get Thomas involved now for his input
> on the design (before you start coding), but if need be press ahead
> anyway for your own use, and he can always comment on your
> public branch.  I hope the two of you can work together on this, and
> if/when Thomas does stand down (or delegate), you could then be
> in an excellent position to take over as the Bio.PDB maintainer if
> that's what you wanted.


Sure, I'm open to this, but I'd like to stay involved if the 3D stuff is
altered, even just to discuss new designs.

Cheers,

-Thomas


