[Bioperl-l] PDB ATOM records: name, segid, etc.

Joe Krahn krahn@niehs.nih.gov
Mon, 15 Jul 2002 17:14:15 -0400


Andrew Dalke wrote:
> 
> Kris Boulez:
> > Forgive my ignorance (did NMR at university and left for the IT world
> > more than ten years ago): but what are these CNS files? Is this a
> > PDB-derived structure format?
> 
> CNS is Axel Brunger's replacement for XPLOR.  I understand there are some
> licensing issues behind it, but I don't know the details.
Axel was marketing X-PLOR through MSI. They ended up getting most of the
money while he did all of the work. Axel tried to get out of the agreement
by "re-writing" X-PLOR into CNS. The plan was doomed, because the agreement
covered derivative works, so CNS would be covered even if it had no
X-PLOR code. The result is that Axel infringed on the agreement, and
MSI got full rights to software funded 90% by We The People. MSI is now
Accelrys, and sells CNX, their version of CNS, making money from software
funded by taxpayers. An evil state of affairs. It's why all
government-funded research software should be open source.

> 
> > Does someone know where I could find descriptions of 'older' PDB
> > formats. The current parser is written based on a document titled
> > 'Protein Data Bank Contents Guide: version 2.1 (October 25, 1996)'.
> >
> > If so I would certainly add other versions.
I think 2.1 has SEGID documented. I assume you left it out of the
current version because later documents declared it obsolete.
But we crystallographers would like it reinstated.
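
To make the SEGID request concrete, here is a minimal sketch of reading
the fixed-column ATOM fields, with segID taken from columns 73-76 as in
the 2.x format guides. It is written in Python purely for illustration;
the function name parse_atom_record and the dictionary keys are made up
for this example and are not part of the BioPerl parser.

```python
def parse_atom_record(line):
    """Parse one fixed-column PDB ATOM/HETATM line into a dict.

    Hypothetical helper for illustration only. Column positions follow
    the PDB 2.x ATOM record layout (1-based columns in the comments).
    """
    # Pad short lines so files written without segID/element still parse.
    line = line.rstrip("\r\n").ljust(80)
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),         # cols 7-11
        # Keep the padding: " CA " (C-alpha) is not "CA  " (calcium).
        "name": line[12:16],               # cols 13-16
        "altloc": line[16].strip(),        # col 17
        "resname": line[17:20].strip(),    # cols 18-20
        "chain": line[21].strip(),         # col 22
        "resseq": int(line[22:26]),        # cols 23-26
        "x": float(line[30:38]),           # cols 31-38
        "y": float(line[38:46]),           # cols 39-46
        "z": float(line[46:54]),           # cols 47-54
        "occupancy": float(line[54:60]),   # cols 55-60
        "bfactor": float(line[60:66]),     # cols 61-66
        "segid": line[72:76].strip(),      # cols 73-76
        "element": line[76:78].strip(),    # cols 77-78
    }


# Build a test line from a format string so the columns cannot drift.
ATOM_FORMAT = ("ATOM  %5d %-4s%1s%3s %1s%4d%1s   "
               "%8.3f%8.3f%8.3f%6.2f%6.2f      %-4s%2s")
example = ATOM_FORMAT % (1, " CA ", "", "ALA", "A", 1, "",
                         1.0, 2.0, 3.0, 1.0, 20.0, "PROA", "C")
record = parse_atom_record(example)
```

Note that the atom name is kept with its padding, since the column
alignment is what distinguishes atoms like C-alpha from calcium, while
segid comes back as an empty string for files that do not carry it.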

All of the formats are documented here:
http://www.rcsb.org/pdb/info.html#File_Formats_and_Standards

<snip>

> The problem with these is that they require good support for:
>    - bond types/orders
>    - aromaticity
>    - chirality
I think the more complex ones can be supported as import-only
without too much effort.

> 
> and if you are only used to dealing with PDB files you likely
> won't know how these should be handled.  Plus, the different formats
> handle them differently, in a somewhat incompatible fashion.
> 
> Another place to look for details on these formats is Babel/OpenBabel
>   http://openbabel.sourceforge.net/
Could we just use Babel as a proxy for various formats, or do we want to
avoid executable proxies?

A related issue: what do you think about incorporating bond, angle, etc.
data, and forcefield parameters? And things like non-bonded
interaction analysis, H-bond detection, etc.? We have various bits of
Perl code that we are trying to organize, and BioPerl's PDB module
would be a good place to put them.
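
As a rough idea of the kind of analysis I mean, here is a sketch of a
distance-only H-bond candidate search. It is not our actual code, and
it is in Python just for illustration. The assumptions: atoms are plain
dictionaries with made-up keys (element, chain, resseq, x, y, z), only
N and O atoms are considered, the 3.5 Angstrom cutoff is an arbitrary
default, and there is no angle or hydrogen-position check, so real
detection would need more than this.

```python
import math


def find_hbond_candidates(atoms, max_dist=3.5):
    """Return (i, j, distance) for N/O atom pairs within max_dist.

    Hypothetical helper: a distance-only filter, skipping pairs that
    belong to the same residue. Indices refer to positions in `atoms`.
    """
    polar = [(i, a) for i, a in enumerate(atoms)
             if a["element"] in ("N", "O")]
    pairs = []
    for k, (i, a) in enumerate(polar):
        for j, b in polar[k + 1:]:
            # Atoms within one residue are not interesting here.
            if (a["chain"], a["resseq"]) == (b["chain"], b["resseq"]):
                continue
            d = math.dist((a["x"], a["y"], a["z"]),
                          (b["x"], b["y"], b["z"]))
            if d <= max_dist:
                pairs.append((i, j, d))
    return pairs
```

The all-pairs loop is quadratic in the number of polar atoms, which is
fine for a sketch but would want a grid or cell list for large
structures.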

Joe Krahn