[Biopython-dev] Parsing "element" out of PDB file

Thu Jun 24 08:32:46 UTC 2010

On Tue, Jun 22, 2010 at 8:25 PM, João Rodrigues <anaryin at gmail.com> wrote:
> Hello all,
>
> I've been using some non-standard PDB files output by some programs, and
> they lack the chemical element column in each ATOM line. ... This would be no
> problem at all, but I've added a "mass" attribute to the Atom object defined like this:
>
>        self.mass = IUPACData.atom_weights[element]
>
> I've added the ? to the atom_weights list as I thought it would deal with
> the empty element cases.

I wonder if using None or NaN would be better than zero here? Or just raising
an exception. This is difficult for me to say without a better idea of what you
will be using the atomic weights for.
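
To make the trade-off concrete, here is a minimal sketch of the NaN and
exception options. The weight table is a small hypothetical stand-in (not
the real IUPACData contents), and the atom_mass helper and its "strict"
flag are names made up for illustration. The point is that NaN propagates
through sums, so a missing element cannot silently shrink a total mass the
way zero would:

```python
import math

# Hypothetical, abbreviated weight table for illustration only;
# a real table would cover every element.
ATOM_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def atom_mass(element, strict=False):
    """Return the atomic weight for element, or NaN if it is unknown.

    With strict=True an unknown or missing element raises ValueError
    instead of returning NaN.
    """
    # Normalise "c", " C " and None to a dictionary key ("C" or "").
    key = (element or "").strip().capitalize()
    try:
        return ATOM_WEIGHTS[key]
    except KeyError:
        if strict:
            raise ValueError("Unknown or missing element: %r" % (element,))
        return float("nan")


# One atom has no element, so the total becomes NaN rather than a
# plausible-looking but wrong number.
masses = [atom_mass(e) for e in ("C", "N", "")]
total = sum(masses)
print(math.isnan(total))
```

Returning None instead would make the same sum fail with a TypeError,
which is closer in spirit to raising an exception at lookup time.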

On a separate point, if you have an old-fashioned PDB file without the element
column, you can probably work out the element anyway. For example, CA in
a normal amino acid residue means the alpha carbon, so the element is
carbon (although in a HETATM record it could be calcium, I think).
So I think it would be possible to infer the element in many cases (but not
all). However, this is going to be a reasonable amount of work to write and
test. How common is this kind of PDB file in the work you are doing? Do
many modelling packages omit the element?
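
As a rough illustration of that kind of inference, here is a sketch only.
The guess_element function, its arguments and the ion list are all
hypothetical, and the rules are assumptions rather than anything in
Bio.PDB. It trusts the first letter of the atom name for ATOM records,
and for HETATM records it only commits when the atom name matches the
residue name of a known ion; otherwise it gives up and returns None:

```python
# Hypothetical, incomplete set of single-atom ions whose atom name and
# residue name are both the element symbol (e.g. calcium as "CA"/"CA").
ION_NAMES = {"CA", "MG", "ZN", "FE", "NA", "CL", "MN", "CU", "K"}

# Elements expected in standard protein and nucleic acid residues.
COMMON_ELEMENTS = "CNOSHP"


def guess_element(atom_name, resname="", hetero=False):
    """Guess an element symbol from a PDB atom name, or None if unsure."""
    # Drop padding and leading digits, e.g. "1HB " -> "HB".
    name = atom_name.strip().upper().lstrip("0123456789")
    if not name or not name[0].isalpha():
        return None
    if hetero:
        # Assumption: a HETATM is an ion only when its atom name equals
        # its residue name; anything else is treated as ambiguous.
        if name == resname.strip().upper() and name in ION_NAMES:
            return name.capitalize()
        return None
    # ATOM record: in standard residues the first letter is the element,
    # so "CA" is the alpha carbon and "OXT" is an oxygen.
    first = name[0]
    return first if first in COMMON_ELEMENTS else None


print(guess_element("CA", resname="ALA"))               # C
print(guess_element("CA", resname="CA", hetero=True))   # Ca
print(guess_element("C1", resname="LIG", hetero=True))  # None
```

A real implementation would need many more special cases (and tests),
which is the "reasonable amount of work" mentioned above.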

Have you contacted the program authors to request they include the
element column in future?

Peter