[Biopython-dev] Parsing "element" out of PDB file

João Rodrigues anaryin at gmail.com
Thu Jun 24 18:25:45 UTC 2010

> And the center of mass calculation was for coarse-graining structures,
> right? What would be most useful there?

> (a) Give unknown atoms a weight of 0.0, so CoM essentially disregards them

The CoM calculation counts the number of atoms, so a weight of 0.0 would not
work anyway.

>  (b) Give unknown atoms a weight of None, and have CoM check for this and
> disregard those atoms (similar effect) -- preferably issuing a warning

I'd prefer this: exclude those atoms from the calculation. But then this might
affect the location of the center of mass.

> (c) Like (b), but CoM raises an exception
> (d) Give CoM a keyword argument for how to treat this (e.g.
> strict=True/False), so coarse-graining can be permissive but direct use of
> CoM can raise an exception if desired. (However, if warnings are used then
> the warnings module already lets you convert specific warnings into
> exceptions.)

My suggestion: CoM can be either geometric or mass-weighted. The first
assumes equal mass for every atom, the second does not. If an atom has no
known mass, CoM would default to geometric and issue a warning.
Having a flag in CoM could also be valuable, but I guess it would be
redundant with the warning/exception (permissive/strict) in the Atom class.
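
A rough sketch of that fallback, not a proposed patch. The function name, the
(coord, mass) input format and the MissingMassWarning category are all made up
for illustration; the real thing would work on Atom objects:

```python
import warnings


class MissingMassWarning(UserWarning):
    """Hypothetical warning category for atoms with no known mass."""


def center_of_mass(atoms, geometric=False):
    """Return the center of a list of (coord, mass) pairs as an (x, y, z) tuple.

    coord is an (x, y, z) tuple; mass is a float, or None if unknown.
    This input format is an assumption made for the sketch.
    """
    if not atoms:
        raise ValueError("no atoms given")

    masses = [mass for _, mass in atoms]

    # Any unknown mass switches the whole calculation to the geometric center.
    if not geometric and any(m is None for m in masses):
        warnings.warn(
            "unknown atomic mass; falling back to geometric center",
            MissingMassWarning,
        )
        geometric = True

    # Geometric center: every atom gets the same weight.
    if geometric:
        masses = [1.0] * len(atoms)

    total = sum(masses)
    if total == 0:
        raise ValueError("total mass is zero")

    return tuple(
        sum(coord[i] * m for (coord, _), m in zip(atoms, masses)) / total
        for i in range(3)
    )
```

With this, strict behaviour needs no extra flag: a caller can run
warnings.simplefilter("error", MissingMassWarning) and the fallback becomes an
exception instead.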

>> >> On a separate point, if you have an old fashioned PDB file without the
>> >> element column, you can probably work out the element anyway. ...
>> >
>> > For non-HETATMs it's possible from the first letter of the atom name (or
>> > it is H if the first letter is a digit). For HETATMs, names match elements
>> > IIRC.
>> >
>> > Do you think it's worth a try? It shouldn't be hard to write and the
>> > cases where it would fail would be sporadic.
>> Eric - what do you think?
> Sounds useful to me. Where would it fail, and how should failures be
> treated? Unrecognized atom names, and then issue a warning and leave the
> element attribute blank? (See options above...)

I'd implement it in the Atom class. Instead of having this check (lines

        elif len(element)>2 or element != element.upper() or element !=
            raise ValueError(element)

there would be a check against IUPACData.atom_weights.keys(). If the element
is not found there, it would try to infer the element from the atom name and
issue a warning. If that also fails, an exception is raised.
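
Roughly what I have in mind, as a standalone sketch. The function names and
the small KNOWN_ELEMENTS set are placeholders invented here; the real check
would use the weight table in IUPACData and live inside the Atom class:

```python
import warnings

# Placeholder for the element table; deliberately tiny and incomplete.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "S", "P", "SE", "FE", "ZN", "CA", "MG"}


def guess_element(atom_name, is_hetatm):
    """Guess an element symbol from a PDB atom name, or return None."""
    name = atom_name.strip().upper()
    if not name:
        return None

    if not is_hetatm:
        # Non-HETATM: a leading digit means hydrogen (e.g. "1HB"),
        # otherwise the first letter is taken as the element.
        first = name[0]
        if first.isdigit():
            return "H"
        return first if first in KNOWN_ELEMENTS else None

    # HETATM: assume the name (minus digits and primes) is the element.
    letters = "".join(ch for ch in name if ch.isalpha())
    return letters if letters in KNOWN_ELEMENTS else None


def assign_element(element, atom_name, is_hetatm=False):
    """Validate the element column, falling back to the atom name."""
    candidate = (element or "").strip().upper()
    if candidate in KNOWN_ELEMENTS:
        return candidate

    guessed = guess_element(atom_name, is_hetatm)
    if guessed is None:
        # Both the element column and the atom name failed.
        raise ValueError(
            "cannot determine element for atom %r (element field %r)"
            % (atom_name, element)
        )

    warnings.warn(
        "element %r not recognized; guessed %r from atom name %r"
        % (element, guessed, atom_name)
    )
    return guessed
```

The HETATM branch is where the sporadic failures would come from, e.g. a
ligand atom named "CA" would be read as calcium.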

Sounds good?


