[BioPython] HETATM records retrieval

Sat Jul 14 18:23:07 UTC 2007

Orlando Döhring wrote:
> Dear community,
> 
> how should HETATM records be retrieved via Biopython? I assume it should be
> somewhere on the chain or residue level

In the PDB file you used, 1DHR, all the HETATM records are for non-protein
molecules (NAD = nicotinamide adenine dinucleotide, a cofactor, and
HOH = water) so they don't appear as part of the protein chains. I haven't
looked at this recently so it's not fresh in my mind.

When there are HETATM entries within a protein (e.g. alternative amino
acids) then they should be part of the chain.
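
As a rough sketch of how to pick such residues out once they are in a chain:
Bio.PDB stores a hetero flag as the first element of each residue's id tuple
(a single space for standard residues, "W" for water, and "H_" plus the
residue name for other hetero groups, e.g. "H_NAD"). The function names below
are my own illustration, not part of Biopython, and the file name in the usage
note is an assumption.

```python
def hetero_kind(residue_id):
    """Classify a Bio.PDB-style residue id tuple (hetfield, resseq, icode)."""
    hetfield = residue_id[0]
    if hetfield == "W":
        return "water"
    if hetfield.startswith("H_"):
        return "hetero"
    return "standard"


def iter_hetero_residues(structure):
    """Yield (chain_id, residue) for every residue flagged as HETATM."""
    for model in structure:
        for chain in model:
            for residue in chain:
                if hetero_kind(residue.id) != "standard":
                    yield chain.id, residue


def print_hetero_residues(pdb_path):
    """Parse a PDB file and list its hetero residues (requires Biopython)."""
    from Bio.PDB import PDBParser

    structure = PDBParser(QUIET=True).get_structure("query", pdb_path)
    for chain_id, residue in iter_hetero_residues(structure):
        print(chain_id, residue.get_resname(), residue.id)
```

With a local copy of the file you would call something like
print_hetero_residues("1dhr.pdb"). Which chain each hetero group ends up in
depends on the chain identifiers in the file itself.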

> Using the following basic sample code :
> 
>         for model in self.structure.get_iterator():
>             for chain in model.get_iterator():
>                 print chain.__repr__()
>                 for residue in chain.get_iterator():
>                     print residue.__repr__()

You don't need to explicitly call the get_iterator() method, so I much
prefer this style myself:

      structure = ...
      for model in structure:
          for chain in model:
              print(repr(chain))
              for residue in chain:
                  print(repr(residue))

I've also used the repr() function rather than calling the object's
special __repr__ method directly; it's the same end result but I find
this clearer.
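
To illustrate why the two styles are equivalent, here is a toy example in
plain Python. The Shelf class is hypothetical and only stands in for an
object that offers both an explicit get_iterator() method and the usual
__iter__ and __repr__ hooks; it is not Biopython code.

```python
class Shelf:
    """Hypothetical container standing in for an iterable parent object."""

    def __init__(self, name, items):
        self.name = name
        self._items = list(items)

    def get_iterator(self):
        # Explicit accessor, in the style of the quoted code.
        return iter(self._items)

    def __iter__(self):
        # Lets "for item in shelf" work without calling get_iterator().
        return self.get_iterator()

    def __repr__(self):
        return "<Shelf name=%s>" % self.name


shelf = Shelf("A", ["GLY", "ALA", "NAD"])

# Both loops visit the same items in the same order.
explicit = [item for item in shelf.get_iterator()]
implicit = [item for item in shelf]

# repr(obj) simply calls obj.__repr__() behind the scenes.
same_repr = repr(shelf) == shelf.__repr__()

print(explicit == implicit, same_repr, repr(shelf))
```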

Have you read the example on this page? In particular the use of the
PPBuilder or CaPPBuilder classes:

http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/ramachandran/calculate/#BioPython

I also urge you to look at the documentation written by the author of
Bio.PDB, Thomas Hamelryck, here:

http://biopython.org/DIST/docs/cookbook/biopdb_faq.pdf

This is much more useful than the automatic API documentation you linked to.

Peter