[Biopython] BioJava-like seqres alignment for Bio.PDB

Tue Jun 29 23:04:41 UTC 2010

On 06/28/2010 11:36 PM, Bryan Lunt wrote:
> Does anyone have any code for easy alignment between the SEQRES entry
> in a pdb file and the actual ATOM/HETATM entries in the chain?
>
> In biojava, this is just one of the options when you parse a PDB file,
> it would certainly be useful.
>    

How does BioJava do this?

RCSB added this mapping explicitly to the XML-formatted (PDBML) files
several years ago. It looks like this:

       <PDBx:pdbx_poly_seq_scheme asym_id="A" entity_id="1" mon_id="SER" seq_id="3">
          <PDBx:auth_mon_id>SER</PDBx:auth_mon_id>
          <PDBx:auth_seq_num>145</PDBx:auth_seq_num>
          <PDBx:hetero>n</PDBx:hetero>
          <PDBx:ndb_seq_num>3</PDBx:ndb_seq_num>
          <PDBx:pdb_ins_code></PDBx:pdb_ins_code>
          <PDBx:pdb_mon_id>SER</PDBx:pdb_mon_id>
          <PDBx:pdb_seq_num>145</PDBx:pdb_seq_num>
          <PDBx:pdb_strand_id>A</PDBx:pdb_strand_id>
       </PDBx:pdbx_poly_seq_scheme>

That is, SEQRES position 3 (seq_id) in chain A corresponds to residue
number 145 (pdb_seq_num) in the coordinate records of this protein.

In any case, having a function in Biopython that provides this mapping
(in both directions) would be extremely useful.

Thanks,
Reece