[Biopython] BioJava-like seqres alignment for Bio.PDB

Peter biopython at maubp.freeserve.co.uk
Wed Jun 30 13:44:10 UTC 2010


On Wed, Jun 30, 2010 at 12:04 AM, Reece Hart <reece at berkeley.edu> wrote:
> On 06/28/2010 11:36 PM, Bryan Lunt wrote:
>>
>> Does anyone have any code for easy alignment between the SEQRES entry
>> in a pdb file and the actual ATOM/HETATM entries in the chain?
>>
>> In biojava, this is just one of the options when you parse a PDB file,
>> it would certainly be useful.
>>
>
> How does BioJava do this?
>
> RCSB added this mapping explicitly in the XML formatted files several years
> ago. It looks like this:
>
>      <PDBx:pdbx_poly_seq_scheme asym_id="A" entity_id="1" mon_id="SER" seq_id="3">
>         <PDBx:auth_mon_id>SER</PDBx:auth_mon_id>
>         <PDBx:auth_seq_num>145</PDBx:auth_seq_num>
>         <PDBx:hetero>n</PDBx:hetero>
>         <PDBx:ndb_seq_num>3</PDBx:ndb_seq_num>
>         <PDBx:pdb_ins_code></PDBx:pdb_ins_code>
>         <PDBx:pdb_mon_id>SER</PDBx:pdb_mon_id>
>         <PDBx:pdb_seq_num>145</PDBx:pdb_seq_num>
>         <PDBx:pdb_strand_id>A</PDBx:pdb_strand_id>
>      </PDBx:pdbx_poly_seq_scheme>
>
> That is, sequence position 3 is resid position 145 in this protein.

That looks like a good reason to have a PDB XML parser (as trying to do
this from the plain text PDB is probably fiddly).
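
As a rough illustration of what such a parser could offer, here is a minimal
sketch that builds the mapping in both directions from pdbx_poly_seq_scheme
records like the one quoted above. It is not existing Bio.PDB code: the
function names, the wrapper element, and the namespace URI in the sample are
made up for the example, and it assumes that entries without an auth_seq_num
can simply be skipped. Tags are matched by local name so the real namespace
does not need to be hard-coded.

```python
import xml.etree.ElementTree as ET

# Sample data: the record from the quoted message, trimmed and wrapped in a
# root element. The namespace URI is a placeholder for this example only;
# real PDB XML files declare their own.
SAMPLE = """\
<PDBx:datablock xmlns:PDBx="urn:example:pdbx">
  <PDBx:pdbx_poly_seq_scheme asym_id="A" entity_id="1" mon_id="SER" seq_id="3">
    <PDBx:auth_seq_num>145</PDBx:auth_seq_num>
    <PDBx:pdb_ins_code></PDBx:pdb_ins_code>
    <PDBx:pdb_strand_id>A</PDBx:pdb_strand_id>
  </PDBx:pdbx_poly_seq_scheme>
</PDBx:datablock>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def seq_scheme_maps(xml_text):
    """Hypothetical helper: return (seq_to_resid, resid_to_seq) dictionaries.

    seq_to_resid maps (chain, seq_id) -> (residue number, insertion code)
    resid_to_seq maps (chain, residue number, insertion code) -> seq_id
    """
    seq_to_resid = {}
    resid_to_seq = {}
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "pdbx_poly_seq_scheme":
            continue
        # Collect child element text keyed by local tag name.
        fields = {local_name(c.tag): (c.text or "").strip() for c in elem}
        if not fields.get("auth_seq_num"):
            continue  # assumption: nothing to map without a residue number
        chain = fields.get("pdb_strand_id") or elem.get("asym_id")
        seq_id = int(elem.get("seq_id"))
        resid = (int(fields["auth_seq_num"]), fields.get("pdb_ins_code", ""))
        seq_to_resid[(chain, seq_id)] = resid
        resid_to_seq[(chain,) + resid] = seq_id
    return seq_to_resid, resid_to_seq


seq_to_resid, resid_to_seq = seq_scheme_maps(SAMPLE)
print(seq_to_resid[("A", 3)])        # (145, '')
print(resid_to_seq[("A", 145, "")])  # 3
```

Keeping the insertion code in the key matters because residue numbers alone
are not guaranteed to be unique within a chain.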

> In any case, having a function that provides this mapping (both directions) in
> BioPython would be extremely useful.

Maybe something for the GSoC project TODO list? ;)

Peter



