[BioPython] Extracting residue list from PDB

Fri Oct 28 12:07:53 EDT 2005

Hi Gad,

The chains seem to be of the wrong length because the sequences are 
generated from the ATOM records rather than the SEQRES records. The two 
disagree often enough in PDB files, for example when residues are 
unresolved in the structure and so have no coordinates.
I believe this problem does not exist in mmCIF.
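
To see where the two record types disagree for a given file, here is a 
minimal sketch in plain Python (no Biopython needed). It assumes 
standard fixed-column PDB records. The function names 
(seqres_sequences, atom_sequences, compare) and the residue table are 
made up for illustration and are not part of Bio.PDB. It reads only the 
ATOM records of the first model, so modified residues stored as HETATM 
(e.g. MSE) are not counted.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def _to_one_letter(residues):
    # Unknown or non-standard residue names become "X".
    return "".join(THREE_TO_ONE.get(name, "X") for name in residues)


def seqres_sequences(pdb_lines):
    """Return {chain_id: sequence} as declared in the SEQRES records."""
    chains = {}
    for line in pdb_lines:
        if line.startswith("SEQRES"):
            chain_id = line[11]               # column 12
            residues = line[19:70].split()    # columns 20-70
            chains.setdefault(chain_id, []).extend(residues)
    return {c: _to_one_letter(r) for c, r in chains.items()}


def atom_sequences(pdb_lines):
    """Return {chain_id: sequence} of residues that have ATOM records."""
    chains = {}
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break                             # first model only
        if line.startswith("ATOM"):
            chain_id = line[21]               # column 22
            key = (chain_id, line[22:27])     # residue number + insertion code
            if key not in seen:
                seen.add(key)
                chains.setdefault(chain_id, []).append(line[17:20].strip())
    return {c: _to_one_letter(r) for c, r in chains.items()}


def compare(pdb_lines):
    """Return [(chain_id, SEQRES length, ATOM length), ...] per SEQRES chain."""
    pdb_lines = list(pdb_lines)
    declared = seqres_sequences(pdb_lines)
    observed = atom_sequences(pdb_lines)
    return [(c, len(seq), len(observed.get(c, ""))) for c, seq in declared.items()]
```

Calling compare(open(filename)) on a PDB file lists, for each chain, 
the length declared in SEQRES next to the number of residues that 
actually have coordinates. Chains where the two numbers differ are the 
ones where an ATOM-based sequence will come out short.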

Using your code, I got only six chains, so I cannot comment on the 
other problem (the extra chains).

Best,

./I

Gad Abraham wrote:

>Hi,
>
>I'm trying to extract a FASTA-like list of residues from a PDB file. It
>doesn't seem to work correctly for some (e.g. 1n62, which comes out as
>10 chains while it only has 6, and chain lengths are wrong too).
>
>I'm using the following script based on the Structural Biopython FAQ:
>
>#!/usr/bin/python
>
>from Bio.PDB import *
>import sys
>
>parser = PDBParser()
>structure = parser.get_structure(sys.argv[1], sys.argv[1])
>
>ppb = PPBuilder()
>for pp in ppb.build_peptides(structure):
>   print len(pp),pp.get_sequence().tostring()
>   print 
>
>
>Any tips would be appreciated.
>
>Thanks,
>Gad
>  
>

-- 
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://ffas.ljcrf.edu/~iddo