[Biopython] writing a PDB----PLEASEEEE HELP
Nathan Edwards
nje5 at georgetown.edu
Wed Oct 27 14:27:56 UTC 2010
I needed (well decided that I wanted) to figure this out for the Bio.PDB
lecture in my Python/Bioinformatics course a couple of weeks ago.
import Bio.PDB
parser = Bio.PDB.PDBParser()
structure1 = parser.get_structure("2WFI","2WFI.pdb")
structure2 = parser.get_structure("2GW2","2GW2.pdb")
# Quickly/easily extract amino-acid residue chains
ppb=Bio.PDB.PPBuilder()
# Figure out how the query and subject peptides correspond...
# Done manually based on sequence alignment here, but non-trivial in
# general.
# query has an extra residue at the front
# subject has two extra residues at the back
# Assume first peptide is the right one in each case,
# (presumes) single chain models.
query = ppb.build_peptides(structure1)[0][1:]
subject = ppb.build_peptides(structure2)[0][:-2]
# Get C alpha atoms for each
qatoms = [ r['CA'] for r in query ]
satoms = [ r['CA'] for r in subject ]
# Superimpose...
superimposer = Bio.PDB.Superimposer()
superimposer.set_atoms(qatoms, satoms)
print "Query and subject superimposed, RMS:", superimposer.rms
# Apply transformation to structure2
superimposer.apply(structure2.get_atoms())
# Magic - write two structures to one file as two models
# This PDBIO usage is not documented, but can be found
# embedded in Bio.PDB.PDBIO.py
#
outfile=open("out.pdb", "w")
io=Bio.PDB.PDBIO(1)
io.set_structure(structure1)
io.save(outfile)
io.set_structure(structure2)
io.save(outfile, write_end=1)
outfile.close()
I wanted to do this in order to visualize multiple proteins's alignments
in PyMol, which I do not know well. Perhaps there is a PyMol specific
way to show two PDB files (pre-aligned or aligned by PyMol) to visualize
residues where two chains' structures do not line up.
It is possible to use the above approach, but it is necessary to set a
special PyMol attribute to see both models.
Load in the out.pdb file in PyMol, select menu item: Setting -> "Edit
All..." and set "all_states" to "on" (second from the top).
Hope this helps,
- n
--
Dr. Nathan Edwards nje5 at georgetown.edu
Department of Biochemistry and Molecular & Cellular Biology
Georgetown University Medical Center
Room 1215, Harris Building Room 347, Basic Science
3300 Whitehaven St, NW 3900 Reservoir Road, NW
Washington DC 20007 Washington DC 20007
Phone: 202-687-7042 Phone: 202-687-1618
Fax: 202-687-0057 Fax: 202-687-7186
More information about the Biopython
mailing list