[Biopython-dev] [Bug 2910] Bio.PDB build_peptides sometimes gives shorter peptide sequences than expected
bugzilla-daemon at portal.open-bio.org
Thu Sep 10 12:55:03 UTC 2009
http://bugzilla.open-bio.org/show_bug.cgi?id=2910
biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal
            Summary|Parsing some pdb files      |Bio.PDB build_peptides
                   |results in shorter peptide  |sometimes gives shorter
                   |sequences than expected     |peptide sequences than
                   |                            |expected
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2009-09-10 08:55 EST -------
Retitled as this appears to be a bug in the PPBuilder build_peptides method,
not the PDB parser, see:
http://lists.open-bio.org/pipermail/biopython/2009-September/005532.html
Test script:
    # Python 2 script (print statements)
    from Bio.PDB.PDBParser import PDBParser
    from Bio.PDB.Polypeptide import PPBuilder, to_one_letter_code

    parser = PDBParser()
    ppb = PPBuilder()
    #structure = parser.get_structure('tmp', '1A2D.pdb')
    structure = parser.get_structure('tmp', '13GS.pdb')
    for model in structure :
        polypeptides = ppb.build_peptides(model)
        # The zip below assumes exactly one peptide per chain
        assert len(model) == len(polypeptides)
        for chain, pep in zip(model, polypeptides) :
            print
            print "Chain", chain.id
            print "Raw chain:"
            # One letter per residue that has a C-alpha atom, X if unknown
            print "".join(to_one_letter_code.get(res.resname,"X") \
                          for res in chain if "CA" in res.child_dict)
            print "From peptide builder:"
            print pep.get_sequence()
Output for 1A2D:
PDBConstructionWarning: WARNING: Chain A is discontinuous at line 2426.
PDBConstructionWarning: WARNING: Chain B is discontinuous at line 2427.
PDBConstructionWarning: WARNING: Chain A is discontinuous at line 2428.
PDBConstructionWarning: WARNING: Chain B is discontinuous at line 2448.
Chain A
Raw chain:
CDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDLVTIRSESTFKNTEISFKLGVEFDEITADDRKVKSIITLDGGALVQVQKWDGKSTTIKRKRDGDKLVVEXVMKGVTSTRVYERA
From peptide builder:
CDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDLVTIRSESTFKNTEISFKLGVEFDEITADDRKVKSIITLDGGALVQVQKWDGKSTTIKRKRDGDKLVVEXMKGVTSTRVYERA
Chain B
Raw chain:
CDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDLVTIRSESTFKNTEISFKLGVEFDEITADDRKVKSIITLDGGALVQVQKWDGKSTTIKRKRDGDKLVVEXVMKGVTSTRVYERA
From peptide builder:
CDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDLVTIRSESTFKNTEISFKLGVEFDEITADDRKVKSIITLDGGALVQVQKWDGKSTTIKRKRDGDKLVVEXMKGVTSTRVYERA
Notice there are discontinuities in both chains A and B, and each peptide is
missing one residue: the V immediately after the X (raw ...VVEXVMKG...,
peptide ...VVEXMKG...).
And the output from 13GS:
PDBConstructionWarning: WARNING: Chain A is discontinuous at line 3760.
PDBConstructionWarning: WARNING: Chain B is discontinuous at line 3812.
PDBConstructionWarning: WARNING: Chain A is discontinuous at line 3852.
PDBConstructionWarning: WARNING: Chain B is discontinuous at line 3948.
PDBConstructionWarning: WARNING: Chain C is discontinuous at line 4033.
Chain A
Raw chain:
MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
From peptide builder:
MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
Chain B
Raw chain:
PYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
From peptide builder:
PYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
Chain C
Raw chain:
ECG
From peptide builder:
CG
Chain D
Raw chain:
ECG
From peptide builder:
CG
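Notice there are discontinuity warnings for chains A, B and C, but the missing
residues are in the peptides for chains C and D, which both lose their leading
E. Since chain D loses a residue without any discontinuity warning, and chains
A and B have warnings but lose nothing, the discontinuities do not appear to be
required to trigger the problem. Also there are no HETATM residues for chains C
and D.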
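
One way to pinpoint exactly which residues go missing is to diff the raw chain
sequence against the sequence from the peptide builder. The following is a
minimal sketch using only the standard library's difflib. The helper name
dropped_residues is hypothetical (it is not part of Bio.PDB), and the sketch is
written for Python 3 rather than the Python 2 of the test script above.

```python
from difflib import SequenceMatcher


def dropped_residues(raw, built):
    """Return (index, residues) pairs found in raw but not in built.

    Hypothetical helper, not part of Bio.PDB. The index is the 0-based
    position in the raw sequence string, not a PDB residue number.
    """
    matcher = SequenceMatcher(None, raw, built, autojunk=False)
    dropped = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "delete": raw[i1:i2] is absent from built
        # "replace": raw[i1:i2] was substituted by different residues
        if tag in ("delete", "replace"):
            dropped.append((i1, raw[i1:i2]))
    return dropped


# Chain C of 13GS, copied from the output above
print(dropped_residues("ECG", "CG"))
# C-terminal end of 1A2D chain A, copied from the output above
print(dropped_residues("KLVVEXVMKGVTSTRVYERA", "KLVVEXMKGVTSTRVYERA"))
```

The reported index is a position in the sequence string, so it would still need
to be mapped back to the corresponding residue in the chain to see what is
special about it.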