[Biopython-dev] [Bug 3096] PPBuilder build_peptides bugs
bugzilla-daemon at portal.open-bio.org
Fri Aug 13 18:23:24 EDT 2010
http://bugzilla.open-bio.org/show_bug.cgi?id=3096
skong at zymeworks.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|Not Applicable              |1.53
------- Comment #3 from skong at zymeworks.com 2010-08-13 18:23 EST -------
Hi Peter,
I managed to reproduce the problem without modifying _accept().
DIAGNOSTIC SCRIPT:
from Bio.PDB.PDBParser import PDBParser
from Bio.PDB.Polypeptide import PPBuilder, is_aa

def extract_peptides(model):
    """Extract the peptides from a model.

    Returns a list of peptide sequences (strings).
    """
    output = []
    for peptide in PPBuilder().build_peptides(model):
        seq = str(peptide.get_sequence())
        output.append(seq)
    return output

if __name__ == '__main__':
    pdb = open('chopped_pdb1bfe_noca.ent')
    st = PDBParser().get_structure('', pdb)
    seqa = extract_peptides(st)
    # print() with a single argument works on both Python 2 and Python 3
    print('no ca seq all')
    print(seqa)
PDB FILE: chopped_pdb1bfe_noca.ent
ATOM     85  N   ILE A 316      37.386  71.217  31.070  1.00 36.97           N
ATOM     86  CA  ILE A 316      38.311  71.290  29.949  1.00 33.71           C
ATOM     87  C   ILE A 316      37.634  72.103  28.862  1.00 33.93           C
ATOM     88  O   ILE A 316      36.415  72.216  28.839  1.00 36.46           O
ATOM     89  CB  ILE A 316      38.651  69.876  29.404  1.00 35.79           C
ATOM     90  CG1 ILE A 316      39.331  69.049  30.501  1.00 36.78           C
ATOM     91  CG2 ILE A 316      39.572  69.979  28.187  1.00 37.71           C
ATOM     92  CD1 ILE A 316      39.881  67.724  30.023  1.00 39.20           C
ATOM     93  N   HIS A 317      38.425  72.679  27.969  1.00 35.61           N
ATOM     94  CA  HIS A 317      37.880  73.473  26.881  1.00 37.92           C
ATOM     95  C   HIS A 317      38.360  72.928  25.540  1.00 37.79           C
ATOM     96  O   HIS A 317      39.463  73.240  25.094  1.00 37.44           O
ATOM     97  CB  HIS A 317      38.303  74.930  27.052  1.00 35.19           C
ATOM     98  CG  HIS A 317      37.888  75.519  28.363  1.00 35.76           C
ATOM     99  ND1 HIS A 317      36.611  75.981  28.602  1.00 37.74           N
ATOM    100  CD2 HIS A 317      38.575  75.701  29.516  1.00 37.59           C
ATOM    101  CE1 HIS A 317      36.529  76.420  29.844  1.00 38.74           C
ATOM    102  NE2 HIS A 317      37.706  76.262  30.421  1.00 36.76           N
ATOM    103  N   ARG A 318      37.527  72.109  24.905  1.00 38.78           N
ATOM    104  CA  ARG A 318      37.884  71.512  23.627  1.00 42.04           C
ATOM    105  C   ARG A 318      38.469  72.559  22.699  1.00 45.14           C
ATOM    106  O   ARG A 318      39.592  72.425  22.205  1.00 42.05           O
ATOM    107  CB  ARG A 318      36.657  70.880  22.967  1.00 42.93           C
ATOM    108  CG  ARG A 318      36.934  70.321  21.576  1.00 38.60           C
ATOM    109  CD  ARG A 318      35.654  70.038  20.821  1.00 35.39           C
ATOM    110  NE  ARG A 318      34.624  69.538  21.724  1.00 34.96           N
ATOM    111  CZ  ARG A 318      34.539  68.278  22.141  1.00 31.51           C
ATOM    112  NH1 ARG A 318      35.419  67.373  21.736  1.00 25.19           N
ATOM    113  NH2 ARG A 318      33.579  67.929  22.983  1.00 29.10           N
ATOM    114  N   XLY A 319      37.690  73.604  22.461  1.00 49.96           N
ATOM    115  CX  XLY A 319      38.138  74.668  21.592  1.00 55.53           C
ATOM    116  C   XLY A 319      38.459  74.219  20.180  1.00 58.85           C
ATOM    117  O   XLY A 319      37.583  73.766  19.440  1.00 58.98           O
ATOM    118  N   SER A 320      39.734  74.334  19.823  1.00 61.64           N
ATOM    119  CA  SER A 320      40.219  73.992  18.493  1.00 63.16           C
ATOM    120  C   SER A 320      40.212  72.517  18.110  1.00 65.27           C
ATOM    121  O   SER A 320      39.558  72.127  17.145  1.00 65.12           O
ATOM    122  CB  SER A 320      41.634  74.542  18.316  1.00 65.36           C
ATOM    123  OG  SER A 320      42.124  74.255  17.019  1.00 72.05           O
ATOM    124  N   THR A 321      40.955  71.702  18.853  1.00 67.43           N
ATOM    125  CA  THR A 321      41.049  70.274  18.562  1.00 67.73           C
ATOM    126  C   THR A 321      40.220  69.430  19.529  1.00 66.41           C
ATOM    127  O   THR A 321      39.244  69.917  20.095  1.00 70.21           O
ATOM    128  CB  THR A 321      42.517  69.810  18.620  1.00 70.22           C
ATOM    129  OG1 THR A 321      42.613  68.453  18.169  1.00 77.03           O
ATOM    130  CG2 THR A 321      43.049  69.915  20.045  1.00 72.07           C
ATOM    131  N   GLY A 322      40.608  68.168  19.707  1.00 61.22           N
ATOM    132  CA  GLY A 322      39.892  67.286  20.614  1.00 53.23           C
ATOM    133  C   GLY A 322      40.037  67.705  22.065  1.00 48.00           C
ATOM    134  O   GLY A 322      40.138  68.892  22.372  1.00 50.41           O
ATOM    135  N   LEU A 323      40.044  66.734  22.968  1.00 41.92           N
ATOM    136  CA  LEU A 323      40.190  67.033  24.385  1.00 35.58           C
ATOM    137  C   LEU A 323      41.613  66.738  24.874  1.00 31.41           C
ATOM    138  O   LEU A 323      41.932  66.921  26.046  1.00 30.47           O
ATOM    139  CB  LEU A 323      39.160  66.240  25.191  1.00 35.76           C
ATOM    140  CG  LEU A 323      37.716  66.576  24.802  1.00 39.50           C
ATOM    141  CD1 LEU A 323      36.733  65.796  25.670  1.00 38.15           C
ATOM    142  CD2 LEU A 323      37.493  68.074  24.955  1.00 38.58           C
The output peptides should be ['IHR', 'STGL'], not ['IHRXTGL'] as in the
current version. Residue XLY A 319 (the X in the fourth position) should not
be included, since it has no CA atom. Instead, the current version includes it
and drops the 'S' next to it, due to the same bug. The patch provided earlier
gives the correct result.
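For illustration only, the expected splitting can be sketched in plain Python
without Biopython. This is a toy model, not the Bio.PDB implementation: the
function name split_on_missing_ca, the (resname, atom names) tuples and the
small three-letter table are all made up for this example, and any
connectivity or distance check a real builder performs is ignored. It only
shows the rule argued for here: a residue without a CA atom is left out and
ends the current peptide.

```python
# Toy model of the expected behaviour; names and data layout are hypothetical.
THREE_TO_ONE = {
    "ILE": "I", "HIS": "H", "ARG": "R",
    "SER": "S", "THR": "T", "GLY": "G", "LEU": "L",
}


def split_on_missing_ca(residues):
    """Split residues into peptide strings, breaking at residues without CA.

    residues is a list of (resname, set_of_atom_names) tuples.
    """
    peptides = []
    current = []
    for resname, atom_names in residues:
        if "CA" in atom_names:
            # Unknown residue names map to 'X', but only if they have a CA.
            current.append(THREE_TO_ONE.get(resname, "X"))
        elif current:
            # No CA: skip the residue and close the peptide built so far.
            peptides.append("".join(current))
            current = []
    if current:
        peptides.append("".join(current))
    return peptides


# Backbone atom names taken from the PDB fragment in this report (residues 316-323).
backbone = {"N", "CA", "C", "O"}
residues = [
    ("ILE", backbone), ("HIS", backbone), ("ARG", backbone),
    ("XLY", {"N", "CX", "C", "O"}),  # residue 319 has CX instead of CA
    ("SER", backbone), ("THR", backbone), ("GLY", backbone), ("LEU", backbone),
]

print(split_on_missing_ca(residues))
```

With the fragment above, this toy function yields the two peptides described
in this comment rather than one peptide containing 'X'.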
The bug remains whether or not _accept() is modified. Also, the user should
not be expected to modify build_peptides() whenever PPBuilder._accept() is
overridden, since the accept variable in build_peptides() isn't really a
local (private) variable: at line 277 it is bound to self._accept of
PPBuilder.
http://www.biopython.org/DIST/docs/api/Bio.PDB.Polypeptide-pysrc.html
277 accept=self._accept
On a side note, the "aa_only" optional argument of build_peptides() and its
comment are very misleading (@param aa_only: if 1, the residue needs to be a
standard AA). "aa_only" actually acts as a flag that tells the peptide builder
to filter out residues that _accept() rejects. It is on by default, and unless
_accept() is overridden, any residue with a "CA" atom is accepted (lines
250-264), not only standard amino acids as the comment states. In other words,
without overriding _accept() in PPBuilder, non-standard amino acids are still
accepted and included in the built peptides. Only when the _accept() method is
overridden (as I did before) does build_peptides() exclude non-standard amino
acids. I suggest renaming "aa_only" to something more sensible, like
"filter_aa".
http://www.biopython.org/DIST/docs/api/Bio.PDB.Polypeptide-pysrc.html
266 def build_peptides(self, entity, aa_only=1):
273 @param aa_only: if 1, the residue needs to be a standard AA
274 @type aa_only: int
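To make the point about "aa_only" concrete, here is a small stand-alone sketch.
It does not use or reproduce Biopython code: ToyBuilder, StandardOnlyBuilder,
the build() method and the residue name "XAA" are hypothetical, and the default
acceptance rule is simply the one described in this comment (any residue with a
CA atom passes). It shows the flag acting as a filter switch, and an overridden
_accept() being picked up through self._accept.

```python
# Hypothetical stand-ins for illustration; this is not Bio.PDB code.
STANDARD = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


class ToyBuilder(object):
    def _accept(self, resname, atom_names):
        # Default rule as described in this report: a CA atom is enough.
        return "CA" in atom_names

    def build(self, residues, aa_only=True):
        # Looked up on self, so a subclass override is used here as well.
        accept = self._accept
        kept = []
        for resname, atom_names in residues:
            if aa_only and not accept(resname, atom_names):
                continue  # filtered out only when the flag is on
            kept.append(resname)
        return kept


class StandardOnlyBuilder(ToyBuilder):
    def _accept(self, resname, atom_names):
        # Stricter rule: standard residue name and a CA atom.
        return resname in STANDARD and "CA" in atom_names


backbone = {"N", "CA", "C", "O"}
residues = [("ARG", backbone), ("XAA", backbone), ("SER", backbone)]

default_kept = ToyBuilder().build(residues)
strict_kept = StandardOnlyBuilder().build(residues)
unfiltered = StandardOnlyBuilder().build(residues, aa_only=False)
print(default_kept, strict_kept, unfiltered)
```

In this sketch the default builder keeps the made-up non-standard residue
"XAA" even with aa_only on, which is why a name such as "filter_aa" would
describe the flag better.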