[Biopython-dev] [Bug 2910] Bio.PDB build_peptides sometimes gives shorter peptide sequences than expected

Thu Sep 24 11:14:37 UTC 2009

http://bugzilla.open-bio.org/show_bug.cgi?id=2910

------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk  2009-09-24 07:14 EST -------
(In reply to comment #4)
> Peter,
> 
> yes, indeed, I had a couple of problematic pdb ids. As soon as I find the time,
> I'll take a look at it and post them here. It's easy to do this. What I did is,
> I parsed the structures through the dssp structure assignment tool and compared
> the obtained sequence with that obtained from the Bio.PDB parser. Background: I
> wanted to map the sequence that dssp sees to atomic coordinates.
> 

If you can give us some more examples that would be very helpful, thank you.

I have committed a partial fix which means any unknown modified amino acids
(based on the presence of an alpha carbon) will be treated as an amino
acid for building the peptide (and given the default sequence letter of X).
This will also issue a warning. Any such previously unknown modified amino
acid (like PYX) needs to be added to our hard coded lookup table with the
appropriate single letter symbol as used by the PDB in their FASTA files
(in this case, PYX -> C for cysteine).
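
As a rough illustration of the behaviour described above, here is a minimal
sketch. It is not the actual Bio.PDB code: the names KNOWN_RESIDUES and
residue_letter are made up for this example, the table is deliberately tiny,
and residues are reduced to a name plus a set of atom names.

```python
import warnings

# Hypothetical, heavily truncated lookup table (three-letter residue name to
# one-letter symbol). The real hard coded table is much larger.
KNOWN_RESIDUES = {"ALA": "A", "CYS": "C", "GLY": "G"}


def residue_letter(resname, atom_names, table=KNOWN_RESIDUES):
    """Return a one-letter symbol for a residue, or None if it should not
    be treated as an amino acid when building a peptide.

    Assumption for this sketch: a residue "looks like" an amino acid if it
    has an atom named CA. This is a simplification (for example, a calcium
    ion can also carry the atom name CA).
    """
    if resname in table:
        return table[resname]
    if "CA" in atom_names:
        # Unknown residue with an alpha carbon: keep it in the peptide,
        # use the default letter X, and tell the user about it.
        warnings.warn(
            "Assuming residue %s is an unknown modified amino acid" % resname
        )
        return "X"
    return None
```

With this sketch, PYX comes back as X (plus a warning) until it is added to
the table, e.g. by passing table=dict(KNOWN_RESIDUES, PYX="C"), after which
it comes back as C. A water residue with only an O atom gives None.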

I suspect that some of your other problem PDB files still have (currently)
undefined modified amino acids in them...

Peter
