[Biopython] Understanding pdb biopython

João Rodrigues anaryin at gmail.com
Sun Oct 26 20:44:49 UTC 2014


Hi Sanjeev,

To check for breaks, as I told you, iterate over the amino acids and for each
consecutive pair (e.g. residues 1 and 2), measure the distance between the "C"
atom of residue 1 and the "N" atom of residue 2. This distance is very well
defined (the peptide bond, roughly 1.3 Å). Alternatively, and more simply,
check CA-CA distances (e.g. >4 Å usually means a gap).
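
As a rough sketch of the CA-CA check (not a definitive implementation): the
function names find_breaks and report_breaks are made up for illustration, and
the 4.0 Å cutoff is just the rule of thumb above, so tune it for your data. It
relies on Bio.PDB atoms returning their distance when subtracted.

```python
def find_breaks(chain, max_ca_dist=4.0):
    """Return (prev_resseq, next_resseq, distance) for each suspicious CA-CA gap.

    `chain` is anything that iterates over Bio.PDB-style residues.
    The 4.0 A default is a rule-of-thumb assumption, not a hard limit.
    """
    # Keep only standard residues (blank hetero flag) that have a CA atom.
    residues = [r for r in chain if r.id[0] == " " and "CA" in r]
    breaks = []
    for prev, curr in zip(residues, residues[1:]):
        # Subtracting two Bio.PDB atoms gives their distance in Angstrom.
        dist = prev["CA"] - curr["CA"]
        if dist > max_ca_dist:
            breaks.append((prev.id[1], curr.id[1], dist))
    return breaks


def report_breaks(pdb_path):
    """Print every suspected break in the first model of a PDB file."""
    # Imported here so find_breaks itself has no hard Biopython dependency.
    from Bio.PDB import PDBParser

    structure = PDBParser(QUIET=True).get_structure("X", pdb_path)
    for chain in structure[0]:
        for start, end, dist in find_breaks(chain):
            print(f"chain {chain.id!r}: gap between {start} and {end} ({dist:.1f} A)")
```

Calling report_breaks("3OE6.pdb") in a loop over your files should list the
residue numbers on either side of each gap. The same loop works for the C-N
check if you compare prev["C"] with curr["N"] and use a smaller cutoff.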

Sometimes a chain has no chain identifier assigned. For those PDB files, check
column 22 of the ATOM records: it holds the chain ID, and when it is blank the
chain id comes out as a space.
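
A quick way to confirm this without any parser is to look at the raw records.
The sketch below is only an illustration (count_blank_chain_ids is a
hypothetical helper name); it assumes standard fixed-column PDB formatting,
where column 22 is index 21 in a Python string.

```python
def count_blank_chain_ids(lines):
    """Count ATOM records whose chain ID column (column 22) is blank.

    Assumes fixed-column PDB records; lines too short to reach
    column 22 are counted as blank as well.
    """
    count = 0
    for line in lines:
        # Column 22 (1-based) is index 21 (0-based).
        if line.startswith("ATOM") and line[21:22].strip() == "":
            count += 1
    return count


def count_blank_chain_ids_in_file(pdb_path):
    """Convenience wrapper that reads the records from a file on disk."""
    with open(pdb_path) as handle:
        return count_blank_chain_ids(handle)
```

If this returns a non-zero number for a file, that file is where the " " chain
in your output comes from.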

Cheers,

João





2014-10-26 11:31 GMT-05:00 Sanjeev Sariya <s.sariya_work at ymail.com>:

>
> Hi Joao,
> Thank you for your response.
> If not all residues are resolved in the crystal, then extracting the
> sequence from the pdb file wouldn't be a good call.
>
> I will be working with a lot of files [~100s or 1000s] in the near future.
> Is there any way I can find the breaks in my pdb files?
>
> - Another doubt I have: while printing the chain ids in the script, I often
> get chain " ", that is, a space.
> In the script I sent, the code looks like:
>
>         from Bio.PDB import PDBParser
>         # i is the path to a PDB file
>         st = PDBParser(QUIET=True).get_structure('X', i)
>         for chain in st.get_chains():
>             print(chain.id)
>
> Why is a space present as the chain name?
>
> Thanks.
>
>   On Saturday, October 25, 2014 12:32 AM, João Rodrigues <
> anaryin at gmail.com> wrote:
>
>
> Hi there,
>
> The numbering in your PDB file is not continuous, and the jumps correspond
> to regions of the structure that are missing residues. Open your PDB
> structure in PyMOL and you'll see. Alternatively, print the C-N distances
> (peptide bond) for consecutive residues: wherever they are larger than
> ~3 Å, that corresponds to a break.
>
> As for your discrepancy between the sequences in the FASTA file and the
> PDB, that's just because not all residues are resolved in the crystal
> structure.
>
> Cheers,
>
> João
>
> 2014-10-24 13:10 GMT-05:00 Sanjeev Sariya <s.sariya_work at ymail.com>:
>
> Hi All,
> I'm having a hard time using and understanding biopython pdb.
> ./read_pdb_file.py 3OE6.pdb
>
> I'm attaching the python script, pdb file, fasta file and output with this
> mail. I have the following doubts:
> - When I print the sequence I get it in broken pieces. Why?
> - Also, the sequence printed doesn't match the fasta file (attached).
> - Am I making a silly mistake?
>
> I am running script as:
> python read_pdb_file.py 3OE6.pdb
>
> Kindly help and guide.
>
>