[Biopython] BioPython MMCIFParser.py chain.id

Riccardo mitma07 at gmail.com
Mon Jan 26 20:32:06 UTC 2015


Ok, reading the MMCIFParser.py file, I found out that "len(resseq_list)" is
equivalent to "len(chain.get_list())".
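
As a cross-check that does not depend on the parser, here is a minimal
sketch that counts residues per chain straight from the per-atom columns.
It assumes a dictionary of column lists keyed by mmCIF item names (the shape
I would expect from Bio.PDB.MMCIF2Dict); the function name and the toy data
are made up for illustration and are not part of BioPython.

```python
def count_residues_per_chain(atom_site, chain_key="_atom_site.auth_asym_id"):
    """Count distinct residues per chain from per-atom mmCIF columns.

    atom_site is assumed to map mmCIF item names to equal-length lists,
    one entry per atom row.
    """
    chains = atom_site[chain_key]
    seq_ids = atom_site["_atom_site.auth_seq_id"]
    ins_codes = atom_site["_atom_site.pdbx_PDB_ins_code"]

    residues = {}
    for chain_id, seq_id, ins_code in zip(chains, seq_ids, ins_codes):
        # A residue is identified here by its number plus insertion code.
        residues.setdefault(chain_id, set()).add((seq_id, ins_code))
    return {chain_id: len(members) for chain_id, members in residues.items()}


# Toy data (hypothetical): two atoms of residue 1 and one atom of residue 2
# in chain A, plus one atom of residue 1 in chain B.
toy_atom_site = {
    "_atom_site.auth_asym_id": ["A", "A", "A", "B"],
    "_atom_site.auth_seq_id": ["1", "1", "2", "1"],
    "_atom_site.pdbx_PDB_ins_code": ["?", "?", "?", "?"],
}
print(count_residues_per_chain(toy_atom_site))
```

Like iterating over a chain, this count includes hetero residues and waters
that carry the same chain ID, so it is not the same as the polymer length.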

What I still need to understand is how, when reading a CIF file, to get the
same chain IDs as in the corresponding PDB file, that is, "auth_asym_id"
instead of "label_asym_id". Is there a built-in option for this in BioPython?
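
To make the relationship between the two IDs concrete, here is a minimal
sketch that builds a label-to-author chain ID table from the per-atom
columns. It again assumes a dictionary of column lists keyed by mmCIF item
names; the function name and the toy data are hypothetical.

```python
def label_to_auth_chain_ids(atom_site):
    """Map each label_asym_id to the auth_asym_id seen on its first atom row."""
    label_ids = atom_site["_atom_site.label_asym_id"]
    auth_ids = atom_site["_atom_site.auth_asym_id"]

    mapping = {}
    for label_id, auth_id in zip(label_ids, auth_ids):
        # Keep the first pairing encountered for each label ID.
        mapping.setdefault(label_id, auth_id)
    return mapping


# Toy data (hypothetical): a polymer, a ligand and a water that have
# separate label IDs but share the author chain ID "A".
toy_atom_site = {
    "_atom_site.label_asym_id": ["A", "A", "B", "C"],
    "_atom_site.auth_asym_id": ["A", "A", "A", "A"],
}
print(label_to_auth_chain_ids(toy_atom_site))
```

In deposited files several label IDs often share one author ID, as in the
toy data, so the two columns are not simply interchangeable.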

Thanks,
Riccardo


*X3D PyMOL Molecule Viewer (WebGL-powered) <http://chembioscripting.hol.es>*

*ChemBioScripting | Gioacchino Riccardo Volpe*

2015-01-26 19:36 GMT+01:00 Riccardo <mitma07 at gmail.com>:

> Hello to the BioPython mailing-list,
> I'm using BioPython to calculate the dihedral angles in a protein together
> with the total number of residues for each chain; I made use of this
> construct for the total number of residues:
>
>     resseq_list = []
>     for residue in chain:
>         # full_id is (structure_id, model_id, chain_id, (hetfield, resseq, icode))
>         resseq = residue.get_full_id()[3][1]
>         resseq_list.append(resseq)
>     print("\nThe first residue of chain %s is %s" % (chain.id, resseq_list[0]))
>     print("The last residue of chain %s is %s" % (chain.id, resseq_list[-1]))
>     print("The total number of residues in chain %s is %s\n" % (chain.id, len(resseq_list)))
>
> but the IDs for the chains differ from those shown, for example, in PyMOL.
>
> Trying to figure out the cause, and comparing a PDB file with a CIF file
> for the same macromolecule, I realized that it lies in the
> "_atom_site.label_asym_id" and "_atom_site.auth_asym_id" items of the CIF
> file, which correspond to columns [27:28] and [88:89] in the ATOM rows.
>
> Reading the OpenStructure documentation
> <http://www.openstructure.org/docs/1.3/io/mmcif/>, and in particular
> "AddMMCifPDBChainTr (cif_chain_id, pdb_chain_id)", I suspected that the
> BioPython CIF parser uses "label_asym_id" instead of "auth_asym_id". So I
> opened the file MMCIFParser.py, and indeed I found, at line 37:
>
>     chain_id_list = mmcif_dict["_atom_site.label_asym_id"]
>
> I tried to replace it with:
>
>     chain_id_list = mmcif_dict["_atom_site.auth_asym_id"]
>
> After rerunning my script, the output matched the one reported by PyMOL
> for some test CIF files, but not for all.
>
> Is there an option in BioPython that produces the output directly in that
> format? If not, might it be a good idea to implement one, along the lines of
> that web page <http://www.openstructure.org/docs/1.3/io/mmcif/>?
> Is there also a better way than mine to get the total number of residues
> for each chain?
>
> Thanks a lot, and many greetings to the BioPython mailing-list: this is my
> first time here!
>
> Riccardo Volpe