[Biopython] BioPython MMCIFParser.py chain.id
Riccardo
mitma07 at gmail.com
Wed Jan 28 12:48:37 UTC 2015
I solved this by parsing the CIF file to get both chain IDs
("label_asym_id" and "auth_asym_id") and saving them in two separate
variables.
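
Roughly, that could look like the sketch below. It assumes the two chain IDs are the "_atom_site.label_asym_id" and "_atom_site.auth_asym_id" columns discussed in the quoted messages, and that the CIF file is read with Bio.PDB.MMCIF2Dict. The helper names chain_id_mapping and load_chain_id_mapping are made up for illustration, and this is not necessarily the exact script used:

```python
def chain_id_mapping(mmcif_dict):
    """Map each label_asym_id to the auth_asym_id seen on the same atom row.

    `mmcif_dict` is assumed to behave like the dictionary produced by
    Bio.PDB.MMCIF2Dict: each _atom_site key holds one value per atom.
    """
    label_ids = mmcif_dict["_atom_site.label_asym_id"]
    auth_ids = mmcif_dict["_atom_site.auth_asym_id"]
    mapping = {}
    for label_id, auth_id in zip(label_ids, auth_ids):
        # Keep the first author ID found for each label ID.
        mapping.setdefault(label_id, auth_id)
    return mapping


def load_chain_id_mapping(cif_path):
    """Read a CIF file with Biopython and return the label -> auth mapping."""
    # Imported here so the pure-Python helper above works without Biopython.
    from Bio.PDB.MMCIF2Dict import MMCIF2Dict

    return chain_id_mapping(MMCIF2Dict(cif_path))


# Hypothetical usage ("example.cif" is a placeholder file name):
# mapping = load_chain_id_mapping("example.cif")
# for label_id, auth_id in mapping.items():
#     print("label_asym_id %s -> auth_asym_id %s" % (label_id, auth_id))
```

Several label_asym_id values can map to the same auth_asym_id, for instance when ligands or waters get their own label ID but share the author chain ID of the protein. That may be why a plain one-line substitution in the parser does not match PyMOL for every file.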
*X3D PyMOL Molecule Viewer (WebGL-powered) <http://chembioscripting.hol.es>*
*ChemBioScripting | Gioacchino Riccardo Volpe*
2015-01-26 21:32 GMT+01:00 Riccardo <mitma07 at gmail.com>:
> Ok, reading the MMCIFParser.py file, I found out that "len(resseq_list)"
> is equivalent to "len(chain.get_list())".
>
> What remains to be understood is how, with a CIF file, to get the same
> chain IDs as those used in the PDB file, that is, to get "auth_asym_id"
> instead of "label_asym_id": is there a built-in option in BioPython?
>
> Thanks,
> Riccardo
>
> 2015-01-26 19:36 GMT+01:00 Riccardo <mitma07 at gmail.com>:
>
>> Hello to the BioPython mailing-list,
>> I'm using BioPython to calculate the dihedral angles in a protein
>> together with the total number of residues for each chain; I made use of
>> this construct for the total number of residues:
>>
>>     resseq_list = []
>>     for residue in chain:
>>         # print(residue)
>>         residue_full_id = residue.get_full_id()
>>         # print(residue_full_id)
>>         resseq = residue_full_id[3][1]  # residue sequence number
>>         # print(resseq)
>>         resseq_list.append(resseq)
>>         # print(resseq_list)
>>     print("\nThe first residue of chain %s is %s"
>>           % (chain.id, resseq_list[0]))
>>     print("The last residue of chain %s is %s"
>>           % (chain.id, resseq_list[-1]))
>>     print("The total number of residues in chain %s is %s\n"
>>           % (chain.id, len(resseq_list)))
>>
>> but the IDs for the chains differ from those shown, for example, in PyMOL.
>>
>> Trying to figure out the cause, I compared a PDB file with a CIF file for
>> the same macromolecule and realized that it lies in the CIF data items
>> "_atom_site.label_asym_id" and "_atom_site.auth_asym_id", which
>> correspond to columns [27:28] and [88:89] in the ATOM rows of the CIF file.
>>
>> Reading the OpenStructure documentation
>> <http://www.openstructure.org/docs/1.3/io/mmcif/>, and in particular
>> "AddMMCifPDBChainTr(cif_chain_id, pdb_chain_id)", I suspected that the
>> BioPython CIF parser uses "label_asym_id" instead of "auth_asym_id".
>> So I opened the file MMCIFParser.py and indeed found, at line 37:
>>
>>     chain_id_list = mmcif_dict["_atom_site.label_asym_id"]
>>
>> I tried to replace it with:
>>
>>     chain_id_list = mmcif_dict["_atom_site.auth_asym_id"]
>>
>> After rerunning my script, the output matched the one reported by PyMOL
>> for some test CIF files, but not for all.
>>
>> Is there an option in BioPython that produces the output directly in
>> that format? If not, might it be a good idea to implement one, as seen
>> on that web page <http://www.openstructure.org/docs/1.3/io/mmcif/>?
>> Is there also a better way to get the total number of residues for
>> each chain than the one I used?
>>
>> Thanks a lot, and many greetings to the BioPython mailing-list: this is
>> my first time here!
>>
>> Riccardo Volpe
>>
>
>
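
On the residue-count question in the first message: since len(resseq_list) turned out to equal len(chain.get_list()), the per-chain summary can be sketched without building the list by hand. The function name summarize_chain is hypothetical; it only assumes an object that behaves like a Bio.PDB Chain (it has an .id attribute, iterating it yields residues, and each residue provides get_id()):

```python
def summarize_chain(chain):
    """Return (chain_id, first_resseq, last_resseq, n_residues) for one chain.

    Assumes a Bio.PDB-style Chain: iterating yields residues, and
    residue.get_id() returns (hetero_flag, resseq, insertion_code).
    """
    residues = list(chain)
    if not residues:
        # An empty chain has no first or last residue number.
        return chain.id, None, None, 0
    first_resseq = residues[0].get_id()[1]
    last_resseq = residues[-1].get_id()[1]
    return chain.id, first_resseq, last_resseq, len(residues)


# Hypothetical usage with Biopython ("example.cif" is a placeholder):
# from Bio.PDB.MMCIFParser import MMCIFParser
# structure = MMCIFParser().get_structure("example", "example.cif")
# for chain in structure.get_chains():
#     print("chain %s: residues %s to %s, %d in total"
#           % summarize_chain(chain))
```

Like the original loop, this counts every residue object in the chain, so waters and other hetero residues are included in the total.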