[Biopython] The problem of using Bio.SwissProt
De-Chang Yang
yangdc at mail.cbi.pku.edu.cn
Tue Sep 10 15:21:23 UTC 2019
Dear Biopython team,
Hi, this is Dechang Yang.
I want to look up some information in the Swiss-Prot database using Biopython, and I found the tutorial at http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc139.
But to my surprise, the KeyWList module of Bio.SwissProt seems to be out of date.
When I type: help(KeyWList)
I get the following class information:
| --------- --------------------------- ----------------------
| Line code Content                     Occurrence in an entry
| --------- --------------------------- ----------------------
| ID        Identifier (keyword)        Once; starts a keyword entry
| IC        Identifier (category)       Once; starts a category entry
| AC        Accession (KW-xxxx)         Once
| DE        Definition                  Once or more
| SY        Synonyms                    Optional; once or more
| GO        Gene ontology (GO) mapping  Optional; once or more
| HI        Hierarchy                   Optional; once or more
| WW        Relevant WWW site           Optional; once or more
| CA        Category                    Once per keyword entry; absent
|                                       in category entries
As you can see, the docstring lists a fixed set of line codes, but these are inconsistent with the latest SwissProt KeyWList file, which looks like the excerpt below (DR, CC, RX and most of the other lines would be ignored by the KeyWList module):
RP TISSUE SPECIFICITY, AND SUBCELLULAR LOCATION.
RX PubMed=24154973; DOI=10.1002/ijc.28557;
RA Peltekova V.D., Lemire M., Qazi A.M., Zaidi S.H., Trinh Q.M.,
RA Bielecki R., Rogers M., Hodgson L., Wang M., D'Souza D.J., Zandi S.,
RA Chong T., Kwan J.Y., Kozak K., De Borja R., Timms L., Rangrej J.,
RA Volar M., Chan-Seng-Yue M., Beck T., Ash C., Lee S., Wang J.,
RA Boutros P.C., Stein L.D., Dick J.E., Gryfe R., McPherson J.D.,
RA Zanke B.W., Pollett A., Gallinger S., Hudson T.J.;
RT "Identification of genes expressed by immune cells of the colon that
RT are regulated by colorectal cancer-associated variants.";
RL Int. J. Cancer 134:2330-2341(2014).
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000269|PubMed:24154973};
CC Single-pass membrane protein {ECO:0000269|PubMed:24154973}.
CC Note=Co-localizes with crystalloid granules of eosinophils and
CC granular organelles of mast cells, neutrophils, macrophages and
CC dendritic cells.
CC -!- TISSUE SPECIFICITY: Expressed in gastrointestinal and immune
CC tissue, as well as prostate, testis and ovary. Expressed in lamina
CC propria and eosinophils but not in epithelial cells. Expression is
CC greater in benign adjacent tissues than in colon tumors.
CC {ECO:0000269|PubMed:24154973}.
CC -----------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC -----------------------------------------------------------------------
DR EMBL; AK127703; -; NOT_ANNOTATED_CDS; mRNA.
DR EMBL; AP002448; -; NOT_ANNOTATED_CDS; Genomic_DNA.
DR RefSeq; NP_001289573.1; NM_001302644.1.
DR RefSeq; NP_001289574.1; NM_001302645.1.
DR RefSeq; NP_001289575.1; NM_001302646.1.
DR RefSeq; NP_001289576.1; NM_001302647.1.
DR RefSeq; NP_001289577.1; NM_001302648.1.
DR RefSeq; NP_997312.1; NM_207429.3.
DR BioMuta; HGNC:33789; -.
DR DMDM; 74711342; -.
DR PaxDb; Q6ZS62; -.
DR PRIDE; Q6ZS62; -.
DR ProteomicsDB; 68193; -.
DR GeneID; 399948; -.
DR KEGG; hsa:399948; -.
DR CTD; 399948; -.
DR DisGeNET; 399948; -.
DR GeneCards; COLCA1; -.
DR HGNC; HGNC:33789; COLCA1.
DR MIM; 615693; gene.
DR neXtProt; NX_Q6ZS62; -.
DR PharmGKB; PA164716768; -.
DR eggNOG; ENOG410JDIH; Eukaryota.
DR eggNOG; ENOG4111630; LUCA.
DR HOGENOM; HOG000111748; -.
DR InParanoid; Q6ZS62; -.
DR OrthoDB; 1566774at2759; -.
DR PhylomeDB; Q6ZS62; -.
DR TreeFam; TF354066; -.
DR ChiTaRS; COLCA1; human.
DR GenomeRNAi; 399948; -.
DR PRO; PR:Q6ZS62; -.
DR Proteomes; UP000005640; Unplaced.
DR GO; GO:0016021; C:integral component of membrane; IEA:UniProtKB-KW.
DR GO; GO:0016020; C:membrane; IDA:UniProtKB.
PE 2: Evidence at transcript level;
KW Complete proteome; Membrane; Reference proteome; Transmembrane;
KW Transmembrane helix.
FT CHAIN 1 124 Colorectal cancer-associated protein 1.
FT /FTId=PRO_0000340692.
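To make the mismatch concrete, here is a minimal sketch in plain Python (it does not use Biopython) that compares the line codes in a few lines of the excerpt above with the codes listed in the KeyWList docstring. The names `KEYWLIST_CODES`, `line_codes` and `sample` are made up for this illustration and are not part of Bio.SwissProt.

```python
# Line codes listed in the KeyWList docstring quoted earlier in this message.
KEYWLIST_CODES = {"ID", "IC", "AC", "DE", "SY", "GO", "HI", "WW", "CA"}


def line_codes(text):
    """Return the set of two-letter codes that start each non-empty line."""
    codes = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            codes.add(stripped[:2])
    return codes


# A few lines copied from the excerpt above (illustrative sample only).
sample = """\
RX PubMed=24154973; DOI=10.1002/ijc.28557;
CC -!- SUBCELLULAR LOCATION: Membrane {ECO:0000269|PubMed:24154973};
DR GeneCards; COLCA1; -.
PE 2: Evidence at transcript level;
KW Complete proteome; Membrane; Reference proteome; Transmembrane;
FT CHAIN 1 124 Colorectal cancer-associated protein 1.
"""

found = line_codes(sample)
# Codes that appear in the file but are not documented for KeyWList.
unknown = sorted(found - KEYWLIST_CODES)
print(unknown)
```

For these sample lines, none of the codes found are among those documented for KeyWList, which is the inconsistency I am describing.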
Could you please help me find out whether I have made a mistake somewhere?
Best Regards,
Dechang