[Biopython] Any CATHDB users (protein domain database)?
Peter Cock
p.j.a.cock at googlemail.com
Wed Jun 21 14:16:59 UTC 2017
Thanks Stéphane,
Any further comments on the GitHub pull request would be ideal.
https://github.com/biopython/biopython/pull/1258
Peter
On Wed, Jun 21, 2017 at 2:38 PM, Téletchéa Stéphane
<stephane.teletchea at univ-nantes.fr> wrote:
> On 21/06/2017 at 15:13, Peter Cock wrote:
>>
>> Thanks Stéphane,
>>
>> That would be much appreciated - and thank you for your patience Saket,
>>
>> Peter
>
>
> OK, so far it seems to work (small how-to below), here is my small test:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> #!/bin/bash
>
> # http://mailman.open-bio.org/pipermail/biopython-dev/2017-May/021706.html
>
> git clone https://github.com/biopython/biopython.git
> cd biopython
> git fetch origin pull/1258/head
> git checkout -b pullrequest FETCH_HEAD
> sudo python setup.py install --prefix=/opt/biopython-dev
>
> export PYTHONPATH=/opt/biopython-dev
>
> # Start the interpreter; the remaining lines are entered at the Python prompt
> python
> import sys
> sys.path.append('/opt/biopython-dev/lib/python2.7/site-packages/')
>
> # Simple example for Bcl-XL
> # Uniprot: http://www.uniprot.org/uniprot/Q07817
> # CATH: http://www.cathdb.info/version/v4_1_0/superfamily/1.10.437.10
> # PFAM: http://pfam.xfam.org/protein/Q07817
> # Inhibitors: https://en.wikipedia.org/wiki/Bcl-2#Targeted_therapies
> # Family: https://bcl2db.ibcp.fr/BCL2DB/
>
> from Bio.Seq import Seq
> my_seq = Seq("MSQSNRELVVDFLSYKLSQKGYSWSQFSDVEENRTEAPEGTESEMETPSAINGNPSWHLADSPAVNGATGHSSSLDAREVIPMAAVKQALREAGDEFELRYRRAFSDLTSQLHITPGTAYQSFEQVVNELFRDGVNWGRIVAFFSFGGALCVESVDKEMQVLVSRIAAWMATYLNDHLEPWIQENGGWDTFVELYGNNAAAESRKGQERFNRWFLTGMTVAGVVLLGSLFSRK")
>
> from Bio.cathdb import *
> q=search_by_sequence(my_seq)
> check_progress(q)
> {u'message': u'done', u'data': {u'status': u'done', u'date_started':
> u'2017-06-21T13:09:00', u'date_completed': u'2017-06-21T13:09:03',
> u'worker_hostname': u'mothra.biochem.ucl.ac.uk', u'id':
> u'50a6e3fc11a0f917023e43f8c86c2c75'}, u'success': 1}
> r=retrieve_results(q)
>
> print r['cath_version']
> 4.1.0
> print r['query_fasta']
> >QUERY
> MSQSNRELVVDFLSYKLSQKGYSWSQFSDVEENRTEAPEGTESEMETPSAINGNPSWHLADSPAVNGATGHSSSLDAREVIPMAAVKQALREAGDEFELRYRRAFSDLTSQLHITPGTAYQSFEQVVNELFRDGVNWGRIVAFFSFGGALCVESVDKEMQVLVSRIAAWMATYLNDHLEPWIQENGGWDTFVELYGNNAAAESRKGQERFNRWFLTGMTVAGVVLLGSLFSRK
>
> >>> print r['funfam_scan']
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> I can provide more input; would it be better to post it on the pull request?
>
>
> Best,
>
> Stéphane
>
> --
> Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein
> Design In Silico
> UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322 Nantes
> cedex 03, France
> Tél : +33 251 125 636 / Fax : +33 251 125 632
> http://www.ufip.univ-nantes.fr/ - http://www.steletch.org
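
In the quoted session, check_progress is called once and the job happens to be finished already. For longer searches the call would presumably need repeating, so here is a minimal sketch of a polling loop. It assumes only the response shape shown in the quoted output (a dict with a 'data' entry holding a 'status' of 'done'). The names wait_until_done and make_fake_check are hypothetical helpers for illustration, not part of the pull request, and the stand-in checker replaces the real network call so the sketch runs offline.

```python
import time


def wait_until_done(check, task, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call check(task) repeatedly until the reported status is 'done'.

    `check` is any callable returning a dict shaped like the quoted
    check_progress output: {'message': ..., 'data': {'status': ...}, ...}.
    Raises RuntimeError once `timeout` seconds of waiting have elapsed.
    """
    waited = 0.0
    while True:
        response = check(task)
        status = response.get("data", {}).get("status")
        if status == "done":
            return response
        if waited >= timeout:
            raise RuntimeError(
                "task %r not done after %.0f s (last status: %r)"
                % (task, waited, status)
            )
        sleep(interval)
        waited += interval


def make_fake_check(statuses):
    """Hypothetical stand-in for check_progress (no network access).

    Reports each status in `statuses` in turn, then repeats the last one.
    """
    remaining = list(statuses)

    def check(task):
        status = remaining.pop(0) if len(remaining) > 1 else remaining[0]
        return {
            "message": status,
            "data": {"status": status, "id": task},
            "success": 1,
        }

    return check


# Simulate a job that is queued, then running, then done.
fake = make_fake_check(["queued", "running", "done"])
result = wait_until_done(fake, "50a6e3fc11a0f917023e43f8c86c2c75", interval=0.0)
print(result["data"]["status"])
```

With the pull request installed, one would presumably pass the real check_progress and the value returned by search_by_sequence in place of the stand-ins, but I have not verified that against the branch.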