[Biopython-dev] ExPASy / UniProt API for searching and fetching

Peter Cock p.j.a.cock at googlemail.com
Mon Nov 4 11:04:03 UTC 2013


On Mon, Nov 4, 2013 at 10:48 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> Hi Peter, everyone,
>
> On Mon, Nov 4, 2013 at 11:35 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Mon, Nov 4, 2013 at 12:12 AM, Wibowo Arindrarto
>> <w.arindrarto at gmail.com> wrote:
>>> Hello everyone,
>>>
>>> I'm taking a stab at this in this branch:
>>> https://github.com/bow/biopython/blob/dev_uniprot-api/Bio/UniProt/API.py.
>>>
>>> It's still quite bare bones at the moment, but the general approach
>>> shouldn't change much for the other functions. I'm still not sure
>>> about the namespace, though. Should it be Bio.UniProt.API,
>>> Bio.UniProt.api, or something else?
>>>
>>> As always, let me know what you think :).
>>>
>>> Cheers,
>>> Bow
>>
>> Rather than Bio.UniProt.API, we should try to use lowercase
>> PEP8 module names (I think here .api would be better).
>>
>> However, API seems too vague - how about www to make it
>> clear that this is an online resource?
>
> Bio.UniProt.www is fine with me, too (and somewhat in line with what
> we already have in Bio.Blast.NCBIWWW).
>
>> (Unlike the NCBI Entrez Utilities, there isn't any clear branding
>> for the UniProt web API, is there?)
>
> It's not as comprehensively branded as Entrez. They do have similar
> functionality, though. We can do query searches, retrieve records,
> do a BLAST search, and even map identifiers across databases. The
> BLAST search API doesn't seem to be very well documented on the
> website, but there is a Java API that allows you to do so:
> http://www.ebi.ac.uk/uniprot/remotingAPI/.

See:
http://www.ebi.ac.uk/Tools/webservices/services/sss/ncbi_blast_rest

> They even have their own
> formats of the BLAST results (e.g.
> http://www.uniprot.org/blast/uniprot/2013110441MHJHFYCR.* or
> http://www.uniprot.org/blast/uniprot/2013110441MHJHFYCR.xml)

Hmm, another XML variant. Maybe ignore that for now.
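
Leaving BLAST aside, the query search and record retrieval mentioned
above could start out as little more than URL builders. A rough sketch
only: the function names are hypothetical, and the base URL and the
"query", "format" and "limit" parameter names are assumptions that
should be checked against the UniProt documentation. Nothing here
touches the network.

```python
from urllib.parse import urlencode

# Assumed base URL for the UniProt web interface; verify before use.
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_search_url(query, fmt="tab", limit=None):
    """Return the URL for a query search (hypothetical helper)."""
    # Parameter names are assumptions, not confirmed in this thread.
    params = [("query", query), ("format", fmt)]
    if limit is not None:
        params.append(("limit", str(limit)))
    return BASE_URL + "?" + urlencode(params)


def build_fetch_url(accession, fmt="txt"):
    """Return the URL for a single record (hypothetical helper)."""
    return "%s%s.%s" % (BASE_URL, accession, fmt)
```

Keeping URL construction separate from the actual download would let
us unit test it offline, with the online tests limited to a few calls.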

> There are still some things left in Bio.ExPASy that UniProt doesn't
> seem to cover, though, such as the Prosite search.

Does anyone know if UniProt offers an equivalent?

> I was thinking maybe we could reorganize the existing Bio.UniProt code
> base like this:
>
> Bio.UniProt
>   |
>   |-- parsers
>   |      |
>   |      |-- GOA, etc.
>   |
>   |-- www (or api, or remote)
>          |
>          |-- __init__.py containing main functions
>

Bio.UniProt.GOA was only introduced in the last release, so moving
it under Bio.UniProt.parsers.GOA doesn't seem ideal. CC'ing Iddo
for comment.
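
If a move did go ahead, the old import path would have to keep working
for a while. A rough sketch of one way to do that, using a stand-in
module rather than the real Bio.UniProt.GOA: the class name and the
"parse" function below are hypothetical, and the plain stdlib
DeprecationWarning is used purely for illustration.

```python
import types
import warnings

# Stand-in for the relocated module (hypothetical, not the real GOA code).
new_module = types.ModuleType("parsers_goa")
new_module.parse = lambda handle: list(handle)


class DeprecatedAlias(types.ModuleType):
    """Forward attribute access to a relocated module, with a warning."""

    def __init__(self, name, target):
        super().__init__(name)
        # Stored directly so that looking it up never hits __getattr__.
        self.__dict__["_target"] = target

    def __getattr__(self, attr):
        # Only called for names not found on the alias itself.
        if attr.startswith("_"):
            raise AttributeError(attr)
        warnings.warn(
            "%s has moved to %s" % (self.__name__, self._target.__name__),
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self._target, attr)


# The old name still works, but each access emits a warning.
old_module = DeprecatedAlias("old_goa", new_module)
```

That is extra machinery for a module that is only one release old,
which is part of why the move doesn't look attractive.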

Peter


