[BioPython] EUtils-1.0 client
Andrew Dalke
dalke@dalkescientific.com
Sat, 11 Jan 2003 19:25:59 -0700
After two (annoying) weeks of development, I've released an
interface to NCBI's EUtils server. The following announcement
comes from its home page:
http://www.dalkescientific.com/EUtils/
EUtils is a client-side library for the Entrez databases at NCBI.
NCBI provides the EUtils web service so that software can query Entrez
directly, rather than going through the web interface and dealing with
the hassles of web scraping.
This package provides two levels of interface. The lowest level is a
programmatic interface for constructing the query URL and making the
request. The higher levels support history tracking and parsing of query
results, which greatly simplifies working with the EUtils server.
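To make the lowest level concrete, here is a minimal sketch of what
"construct the query URL" amounts to. It uses only the Python standard
library and is not the EUtils package's own API: the helper name
build_eutils_url is hypothetical, and the base URL and the efetch
parameters (db, id, rettype, retmode) are assumed from NCBI's public
E-utilities service. It only builds the URL and makes no request.

```python
from urllib.parse import urlencode

# Assumed base URL of NCBI's E-utilities service (not taken from this package).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_eutils_url(program, **params):
    """Hypothetical helper: return the URL for one E-utilities program."""
    # Sort the parameters so the resulting URL is reproducible.
    query = urlencode(sorted(params.items()))
    return EUTILS_BASE + program + "?" + query


# Build (but do not send) a request for one protein record in FASTA format.
url = build_eutils_url("efetch.fcgi", db="protein", id="4579714",
                       rettype="fasta", retmode="text")
print(url)
```

The higher levels exist so that callers do not have to assemble and
track these parameters by hand.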
EUtils is distributed under the Biopython License.
To purchase commercial support or to hire us to develop customized tools
built using EUtils, contact info@dalkescientific.com.
Example: Get all protein sequences related to protein GI:4579714:
>>> import EUtils
>>> from EUtils import HistoryClient
>>> client = HistoryClient.HistoryClient()
>>> result = client.post(EUtils.DBIds("protein", "4579714"))
>>> related = result.neighbor_links("protein")
>>> related_dbids = related.linksetdbs["protein_protein"].dbids
>>> proteins = client.post(related_dbids)
>>> len(proteins)
223
>>> infile = proteins.efetch(retmode = "text", rettype = "fasta")
>>>
>>> fasta = infile.read()
>>> print fasta[:788]
>gi|27450749|gb|AAO14677.1|AF508258_1 rhodopsin [Pyrocystis lunula]
MAPIPDGFTYGQWSLVYNSLSFGIAGMGCATIFFWLQLPNVSKSYRTALTITGLVTAIATYHYVRIFNSW
VDAFKVVNVNGGDYTVTLLGAPFNDAYRYVDWLLTVPLLLIELILVMKLPKAETVKLSWNLGVASAVMVA
LGYPGEIQDDLLVRWFWWAMAMIPFYYVVVTLVNGLSDATAKQPDSVKSLVVTARYLTVISWLTYPGVYI
IKSMGLAGNIATTYEQVGYSVADVVAKAVFGVLIWAIAAGKSDEEEKNGLLG
>gi|6319528|ref|NP_009610.1| Homolog to HSP30 heat shock protein Yro1p; Yro2p [Saccharomyces cerevisiae]
MSDYVELLKRGGNEAIKINPPTGADFHITSRGSDWLFTVFCVNLLFGVILVPLMFRKPVKDRFVYYTAIA
PNLFMSIAYFTMASNLGWIPVRAKYNHVQTSTQKEHPGYRQIFYARYVGWFLAFPWPIIQMSLLGGTPLW
QIAFNVGMTEIFTVCWLIAACVHSTYKWGYYTIGIGAAIVVCISLMTTTFNLVKARGKDVSNVFITFMSV
IMFLWLIAYPTCFGITDGGNVLQPDSATIFYGIIDLLILSILPVLFMPLANYLGIERLGLIFDEEPAEHV
GPVAEKKMPSPASFKSSDSDSSIKEKLKLKKKHKKDKKKAKKAKKAKKAKKAQEEEEDVATDSE
>>>
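The text returned by efetch above is plain FASTA, so it can be split
into records with a few lines of code. The following is a rough
sketch, not part of the EUtils package: parse_fasta is a hypothetical
helper, and the sample string is made-up data standing in for the
fetched text.

```python
def parse_fasta(text):
    """Hypothetical helper: split FASTA text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header line closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample data; in the session above this would be the fetched text.
sample = ">seq1 example\nMAPIPDG\nFTYGQW\n>seq2 another\nMSDYVE\n"
records = parse_fasta(sample)
for header, sequence in records:
    print(header, len(sequence))
```

For real work a dedicated FASTA parser is probably the better choice;
this only shows the shape of the data.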
Andrew Dalke
dalke@dalkescientific.com
--
Need usable, robust software for bioinformatics or chemical
informatics? Want to integrate your different tools so you can
do more science in less time? Contact us!
http://www.dalkescientific.com/