[Biopython] How to efetch Unigene records? Is it possible at all?

Carlos Javier Borroto carlos.borroto at gmail.com
Thu Jul 30 17:18:56 UTC 2009


Hi, I'm very new to Biopython and to Python in general. I have a little
knowledge of Perl and some previous experience with BioPerl.

My task is to take a list of human genes of interest and grab their
protein counterparts from the database for some additional work. At
first I thought that with the Bio.Entrez module and the Bio.SeqIO
parser I could get the protein counterparts, but I haven't found a way
to do it. Oddly, I haven't found a way to get the cross-reference
through the parser, even though I can see that the GenBank files
always have one.
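
For reference, here is a minimal sketch of where that cross-reference
usually sits. It assumes that the Bio.SeqIO GenBank parser stores each
feature's qualifiers as a dictionary of lists and that CDS features
carry "protein_id" and "db_xref" entries. The helper name
cds_protein_xrefs and the demo identifiers are hypothetical
placeholders, and the stand-in record only mimics the shape of a parsed
record so the sketch runs without Biopython or network access.

```python
from types import SimpleNamespace


def cds_protein_xrefs(record):
    """Collect (protein_id, db_xrefs) pairs from the CDS features of a record.

    `record` is assumed to look like a Bio.SeqRecord.SeqRecord: it has a
    `features` list whose items expose `type` and `qualifiers`.
    """
    results = []
    for feature in record.features:
        if feature.type != "CDS":
            continue  # only coding features name a protein product
        qualifiers = feature.qualifiers
        for protein_id in qualifiers.get("protein_id", []):
            results.append((protein_id, list(qualifiers.get("db_xref", []))))
    return results


# Stand-in for a parsed record (hypothetical identifiers, not real accessions).
# With Biopython, `record` would come from SeqIO.read(handle, "genbank").
demo = SimpleNamespace(features=[
    SimpleNamespace(type="source", qualifiers={"db_xref": ["taxon:9606"]}),
    SimpleNamespace(type="CDS", qualifiers={"protein_id": ["XX_000001.1"],
                                            "db_xref": ["GeneID:0"]}),
])
print(cds_protein_xrefs(demo))
```

Records without a CDS feature, or with a CDS that lacks a protein_id
qualifier, simply yield an empty list here.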

Anyway, because I also have the list of UniGene IDs, and it seems that
the UniGene parser has a way to get the cross-reference, I now want to
download all of the UniGene records and parse from there. But efetch
is not working with UniGene; that is, the following does not work:

>>> from Bio import Entrez
>>> from Bio import UniGene
>>> Entrez.email = "carlos.borroto at gmail.com"
>>> handle = Entrez.esearch(db="unigene", term="Hs.94542")
>>> record = Entrez.read(handle)
>>> record
{u'Count': '1', u'RetMax': '1', u'IdList': ['141673'],
u'TranslationStack': [{u'Count': '1', u'Field': 'All Fields', u'Term':
'Hs.94542[All Fields]', u'Explode': 'Y'}, 'GROUP'], u'TranslationSet':
[], u'RetStart': '0', u'QueryTranslation': 'Hs.94542[All Fields]'}
>>> handle = Entrez.efetch(db="unigene", id="Hs.94542")
>>> print handle.read()

This prints something like a web page, which I assume is the NCBI
server returning an error response.
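
One thing that may be worth checking: the esearch result in the session
above returns a numeric UID in IdList ('141673'), while the efetch call
passes the cluster name "Hs.94542". The Entrez utilities generally
expect UIDs, so a sketch of passing the UID along could look like the
following. The helper name uids_from_esearch is hypothetical, and
whether NCBI actually serves UniGene records through efetch is an
assumption that is not confirmed here.

```python
def uids_from_esearch(result):
    """Return the UIDs of an esearch result as a comma-separated string."""
    return ",".join(str(uid) for uid in result.get("IdList", []))


# Trimmed copy of the esearch result shown in the session above.
esearch_result = {"Count": "1", "RetMax": "1", "IdList": ["141673"]}
uid_param = uids_from_esearch(esearch_result)
print(uid_param)

# With Biopython and network access, the follow-up call would look like:
#   handle = Entrez.efetch(db="unigene", id=uid_param)
# Whether the unigene database is supported by efetch is not confirmed here.
```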

So is there something I could do to accomplish what I want, either
by parsing the GenBank files or by fetching the UniGene records and
then parsing them?

Any help or pointers to helpful documentation will be highly appreciated.
Thanks in advance
-- 
Carlos Javier


