[Biopython] Errors with retrieval from Entrez
David Martin (Staff)
d.m.a.martin at dundee.ac.uk
Wed Nov 2 10:14:33 UTC 2016
Many thanks. I’ll try that for my production code. As this is for student novices just getting into Biopython, I’ll restructure the assignment to read a GenBank file they have downloaded instead of retrieving the entry from the database.
..d
From: Leighton Pritchard [mailto:Leighton.Pritchard at hutton.ac.uk]
Sent: 02 November 2016 09:29
To: David Martin (Staff) <d.m.a.martin at dundee.ac.uk>
Cc: biopython at lists.open-bio.org <biopython at mailman.open-bio.org>
Subject: Re: [Biopython] Errors with retrieval from Entrez
Hi David,
NCBI/Entrez queries are prone to this - I usually wrap the request in a structure that allows retries in case of errors, e.g. something like
```
import sys
import traceback


# Report last exception as string
def last_exception():
    """Return the last exception as a string, for use in logging."""
    exc_type, exc_value, exc_traceback = sys.exc_info()
    return ''.join(traceback.format_exception(exc_type, exc_value,
                                              exc_traceback))


# Retry Entrez requests (or any other function)
def entrez_retry(fn, logger, *fnargs, **fnkwargs):
    """Retry the passed function up to the number of times specified
    by args.retries (args is assumed to be a module-level argparse
    namespace defined elsewhere in the script).
    """
    tries, success = 0, False
    while not success and tries < args.retries:
        try:
            output = fn(*fnargs, **fnkwargs)
            success = True
        except Exception:  # unlike a bare except, lets Ctrl-C and sys.exit through
            tries += 1
            # tries already counts this failure, so report it without adding 1
            logger.warning("Entrez query %s(%s, %s) failed (%d/%d)",
                           fn, fnargs, fnkwargs, tries, args.retries)
            logger.warning(last_exception())
    if not success:
        logger.error("Too many Entrez failures (exiting)")
        sys.exit(1)
    return output
```
The except: clause would be better if it caught IncompleteRead…
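As a rough sketch of that narrower version, something like the following would retry only on `IncompleteRead` and let every other error propagate. The names `fetch_with_retry`, `fetch`, `retries` and `delay` are made up for illustration and are not part of Biopython. Note that in the traceback quoted below the exception is raised by `handle.read()`, not by `Entrez.efetch()`, so the function being retried probably needs to include the read as well as the request.

```python
import time
from http.client import IncompleteRead


def fetch_with_retry(fetch, retries=3, delay=0.0):
    """Call fetch() until it returns, retrying only on IncompleteRead.

    fetch, retries and delay are illustrative names, not Biopython API.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except IncompleteRead:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)  # brief pause before asking again


# Hypothetical usage with the call from the original question
# (assumes Entrez is imported and gi is defined as in that snippet):
#
#   def fetch():
#       handle = Entrez.efetch(db="nuccore", id=gi,
#                              retmode="text", rettype="gb")
#       return handle.read()
#
#   seq = fetch_with_retry(fetch, retries=5, delay=2.0)
```

Re-raising on the final attempt, rather than calling `sys.exit`, is just one possible choice; it leaves the decision about what to do next with the calling code.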
Cheers,
L.
On 2 Nov 2016, at 08:57, David Martin (Staff) <d.m.a.martin at dundee.ac.uk> wrote:
I’m trying to retrieve a sequence from NCBI and am getting incomplete reads. Any hints or tips?
..d
handle= Entrez.efetch(db="nuccore", id=gi, retmode='text', rettype='gb')
seq= handle.read()
Traceback (most recent call last):
  File "<ipython-input-11-f879be0f80fe>", line 2, in <module>
    seq= handle.read()
  File "C:\Anaconda3\lib\http\client.py", line 440, in read
    return self._readall_chunked()
  File "C:\Anaconda3\lib\http\client.py", line 550, in _readall_chunked
    raise IncompleteRead(b''.join(value))
IncompleteRead: IncompleteRead(6841762 bytes read)
The University of Dundee is a registered Scottish Charity, No: SC015096
--
Dr Leighton Pritchard
Information and Computing Sciences Group; Weeds, Pests and Diseases Theme
DG31, James Hutton Institute (Dundee)
Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
e: leighton.pritchard at hutton.ac.uk w: http://www.hutton.ac.uk/staff/leighton-pritchard
gpg/pgp: 0xFEFC205C tel: +44(0)844 928 5428 x8827 or +44(0)1382 568827