[Biopython] Errors with retrieval from Entrez

Leighton Pritchard Leighton.Pritchard at hutton.ac.uk
Wed Nov 2 09:28:39 UTC 2016


Hi David,

NCBI/Entrez queries are prone to this. I usually wrap the request in a structure that retries when an error occurs, something like:

```
import sys
import traceback


# Report last exception as string
def last_exception():
    """Returns last exception as a string, for use in logging."""
    exc_type, exc_value, exc_traceback = sys.exc_info()
    return ''.join(traceback.format_exception(exc_type, exc_value,
                                              exc_traceback))


# Retry Entrez requests (or any other function)
# NOTE: `args` is assumed to be a module-level argparse namespace with an
# integer `retries` attribute, defined elsewhere in the calling script.
def entrez_retry(fn, logger, *fnargs, **fnkwargs):
    """Retries the passed function up to the number of times specified
    by args.retries
    """
    tries, success = 0, False
    while not success and tries < args.retries:
        try:
            output = fn(*fnargs, **fnkwargs)
            success = True
        except Exception:  # unlike a bare except, lets Ctrl-C through
            tries += 1
            # tries already counts this failure, so report it as-is
            logger.warning("Entrez query %s(%s, %s) failed (%d/%d)",
                           fn, fnargs, fnkwargs, tries, args.retries)
            logger.warning(last_exception())
    if not success:
        logger.error("Too many Entrez failures (exiting)")
        sys.exit(1)
    return output
```

The except: clause would be better if it caught IncompleteRead…
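
A minimal sketch of that narrower except clause follows; it is an illustration under stated assumptions, not something tested against NCBI. The names retry_on_incomplete_read and fetch_genbank_text and the retries and delay parameters are hypothetical. Note that in the traceback below the IncompleteRead is raised by handle.read(), not by Entrez.efetch(), so the read has to happen inside the function being retried.

```python
import time
from http.client import IncompleteRead


def retry_on_incomplete_read(fn, *fnargs, retries=3, delay=1.0, **fnkwargs):
    """Call fn(*fnargs, **fnkwargs), retrying only on IncompleteRead.

    `retries` and `delay` are hypothetical parameters for this sketch;
    any other exception propagates immediately.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return fn(*fnargs, **fnkwargs)
        except IncompleteRead:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * attempt)  # simple linear back-off


def fetch_genbank_text(gi):
    """Fetch one GenBank record as text; the read is part of the retried unit."""
    # Imported here so the retry helper above has no Biopython dependency.
    # Entrez.email should be set before calling this.
    from Bio import Entrez
    handle = Entrez.efetch(db="nuccore", id=gi, rettype="gb", retmode="text")
    try:
        return handle.read()
    finally:
        handle.close()
```

With these in place, something like seq = retry_on_incomplete_read(fetch_genbank_text, gi) would retry the whole fetch-and-read step, and re-raise the IncompleteRead if the last attempt also fails.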

Cheers,

L.

On 2 Nov 2016, at 08:57, David Martin (Staff) <d.m.a.martin at dundee.ac.uk> wrote:

I’m trying to retrieve a sequence from NCBI and am getting incomplete reads. Any hints or tips?

..d


handle= Entrez.efetch(db="nuccore", id=gi, retmode='text', rettype='gb')
seq= handle.read()

Traceback (most recent call last):

  File "<ipython-input-11-f879be0f80fe>", line 2, in <module>
    seq= handle.read()

  File "C:\Anaconda3\lib\http\client.py", line 440, in read
    return self._readall_chunked()

  File "C:\Anaconda3\lib\http\client.py", line 550, in _readall_chunked
    raise IncompleteRead(b''.join(value))

IncompleteRead: IncompleteRead(6841762 bytes read)


--
Dr Leighton Pritchard
Information and Computing Sciences Group; Weeds, Pests and Diseases Theme
DG31, James Hutton Institute (Dundee)
Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
e: leighton.pritchard at hutton.ac.uk       w: http://www.hutton.ac.uk/staff/leighton-pritchard
gpg/pgp: 0xFEFC205C tel: +44(0)844 928 5428 x8827 or +44(0)1382 568827

