[Biopython] Issues parsing genbank files

Wed Oct 4 06:34:31 UTC 2017

Thanks to Jocelyne for pointing me in the right direction. The answer is to use rettype="gbwithparts" instead of just rettype="gb". This then retrieves the whole record.

It would be worth noting this in the Biopython example.
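
For anyone finding this thread later, a minimal sketch of the fix in Biopython might look like the following. The function names efetch_kwargs and fetch_full_record are hypothetical, and the e-mail address in the usage comment is a placeholder; only Entrez.efetch, SeqIO.read and the rettype value come from Biopython and NCBI.

```python
def efetch_kwargs(accession):
    """Build the efetch arguments that request the complete GenBank record."""
    return {
        "db": "nuccore",
        "id": accession,
        "rettype": "gbwithparts",  # instead of plain "gb"
        "retmode": "text",
    }


def fetch_full_record(accession, email):
    """Fetch one nucleotide record and parse it into a SeqRecord.

    Hypothetical helper; needs Biopython and network access.
    """
    # Imported here so efetch_kwargs() can be used without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks every E-utilities user to identify themselves
    handle = Entrez.efetch(**efetch_kwargs(accession))
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()


# Usage (not run here because it contacts NCBI):
# record = fetch_full_record("NC_003197.2", "you@example.org")
# print(record.id, len(record.seq), len(record.features))
```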

..d

Dr David Martin
Senior Lecturer in Bioinformatics
College of Life Sciences
University of Dundee

________________________________
From: Steve Bond <biologyguy at gmail.com>
Sent: 04 October 2017 01:03
To: biopython at lists.open-bio.org; David Martin (Staff)
Subject: Re: [Biopython] Issues parsing genbank files

Hi David,
This is definitely an issue on NCBI's side. For some reason, trying to pull the entire record is causing an error, but you can get the entire record minus one residue:

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=NC_003197.2&seq_start=1&seq_stop=4857449&rettype=gb&retmode=text

It's not restricted to your record either; anything large seems to trigger the issue. Anyway, your workaround is to use the seq_stop parameter and ask for one fewer residue than the length of the record.
Maybe you want to let the folks at Entrez know?
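
As a rough sketch of that workaround, the URL above can be built from the accession and the record length using only the Python standard library. The function name and the idea of passing the length in by hand are assumptions for illustration; the length 4857450 is simply the seq_stop in the URL above plus one.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url_minus_one(accession, record_length):
    """Return an efetch URL asking for residues 1..record_length-1.

    Hypothetical helper: the caller must already know the record length.
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "seq_start": 1,
        "seq_stop": record_length - 1,  # one fewer residue than the full record
        "rettype": "gb",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


url = efetch_url_minus_one("NC_003197.2", 4857450)
print(url)
```

The result can be opened in a browser or passed to urllib.request.urlopen. Note that the final residue is missing from whatever comes back.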
-Steve

The University of Dundee is a registered Scottish Charity, No: SC015096