[Biopython] Issues parsing genbank files

David Martin (Staff) d.m.a.martin at dundee.ac.uk
Wed Oct 4 06:34:31 UTC 2017


Thanks to Jocelyne for pointing me in the right direction. The answer is to use rettype="gbwithparts" instead of just rettype="gb". This then retrieves the whole record.


It would be worth noting this in the Biopython example.
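
For anyone finding this thread later, here is a minimal sketch of the fix, assuming Biopython is installed. The accession NC_003197.2 comes from the quoted message below; the function names and the e-mail address in the usage note are placeholders, not part of the Biopython tutorial.

```python
def efetch_params(accession, rettype="gbwithparts"):
    """Build the keyword arguments for Entrez.efetch (no network access)."""
    return {
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,  # "gbwithparts" instead of plain "gb"
        "retmode": "text",
    }


def fetch_full_record(accession, email):
    """Download one GenBank record and parse it into a SeqRecord."""
    # Imported here so efetch_params stays usable without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks every client to identify itself
    handle = Entrez.efetch(**efetch_params(accession))
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()
```

Calling fetch_full_record("NC_003197.2", "you@example.org") needs network access; the length of record.seq and the number of entries in record.features are then a quick check that the whole record arrived.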


..d


Dr David Martin
Senior Lecturer in Bioinformatics
College of Life Sciences
University of Dundee



________________________________
From: Steve Bond <biologyguy at gmail.com>
Sent: 04 October 2017 01:03
To: biopython at lists.open-bio.org; David Martin (Staff)
Subject: Re: [Biopython] Issues parsing genbank files

Hi David,
This is definitely an issue on NCBI's side. For some reason, trying to pull the entire record is causing an error, but you can get the entire record minus one residue:

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=NC_003197.2&seq_start=1&seq_stop=4857449&rettype=gb&retmode=text

It's not restricted to your record either; it seems like anything large is causing the issue. Anyway, your workaround is to use the seq_stop keyword and ask for one fewer residue than the length of the record.
Maybe you want to let the folks at Entrez know?
-Steve
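
The seq_stop workaround described above can be sketched with the standard library alone. This is only an illustration: the helper name is made up, and the record length of 4857450 is inferred from the URL in the message (seq_stop=4857449 being "one fewer"), not taken from NCBI documentation.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url_minus_one(accession, record_length):
    """Return an efetch URL requesting residues 1 .. record_length - 1."""
    params = {
        "db": "nuccore",
        "id": accession,
        "seq_start": 1,
        "seq_stop": record_length - 1,  # one fewer than the full length
        "rettype": "gb",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


# Assumed length, inferred from the seq_stop value in the URL above.
url = efetch_url_minus_one("NC_003197.2", 4857450)
print(url)
```

With Biopython, the same seq_start and seq_stop values can be passed as keyword arguments to Entrez.efetch, which forwards them to the E-utilities request.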

