[Biopython-dev] Error in SeqFeature.CompoundLocation parsing NCBI efetch format
Peter Cock
p.j.a.cock at googlemail.com
Thu Dec 5 16:46:46 UTC 2013
On Thu, Dec 5, 2013 at 4:43 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Thu, Dec 5, 2013 at 4:29 PM, Brynjar Smári Bjarnason
> <binni at binnisb.com> wrote:
>>
>> Hello.
>>
>> I see CompoundLocation is quite new. I am currently using anaconda
>> (Python 2.7.6 :: Anaconda 1.8.0 (64-bit)) and BioPython 1.62.
>>
>> I am fetching gi values and using SeqIO to parse them. So far most of
>> them work but I found one that fails.
>>
>> Code:
>>
>> from Bio import Entrez, SeqIO
>> p = Entrez.efetch(db="protein", rettype="gp", retmode="text", id="494379")
>> seq = SeqIO.read(p, "gb")
>>
>> Gives error:
>> ValueError: CompoundLocation should have at least 2 parts
>>
>> With quite long stack trace and the last one being:
>>
>> /Bio/SeqFeature.pyc:
>> 996 if len(self.parts) < 2:
>> --> 997 raise ValueError("CompoundLocation should have at least 2 parts")
>>
>> Any suggestions on how to fix this, and maybe what is different with
>> this gi from the rest of them (one gi that works: 10342)?
>>
>> Brynjar
>
> Hi Brynjar,
>
> Hmm. Right now the website is very slow & won't load
> http://www.ncbi.nlm.nih.gov/protein/494379
> and via Entrez I am getting a network error:
> urllib2.HTTPError: HTTP Error 502: Bad Gateway
>
> Were you able to save the file, and could you post it online
> (e.g. at http://gist.github.com)?
>
> Regards,
>
> Peter
Not to worry - the site did respond when I retried a bit later, and
I can reproduce the parser error:
>>> from Bio import SeqIO
>>> r = SeqIO.read("1MRR_A.gp", "genbank")
/Library/Python/2.7/site-packages/Bio/GenBank/__init__.py:1096: BiopythonParserWarning: Couldn't parse feature location: 'join(bond(84),bond(115),bond(118),bond(238))'
  % (location_line)))
/Library/Python/2.7/site-packages/Bio/GenBank/__init__.py:1096: BiopythonParserWarning: Couldn't parse feature location: 'join(bond(115),bond(204),bond(238),bond(241))'
  % (location_line)))
/Library/Python/2.7/site-packages/Bio/GenBank/__init__.py:1096: BiopythonParserWarning: Couldn't parse feature location: 'join(bond(194),bond(272))'
  % (location_line)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/Bio/SeqIO/__init__.py", line 646, in read
    first = next(iterator)
  File "/Library/Python/2.7/site-packages/Bio/SeqIO/__init__.py", line 582, in parse
    for r in i:
  File "/Library/Python/2.7/site-packages/Bio/GenBank/Scanner.py", line 467, in parse_records
    record = self.parse(handle, do_features)
  File "/Library/Python/2.7/site-packages/Bio/GenBank/Scanner.py", line 451, in parse
    if self.feed(handle, consumer, do_features):
  File "/Library/Python/2.7/site-packages/Bio/GenBank/Scanner.py", line 423, in feed
    self._feed_feature_table(consumer, self.parse_features(skip=False))
  File "/Library/Python/2.7/site-packages/Bio/GenBank/Scanner.py", line 374, in _feed_feature_table
    consumer.location(location_string)
  File "/Library/Python/2.7/site-packages/Bio/GenBank/__init__.py", line 1083, in location
    operator=location_line[:i])
  File "/Library/Python/2.7/site-packages/Bio/SeqFeature.py", line 1003, in __init__
    raise ValueError("CompoundLocation should have at least 2 parts")
ValueError: CompoundLocation should have at least 2 parts
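
For reference, the location strings in the warnings above all have the same shape: a join(...) wrapping a comma-separated list of bond(N) entries. Below is a minimal sketch of pulling the integer positions out of such a string. The helper name bond_positions is hypothetical and is not part of Biopython, and this is not how the GenBank parser handles locations; it only illustrates the structure of the strings being rejected.

```python
import re

# Hypothetical helper for illustration only; NOT Biopython's parser.
# Matches a single part of the form bond(N), where N is an integer.
_BOND_RE = re.compile(r"bond\((\d+)\)$")


def bond_positions(location):
    """Return the integer positions in a 'join(bond(N),bond(M),...)' string."""
    inner = location.strip()
    # Strip the outer join(...) wrapper if present.
    if inner.startswith("join(") and inner.endswith(")"):
        inner = inner[len("join("):-1]
    positions = []
    for part in inner.split(","):
        match = _BOND_RE.match(part.strip())
        if match is None:
            raise ValueError("Unexpected location part: %r" % part)
        positions.append(int(match.group(1)))
    return positions


print(bond_positions("join(bond(84),bond(115),bond(118),bond(238))"))
# prints [84, 115, 118, 238]
```
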
Peter