[BioPython] Cannot parse ApE plasmid editor GenBank file
Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Tue Jun 5 15:24:05 UTC 2007
Peter wrote:
> Martin MOKREJŠ wrote:
>> Hi,
>> I am trying to parse a GenBank file created by ApE plasmid editor
>> (see Google for details) with biopython-1.43 and I get:
>
> ...
>
>> AssertionError: Did not recognise the LOCUS line layout:
>> LOCUS 6499 bp ds-DNA linear 02-AUG-2006
>>
>> Is the number of spaces wrong?
>
> Yes - fields don't line up with either of the GenBank variants Biopython
> expects. I suspect their files don't follow the current NCBI standard
> for the locus line...
>
> Could you make a set of different files (for different sequences) and
> check if the spacing changes or is preserved?
OK, I see two types of errors. The first is caused by files generated by VectorNTI,
the second by files produced by the ApE editor:
>>> fhandle = open('/mnt/smartmedia/utrophinA/p-cmvbGalCAT.gb','r')
>>> genbank_entry = parser.parse(fhandle)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 187, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 361, in feed
    self._feed_header_lines(consumer, self.parse_header())
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 967, in _feed_header_lines
    getattr(consumer, consumer_dict[line_type])(data)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 409, in source
    if content[-1] == '.':
IndexError: string index out of range
>>>
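The IndexError above means that content was an empty string when the check ran, so content[-1] had nothing to index. A minimal sketch of a guarded check follows. The helper name strip_trailing_period is hypothetical, and the traceback shows only the failing test, not what Biopython does afterwards, so stripping the period here is an assumption:

```python
def strip_trailing_period(content):
    """Hypothetical helper: drop one trailing '.' from a header value.

    Not Biopython code. It only illustrates a check that is safe
    when the value is empty.
    """
    # content[-1] raises IndexError when content == "", as in the
    # traceback; str.endswith() simply returns False for "".
    if content.endswith("."):
        return content[:-1]
    return content


empty_result = strip_trailing_period("")
dotted_result = strip_trailing_period("synthetic construct.")
plain_result = strip_trailing_period("synthetic construct")
```

With this kind of guard an empty value passes through unchanged instead of raising.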
>>> fhandle = open('/mnt/smartmedia/nrf/ok/PBCRLucPFLuc.gb','r')
>>> genbank_entry = parser.parse(fhandle)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 187, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 360, in feed
    self._feed_first_line(consumer, self.line)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 835, in _feed_first_line
    assert False, \
AssertionError: Did not recognise the LOCUS line layout:
LOCUS 6988 bp ds-DNA linear 20-DEC-2006
>>>
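To see which fields such a LOCUS line actually carries, independent of column positions, one can split it on whitespace. This is only a diagnostic sketch: describe_locus_line is a hypothetical helper, not part of Biopython, and it assumes the sequence length is the token directly before "bp":

```python
def describe_locus_line(line):
    """Hypothetical helper: label the whitespace-separated LOCUS tokens."""
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % line)
    if "bp" not in tokens:
        raise ValueError("no 'bp' unit found in: %r" % line)
    bp_index = tokens.index("bp")
    if bp_index < 2:
        raise ValueError("no sequence length before 'bp' in: %r" % line)
    return {
        # Everything between LOCUS and the length; empty if there is no name.
        "name": " ".join(tokens[1:bp_index - 1]),
        # Assumption: the token right before 'bp' is the length.
        "length": int(tokens[bp_index - 1]),
        # Remaining fields (molecule type, topology, date, ...), unvalidated.
        "rest": tokens[bp_index + 1:],
    }


info = describe_locus_line("LOCUS 6988 bp ds-DNA linear 20-DEC-2006")
```

For the line as it appears in the error message, this reports an empty name, i.e. nothing sits between LOCUS and the sequence length.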
I would appreciate it if you could tell me exactly what is wrong with the files
generated by the ApE editor (author Cc:ed).
Hope this helps,
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genbank-formatted-testcases.zip
Type: application/zip
Size: 32571 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20070605/d942bf78/attachment-0002.zip>