[BioPython] Cannot parse ApE plasmid editor GenBank file
Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Wed Jun 27 15:22:53 UTC 2007
Hi Peter,
Peter wrote:
> Martin MOKREJŠ wrote:
>> OK, I have found the spacing problem with my LOCUS lines still to persist,
>> and after some scripting I got the lines fixed.
>
> Excellent. I've been away for a few days and haven't had a chance to
> look at this yet.
Thanks! No problem, I was busy as well. ;-)
>
>> The file starts with:
>>
>> LOCUS pBL-RLuc-GBB+3-III 5391 bp ds-DNA circular SYN 14-JUN-2007
>> DEFINITION .
>> ACCESSION .
>> VERSION .
>> SOURCE .
>> ORGANISM .
>> COMMENT
>> COMMENT ApEinfo:methylated:0
>> FEATURES Location/Qualifiers
>>
>
> The ORGANISM line looks wrong (three leading spaces rather than two, so
> the dot is pushed one column to the right).
>
> There is a blank COMMENT line which is also odd.
>
> Some of this may just be an email formatting issue, but I would expect
> this instead:
>
> ...
> DEFINITION .
> ACCESSION .
> VERSION .
> SOURCE .
> ORGANISM .
> COMMENT ApEinfo:methylated:0
> FEATURES Location/Qualifiers
> ...
OK, I have removed the COMMENT lines altogether and have fixed the ORGANISM
line. Still, I get:
python generate_image_from_genbank.py
Traceback (most recent call last):
  File "generate_image_from_genbank.py", line 7, in ?
    genbank_entry = parser.parse(fhandle)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 187, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 361, in feed
    self._feed_header_lines(consumer, self.parse_header())
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 978, in _feed_header_lines
    consumer.taxonomy(data.strip())
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 250, in _split_taxonomy
    if taxonomy_string[-1] == '.':
IndexError: string index out of range
LOCUS pBL-RLuc-GBB+3-III 5391 bp ds-DNA circular SYN 14-JUN-2007
DEFINITION .
ACCESSION .
VERSION .
SOURCE .
ORGANISM .
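
Reading the traceback, my guess is that the taxonomy text handed to the
consumer is an empty string (there is no lineage line after ORGANISM in this
header), and indexing [-1] on an empty string raises exactly this IndexError.
Below is a minimal sketch of that failure and of a guarded check. The function
names split_taxonomy and last_char_is_dot are made up for illustration; this
is not Biopython's actual code, only the one comparison shown in the traceback.

```python
def last_char_is_dot(text):
    # Same comparison as in the traceback: fails on an empty string,
    # because there is no last character to index.
    return text[-1] == "."


def split_taxonomy(taxonomy_string):
    # Hypothetical stand-in for a taxonomy splitter, NOT the library's code.
    # str.endswith() is safe on an empty string, unlike taxonomy_string[-1].
    if taxonomy_string.endswith("."):
        taxonomy_string = taxonomy_string[:-1]
    # Split the lineage on ';' and drop empty pieces.
    parts = [part.strip() for part in taxonomy_string.split(";")]
    return [part for part in parts if part]
```

With this, split_taxonomy("") simply returns an empty list, while
last_char_is_dot("") raises IndexError as in the traceback above.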
Thanks for your help,
M.