[Biopython-dev] Bio.GenBank.LocationParser chokes on misc_feature in Desulfurococcus kamchatkensis 1221n/NC_011766.gbk
Peter Cock
p.j.a.cock at googlemail.com
Mon Jul 11 09:38:03 UTC 2011
On Mon, Jul 11, 2011 at 9:34 AM, Tim te Beek <tim.te.beek at nbic.nl> wrote:
> When parsing ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Desulfurococcus_kamchatkensis_1221n_uid59133/NC_011766.gbk
> using SeqIO.read(genbank_file, 'genbank') I get the following
> stacktrace:
>
> ...
> gbk_records = (SeqIO.read(genbank_file, 'genbank') for
> genbank_file in genbank_files)
> ...
> Bio.GenBank.LocationParserError:
> order(1078481..1078483,join(1078778,1078800..1078810))
>
> The offending feature is:
> misc_feature complement(order(1078481..1078483,join(1078778,
> 1078800..1078810)))
> /locus_tag="DKAM_1147"
> /note="active site"
> /db_xref="CDD:73252"
>
> Could you look into whether this is a bug in the parser or in the input file?
>
That looks like the issue reported in Bug 3197, which turned out to be caused
by invalid GenBank files: https://redmine.open-bio.org/issues/3197
Quoting from: http://www.ncbi.nlm.nih.gov/collab/FT/
>>
>> 3.4.2.2 Operators
>>
>> ...
>>
>> Note : location operator "complement" can be used in combination with
>> either "join" or "order" within the same location; combinations of "join"
>> and "order" within the same location (nested operators) are illegal.
Please report this problem with NC_011766.gbk and NC_009142.gbk to
the NCBI (could you CC me too?). Try writing to gb-admin at ncbi.nlm.nih.gov.
The next release of Biopython will have a clearer error message in this
situation.
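Until then, one rough way to spot such locations before parsing is to
scan the location string for a "join" or "order" that sits inside
another "join" or "order". The following is only a minimal sketch under
that assumption. The helper name has_nested_join_order is made up for
illustration and is not part of Biopython, and it looks at the location
string alone rather than at the whole GenBank record.

```python
import re

# Hypothetical helper for illustration only; not a Biopython function.
# Matches the two operators that the feature table definition says
# must not be combined within one location.
_OPERATOR = re.compile(r"(join|order)\(")


def has_nested_join_order(location):
    """Return True if a join(...)/order(...) occurs inside another one."""
    # Feature locations may wrap over several lines, so drop whitespace.
    location = "".join(location.split())
    # One entry per open parenthesis: True if it was opened by join/order,
    # False for anything else (for example complement).
    stack = []
    pos = 0
    while pos < len(location):
        match = _OPERATOR.match(location, pos)
        if match:
            if any(stack):
                # Already inside a join/order, so this one is nested.
                return True
            stack.append(True)
            pos = match.end()
        elif location[pos] == "(":
            stack.append(False)
            pos += 1
        elif location[pos] == ")":
            if stack:
                stack.pop()
            pos += 1
        else:
            pos += 1
    return False


if __name__ == "__main__":
    bad = ("complement(order(1078481..1078483,join(1078778,\n"
           "1078800..1078810)))")
    print(has_nested_join_order(bad))
    print(has_nested_join_order("complement(join(1..5,10..20))"))
```

The parenthesis stack is there so that "complement" wrapped around a
single "join" or "order", which the quoted text allows, is not flagged.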
Thank you,
Peter