[Biopython] error in records

Wed Oct 20 13:49:20 UTC 2010

On Wed, Oct 20, 2010 at 2:19 PM, Liam Thompson <dejmail at gmail.com> wrote:
> hi everyone
>
> I am having problems seeing what is wrong with two genbank records of
> Hepatitis B Virus. When I cycle through a genbank file with multiple
> records, and these two are in it, it comes back with.
>
> Traceback (most recent call last):
>  File "/media/0844588592/phd/lab_book/bioinformatics/typeseq_cds_split.py",
> line 13, in <module>
>    for records in SeqIO.parse("cts.gb", "gb"):
>  ...
>  File "/usr/lib/pymodules/python2.6/Bio/Parsers/spark.py", line 203, in
> parse
>    self.error(tokens[i-1])
> IndexError: list index out of range
>
> I'm not sure what to make of this, especially as I've looked at the records
> for quite a while now and can't seem to figure out what peculiarity of the
> formatting upsets the parser.

Something about a feature location is causing the problem.

From the traceback I infer that you are using an old version of Biopython,
since the feature location parsing was rewritten in Biopython 1.55 (it
doesn't use Spark by default).
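
A quick way to confirm which version is installed is to ask the package
itself. This is only a small sketch; the variable name biopython_version
is mine, and the try/except is there so the snippet still runs on a
machine where Biopython cannot be imported.

```python
# Report the installed Biopython version, if any.
try:
    import Bio
    biopython_version = Bio.__version__
except ImportError:
    # Biopython is not importable from this Python installation.
    biopython_version = None

print("Biopython version:", biopython_version)
```

Anything older than 1.55 would be consistent with the Spark-based
traceback above.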

> accessions X65259 and X85254. I would appreciate any tips or explanation
> of the above.

Looking at those two nucleotide GenBank records on the NCBI Entrez
website, I see nothing wrong or suspicious, and the current version of
Biopython reads them fine.

Could you try updating to Biopython 1.55? Alternatively, if it isn't too
big, you can email me your cts.gb example file *off the mailing list* and
I'll try it here (in case there is something else wrong).
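
If you want to narrow things down yourself first, something along these
lines might help. It is a rough sketch, not a tested recipe: it assumes
Python 3 and a reasonably recent Biopython, and the helper names
split_genbank_records and find_bad_records are made up for illustration.
The idea is to split the file on the "//" lines that terminate each
GenBank record and then parse every record on its own, so that one bad
record does not stop the whole loop.

```python
from io import StringIO


def split_genbank_records(text):
    """Split multi-record GenBank text into one string per record.

    Each GenBank record ends with a line containing only '//'.
    """
    records, current = [], []
    for line in text.splitlines(True):
        current.append(line)
        if line.strip() == "//":
            records.append("".join(current))
            current = []
    # Keep trailing text that lacks a terminator, but skip blank padding.
    if "".join(current).strip():
        records.append("".join(current))
    return records


def find_bad_records(filename):
    """Return (index, first line, exception) for each record that fails."""
    # Imported here so the splitter above works without Biopython.
    from Bio import SeqIO

    with open(filename) as handle:
        chunks = split_genbank_records(handle.read())

    bad = []
    for index, chunk in enumerate(chunks):
        try:
            # SeqIO.read expects exactly one record in the handle.
            SeqIO.read(StringIO(chunk), "gb")
        except Exception as err:
            first_line = chunk.lstrip().splitlines()[0]
            bad.append((index, first_line, err))
    return bad
```

Calling find_bad_records("cts.gb") should then give you the LOCUS line
of each record the parser rejects, together with the exception raised,
which would make it easier to see whether the two accessions you mention
really are the ones at fault.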

Regards,

Peter