[Biopython] error in records

Peter biopython at maubp.freeserve.co.uk
Wed Oct 20 14:44:47 UTC 2010


On Wed, Oct 20, 2010 at 3:08 PM, Liam Thompson <dejmail at gmail.com> wrote:
> Hi Peter
>
> Thanks for looking at it. I upgraded to biopy 1.55, from 1.54 and it made no
> difference. There is still something funky going on. I have attached the
> records in the zip textfile, they are the last 2 listed in the file.
>
> Thanks
> Liam

Hi Liam,

I got the zipped GenBank file, thanks. The two problem records
have been changed - at least, they don't match what I download
from the NCBI today.

Running the example here, the error message from X85254 is:

Bio.GenBank.LocationParserError: /join(1816..1899,1903..2454)

Hopefully you will agree that this is a much more helpful error
message than you had before. Looking at the file,

...
     gene            /join(1816..1899,1903..2454)
                     /gene="precore-core"
     CDS             /join(1816..1899,1903..2454)
                     /gene="precore-core"
                     /codon_start=1
...

The join location should not have a leading slash. It appears
twice here, once on the gene feature and once on the CDS feature.
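
If you want to check the rest of the file for the same problem,
here is a rough sketch using only the Python standard library.
The function name and the regular expression are illustrative
assumptions, not part of Biopython. It assumes the usual GenBank
feature table layout, with the feature key starting in column 6
and qualifier lines indented by 21 spaces.

```python
import re

# A feature-key line has five leading spaces, then the key, then the location.
# Qualifier lines are indented 21 spaces, so they never match this pattern.
BAD_LOCATION = re.compile(r"^ {5}(\S+) +(/\S+)")


def find_slashed_locations(lines):
    """Return (line_number, feature_key, location) for suspect feature lines."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = BAD_LOCATION.match(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits


# The lines quoted above, used as a small sample.
sample = [
    "     gene            /join(1816..1899,1903..2454)",
    '                     /gene="precore-core"',
    "     CDS             /join(1816..1899,1903..2454)",
    '                     /gene="precore-core"',
    "                     /codon_start=1",
]

for number, key, location in find_slashed_locations(sample):
    print(number, key, location)
```

On the sample this flags the gene and CDS lines and leaves the
qualifier lines alone. For a real file you would pass the open
file handle instead of the sample list.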

After fixing that by hand, the next error comes from X65259.1:

ValueError: Sequence line mal-formed, '       1 AACTCCACAA CCTTCCACCA
AACTCTGCAA GATCCCAGAG TGAGAGGCCT GTATTTCCCT'

You need another space at the start of that line.
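
If several sequence lines have this problem, something like the
following sketch could re-pad them. It assumes the usual GenBank
layout where the position number is right-justified in the first
nine columns and followed by a single space. The helper name is
hypothetical, and it should only be applied to lines between
ORIGIN and the closing //.

```python
def fix_sequence_line(line):
    """Re-pad the leading position number of a sequence line to nine columns."""
    stripped = line.strip()
    number, _, rest = stripped.partition(" ")
    if not number.isdigit() or not rest:
        return line  # does not look like a sequence line; leave it untouched
    # Assumed layout: number right-justified in columns 1-9, then one space.
    return f"{number:>9} {rest.lstrip()}"


bad = "       1 AACTCCACAA CCTTCCACCA AACTCTGCAA GATCCCAGAG TGAGAGGCCT GTATTTCCCT"
fixed = fix_sequence_line(bad)
print(repr(fixed[:20]))
```

The fixed line is one character longer than the original, with
the sequence itself unchanged.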

With those three fixes (removing two slashes, adding one space),
the file seems to parse fine.

Peter


