[BioPython] EMBL parsing in Biopython 1.43
Peter
biopython at maubp.freeserve.co.uk
Sun Apr 29 20:02:05 UTC 2007
Michiel de Hoon wrote:
> Thanks Peter!
>
> I tried this EMBL-formatted file (using the latest version of Biopython
> in CVS):
>
> ftp://ftp.pasteur.fr/pub/GenomeDB/SubtiList/FlatFiles/SLR16.1_embl.txt
>
> but I got this error message:
>
> >>> from Bio import SeqIO
> >>> input = open("SLR16.1_embl.txt")
> >>> records = SeqIO.parse(input, format="embl")
> >>> records.next()
> Traceback (most recent call last):
...
> "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Bio/GenBank/Scanner.py",
> line 540, in _feed_first_line
> assert len(fields) == 7
> AssertionError
> >>>
It does the same here with CVS Biopython on Linux with Python 2.4.
> Do you have an idea as to what may be going wrong here?
Yes - I wrote an EMBL parser using the latest file format, while I
suspect your file from the Pasteur Institute uses an older format -
specifically one where the first line (the ID line) has a different
number of fields.
This is reminiscent of the various revisions to the GenBank LOCUS line
which we also have to cope with.
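To illustrate the kind of check involved, here is a rough sketch. It is not the actual Scanner.py code. The function names are hypothetical, the sample ID lines are illustrative rather than taken from the SLR16.1 file, and the assumption that the older layout splits into four semicolon-separated fields is mine. The seven-field case matches the assertion in the traceback above.

```python
def split_embl_id_line(line):
    """Split an EMBL ID line into its semicolon-separated fields.

    Hypothetical helper for illustration only.
    """
    if not line.startswith("ID"):
        raise ValueError("Not an ID line: %r" % line)
    # Drop the two-letter "ID" line code, then split on semicolons.
    body = line[2:].strip()
    return [field.strip() for field in body.split(";")]


def classify_embl_id_line(line):
    """Guess which ID line layout is in use from the field count.

    Assumption: the newer layout has seven fields, the older one four.
    """
    fields = split_embl_id_line(line)
    if len(fields) == 7:
        return "new"
    if len(fields) == 4:
        return "old"
    raise ValueError("Unexpected number of ID line fields: %i" % len(fields))


# Illustrative sample lines, not taken from the file discussed above.
new_style = "ID   X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP."
old_style = "ID   AB000263 standard; RNA; PRI; 368 BP."
```

A parser could branch on the result instead of asserting a fixed field count, which is roughly how the different LOCUS line revisions have to be handled too.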
I hope to have a fix in CVS today/tomorrow.
Peter