[BioPython] EMBL parsing in Biopython 1.43

Peter biopython at maubp.freeserve.co.uk
Sun Apr 29 20:02:05 UTC 2007


Michiel de Hoon wrote:
> Thanks Peter!
> 
> I tried this EMBL-formatted file (using the latest version of Biopython 
> in CVS):
> 
> ftp://ftp.pasteur.fr/pub/GenomeDB/SubtiList/FlatFiles/SLR16.1_embl.txt
> 
> but I got this error message:
> 
>  >>> from Bio import SeqIO
>  >>> input = open("SLR16.1_embl.txt")
>  >>> records = SeqIO.parse(input, format="embl")
>  >>> records.next()
> Traceback (most recent call last):
...
> "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Bio/GenBank/Scanner.py", 
> line 540, in _feed_first_line
>      assert len(fields) == 7
> AssertionError
>  >>>

I get the same error here with CVS Biopython on Linux with Python 2.4.

> Do you have an idea as to what may be going wrong here?

Yes - I wrote the EMBL parser using the latest file format, while I 
suspect your file from the Pasteur Institute uses an older format - 
specifically one where the first line (the ID line) has a different 
number of fields.
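
To illustrate the idea, here is a minimal sketch of an ID line splitter that tolerates both layouts. It is not the code in Bio/GenBank/Scanner.py and not the eventual fix. It assumes the newer ID line has seven semicolon-separated fields (which is what the failing assertion checks) and that the older one has four, with the entry name and data class sharing the first field. The function name split_embl_id_line and the two sample lines are made up for illustration.

```python
def split_embl_id_line(line):
    """Split an EMBL ID line into a few fields (illustrative sketch only)."""
    if not line.startswith("ID   "):
        raise ValueError("Not an EMBL ID line: %r" % line)
    # Drop the 'ID' tag and the trailing full stop, then split on semicolons.
    body = line[5:].strip().rstrip(".")
    fields = [f.strip() for f in body.split(";")]
    if len(fields) == 7:
        # Assumed newer layout:
        # accession; SV n; topology; molecule; class; division; length
        return {"layout": "new", "name": fields[0], "length": fields[6]}
    if len(fields) == 4:
        # Assumed older layout:
        # "name  class; molecule; division; length"
        name = fields[0].split()[0]
        return {"layout": "old", "name": name, "length": fields[3]}
    raise ValueError("Unexpected number of ID line fields: %d" % len(fields))


# Sample lines, for illustration only
new_style = "ID   X56734; SV 1; linear; mRNA; STD; PLN; 1859 BP."
old_style = "ID   BSUB0001   standard; DNA; PRO; 213080 BP."
print(split_embl_id_line(new_style))
print(split_embl_id_line(old_style))
```

Branching on the field count, rather than asserting a single value, lets any other count still fail loudly with a message that says how many fields were found.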

This is reminiscent of the various revisions to the GenBank LOCUS line 
which we also have to cope with.

I hope to have a fix in CVS today/tomorrow.

Peter



