[Biopython] Problem parsing embl files
Jaime Tovar
jmtc21 at bath.ac.uk
Thu May 30 19:48:59 UTC 2013
Hi all,
This is the first time I have tried to parse EMBL files with Biopython. I'm trying
to get the gene IDs and the start/end coordinates of each gene.
I thought it would be as straightforward as with other annotation files,
so I wrote a small script to test it.
from Bio import SeqIO

if __name__ == '__main__':
    handle = open("sctg_0.embl", "r")
    records = SeqIO.parse(handle, "embl")
    for record in records:
        print(record)
But when I run the script I get an error, which suggests there may be a
problem with the EMBL files:
ValueError: Premature end of features table, marker '//' found
I checked the parser's source code and it seems to consider the EMBL file
malformed, but when I compared the files against the EMBL format they look
fine. I have a few thousand files formatted in the same way, so I can't
see any option other than getting them to parse.
The annotation files contain only annotation info, no sequences. I have
uploaded an example here:
http://depositfiles.com/files/481uob95e
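To make clear what I am after, here is a rough sketch of a stopgap that reads the feature table lines directly instead of going through SeqIO. It assumes the standard EMBL feature table layout (feature key in columns 6-20, location starting at column 22) and that each gene carries a /gene or /locus_tag qualifier. The name iter_genes is a hypothetical helper of my own, not a Biopython function, and it only copes with single-line locations without remote references.

```python
import re


def iter_genes(lines):
    """Yield (gene_id, start, end) for each 'gene' feature in EMBL FT lines.

    Hypothetical helper, not part of Biopython. Assumes the feature key sits
    in columns 6-20 and the location starts at column 22. Locations that wrap
    onto a second line or reference another entry are not handled.
    """
    current = None  # [gene_id, start, end] of the gene feature being read
    for line in lines:
        if not line.startswith("FT"):
            continue
        key = line[5:20].strip()
        value = line[21:].strip()
        if key:
            # A non-blank key means a new feature starts on this line.
            if current is not None:
                yield tuple(current)
            current = None
            if key == "gene":
                numbers = [int(n) for n in re.findall(r"\d+", value)]
                if numbers:
                    current = [None, min(numbers), max(numbers)]
        elif current is not None and current[0] is None:
            # Qualifier line of the current gene: take the first identifier.
            match = re.match(r'/(?:gene|locus_tag)="?([^"]+)"?', value)
            if match:
                current[0] = match.group(1)
    if current is not None:
        yield tuple(current)


def ft(key, text):
    """Build one FT line with the assumed column layout (for the demo only)."""
    return "FT   %-16s%s" % (key, text)


# Made-up sample data, only to exercise the function.
sample = [
    "FH   Key             Location/Qualifiers",
    ft("gene", "1..1200"),
    ft("", '/gene="abc"'),
    ft("CDS", "1..1200"),
    ft("", '/gene="abc"'),
    ft("gene", "complement(1500..2100)"),
    ft("", '/locus_tag="xyz"'),
    "//",
]
genes = list(iter_genes(sample))
```

On a real file the same function could be fed an open file handle in place of the sample list, since it only iterates over lines. Strand information is discarded here; only the lowest and highest coordinates of each location are kept.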
I'm using Python 2.7.4 and Biopython 1.61 on a 64-bit Windows computer.
Any advice or suggestions would be greatly appreciated.
Jaime.