[BioPython] Cannot parse/convert embl formatted files
Martin MOKREJŠ
mmokrejs at ribosome.natur.cuni.cz
Wed Aug 9 11:22:51 UTC 2006
Hi,
I am following the manual at
http://biopython.org/DIST/docs/cookbook/genbank_to_fasta.html
to convert an EMBL-formatted file to FASTA, and I see that at
the beginning of the document, after the line:
from Bio import formats
there should be one more line:
from Bio.FormatIO import FormatIO
Still, conversion from embl format does not work:
#!/usr/bin/python
input_handle = open('wgs_baad_pro.dat') # from ftp://ftp.embl.de/pub/databases/embl/release/
output_handle = open('wgs_baad_pro.fa', "w")
from Bio import formats
from Bio.FormatIO import FormatIO
formatter = FormatIO("SeqRecord", formats["embl"], formats["fasta"])
formatter.convert(input_handle, output_handle)
Traceback (most recent call last):
  File "convertembl.py", line 8, in ?
    formatter.convert(input_handle, output_handle)
  File "/usr/lib/python2.4/site-packages/Bio/FormatIO.py", line 146, in convert
    raise TypeError("Could not not determine file type")
TypeError: Could not not determine file type
It seems this problem has been known since
http://lists.open-bio.org/pipermail/biopython-dev/2006-April/002343.html
I use biopython-1.42 on Linux, so was no fix included in the release?
In principle, I do not need to convert the file; what I really need is
a parser for EMBL-formatted data from
ftp://bighost.ba.itb.cnr.it/pub/Embnet/Database/UTR/data/
to pick out records that carry a given feature. As I do not see an EMBL
parser in the Bio package, I believe it is not available, right?
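To make the request concrete, here is a minimal sketch of the kind of
parser I mean, in plain Python without Biopython. It assumes the usual
EMBL flat-file layout: a two-letter line code, the feature key in columns
6-20 of FT lines, sequence lines after SQ, and '//' closing each entry.
The names iter_embl_records and records_with_feature are made up for
illustration and do not exist in any library, and the sketch ignores
feature locations and qualifiers.

```python
def iter_embl_records(handle):
    """Yield one dict per EMBL entry with its id, feature keys and sequence.

    Illustrative sketch only: assumes the standard EMBL flat-file columns.
    """
    record = None
    in_sequence = False
    for line in handle:
        line = line.rstrip()
        if line.startswith("//"):
            # '//' terminates the current entry
            if record is not None:
                yield record
            record, in_sequence = None, False
            continue
        code, body = line[:2], line[5:]
        if code == "ID":
            # The first token is the entry name; some releases append ';'
            fields = body.split()
            name = fields[0].rstrip(";") if fields else ""
            record = {"id": name, "features": [], "seq": ""}
            in_sequence = False
        elif record is None:
            # Skip anything that appears before the first ID line
            continue
        elif code == "FT":
            # A new feature has its key in columns 6-20; location and
            # qualifier continuation lines leave that field blank
            key = line[5:20].strip()
            if key:
                record["features"].append(key)
        elif code == "SQ":
            in_sequence = True
        elif in_sequence and code == "  ":
            # Sequence lines hold blocks of residues plus a running count
            record["seq"] += "".join(c for c in body if c.isalpha())


def records_with_feature(handle, feature_key):
    """Yield only the entries that contain at least one such feature key."""
    for record in iter_embl_records(handle):
        if feature_key in record["features"]:
            yield record
```

With this, records_with_feature(open('wgs_baad_pro.dat'), 'CDS') would
yield only the entries annotated with a CDS feature; the key to filter on
depends on how the data set in question names its features.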
It seems there is also a parser for the EMBL format outside Biopython:
http://www.embl-heidelberg.de/~chenna/PySAT/
Has anybody used that?
Thanks for your help,
martin
--
Dr. Martin Mokrejs
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs