[BioPython] embl

Brad Chapman chapmanb at uga.edu
Wed Mar 17 20:18:53 EST 2004


Hi Antonio;

> I'm new (and quite confused) to biopython.
> I have a simple question (maybe it looks silly):
> how do I parse an embl data file using biopython?
> Is there any way to retrieve the sequence information (The CDS section)?
> What about the position of the CDS sections (they are split in sub pieces)?

EMBL support is still lacking in Biopython. Currently we do have the
basis for developing an EMBL parser -- there is a Martel (the
underlying parsing system in Biopython) grammar for EMBL. This is
located in Bio/expressions/embl/embl65.py.

We still need someone to help do the work of building this grammar
into a "Biopython-style" parser.

As a workaround, the GenBank parser in Biopython is quite functional
and widely used -- so you could fetch your sequences in GenBank
format and parse out the features from there, as described in the
documentation:

http://biopython.org/docs/tutorial/Tutorial004.html#toc13
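
On the split CDS positions: EMBL and GenBank files share the same
feature table syntax, so a CDS in several pieces shows up in either
format as a location string such as join(12..78,134..202). Below is a
minimal sketch of pulling the sub-piece coordinates out of such a
string. It uses only the Python standard library and is not Biopython's
own parser; the function name parse_location and the sample strings are
made up for illustration. It assumes complement(...) only wraps the
whole location and raises ValueError on anything more elaborate.

```python
import re


def parse_location(location):
    """Return (strand, [(start, end), ...]) for a feature-table location.

    Hypothetical helper for illustration; coordinates are kept 1-based
    and inclusive, exactly as written in the file.
    """
    text = location.replace(" ", "")
    strand = 1

    # complement(...) around the whole location means the reverse strand.
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]

    # join(...) lists the sub-pieces of a split feature.
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]

    pieces = []
    for part in text.split(","):
        # Accept "12..78", a single base "467", and partial markers "<" / ">".
        match = re.fullmatch(r"[<>]?(\d+)(?:\.\.[<>]?(\d+))?", part)
        if match is None:
            raise ValueError("unsupported location piece: %r" % part)
        start = int(match.group(1))
        end = int(match.group(2) or start)
        pieces.append((start, end))
    return strand, pieces


if __name__ == "__main__":
    print(parse_location("join(12..78,134..202)"))
    print(parse_location("complement(join(5..10,20..30))"))
```

Each (start, end) pair is one sub-piece of the CDS, in the order given
in the file, so the pieces can be sliced out of the sequence and
concatenated (and reverse-complemented when strand is -1).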

Hope this helps!
Brad

