[Biopython-dev] Reading sequences: FormatIO, SeqIO, etc

Albert Krewinkel krewink at inb.uni-luebeck.de
Thu Aug 17 07:25:34 UTC 2006

Peter wrote:
> Oh - you meant just adding EMBL feature iteration.  I was thinking 
> about the larger task of full EMBL file reading.

I started working on that, but I'm not very far yet.

> Doing just the features is very easy, here you go:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2059#c2

Wow, that was quick. And it works almost perfectly. One exception:
In _parse_embl_or_genbank_feature(), when parsing the location, it
should say something like

from string import digits
# Keep reading while the location does not yet end in ')' or a digit.
while feature_location[-1] not in ')' + digits:
    line = iterator.next()
    feature_location += line[FEATURE_QUALIFIER_INDENT:].strip()

This way, features may have multiline join(...) positions.
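
As a rough, self-contained illustration of the idea (not the actual patch from the bug report), the loop can be wrapped in a small function and tried on a wrapped join(...) location. The function name read_feature_location, the sample lines, and the value 21 for FEATURE_QUALIFIER_INDENT are assumptions made for this sketch only.

```python
from string import digits

# Assumed for this sketch: the column at which location/qualifier text
# starts in a feature table line.
FEATURE_QUALIFIER_INDENT = 21


def read_feature_location(first_line, iterator):
    """Hypothetical helper: join a location that wraps over several lines."""
    feature_location = first_line[FEATURE_QUALIFIER_INDENT:].strip()
    # A finished location ends in ')' or a digit; a wrapped one usually
    # ends in ','. Note that ')' + digits is a single string, so the
    # "in" test compares against individual characters.
    while feature_location[-1] not in ")" + digits:
        line = next(iterator)
        feature_location += line[FEATURE_QUALIFIER_INDENT:].strip()
    return feature_location


# Made-up EMBL-style feature lines, padded to the assumed indent.
first = "FT   CDS".ljust(FEATURE_QUALIFIER_INDENT) + "join(12..345,"
rest = iter([
    "FT".ljust(FEATURE_QUALIFIER_INDENT) + "456..789,",
    "FT".ljust(FEATURE_QUALIFIER_INDENT) + "1000..1200)",
])

location = read_feature_location(first, rest)
print(location)  # join(12..345,456..789,1000..1200)
```

The check is only a heuristic: it relies on wrapped locations breaking after a comma, and it would fail on an empty location string.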

> Any more feedback is very welcome.  Are you using the iterators 
> directly, or via the helper function File2SequenceIterator?

I'm using the iterators directly, out of old habit.  But most likely I
will finally get addicted to your nice helper function.

> Are you using just the sequence iterators, or the dictionary and list 
> versions too?

I haven't used those yet.


Albert Krewinkel <krewink at inb.uni-luebeck.de>
University of Luebeck, Institute for Neuro- and Bioinformatics
