[Biopython-dev] sequence format readers ?

Brad Chapman chapmanb at arches.uga.edu
Wed Sep 12 04:50:15 EDT 2001


Hi Thomas;

> Brad, I made some changes to our initial SeqRecord and FastaReader/Writer
> classes in order to use it for inheritance.

Cool! Thanks for working on this. With regard to SeqRecord, adding
__str__ stuff for debugging is great. Abstracting out the
common stuff in Reader/Writer is definitely a plus. I have to admit
to not having looked at or used the SeqIO stuff much, mostly because
I always figured it was a work-in-progress.

One thing that comes to mind is you might want to support the
iterator stuff coming in Python 2.2:

http://www.amk.ca/python/2.2/index.html#SECTION000300000000000000000

Seems like all we need to do is add __iter__ that returns the object
itself and we'll be all set (and it should be backward compatible and
all of that).
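
As a rough sketch of that idea (not the actual SeqIO code): the class
name FastaIterator and the (title, sequence) return value below are
made up for illustration. It is written for current Python, where the
method is spelled __next__; Python 2.2 called it next, so an alias is
kept.

```python
import io


class FastaIterator:
    """Hypothetical reader: returns (title, sequence) pairs from a FASTA handle."""

    def __init__(self, handle):
        self._handle = handle
        self._pending = None  # header line read ahead while scanning a record

    def __iter__(self):
        # The whole trick: an iterator simply returns itself.
        return self

    def __next__(self):
        line = self._pending or self._handle.readline()
        self._pending = None
        # Skip anything that comes before the next ">" header line.
        while line and not line.startswith(">"):
            line = self._handle.readline()
        if not line:
            raise StopIteration
        title = line[1:].strip()
        chunks = []
        while True:
            line = self._handle.readline()
            if not line:
                break
            if line.startswith(">"):
                # Keep the next record's header for the following call.
                self._pending = line
                break
            chunks.append(line.strip())
        return title, "".join(chunks)

    next = __next__  # Python 2.2 spelling of the same method


handle = io.StringIO(">a desc\nACGT\nTT\n>b\nGG\n")
records = list(FastaIterator(handle))
```

With __iter__ in place the reader drops straight into a for loop, and
code that calls next() by hand should keep working.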
          
> Before I start defining rules for the other formats we should 
> brainstorm over possible drawbacks/pitfalls of the current 
> implementation (e.g. alignments).

Hmm, I guess I just figured we would run into pitfalls after it was
already coded :-). Seriously, I'm pretty happy with the SeqRecord +
SeqFeature classes (with a few mistakes I made which I'll write
about in a separate thread in a second), so it might be best to go
forward and see how they handle what we need. Everything does a
decent job of supporting the BioCorba spec, which is a good sign (to
me!) that they can handle "most common cases."

In terms of alignments, I think these will end up being more "high
level" than SeqRecords. For instance, in the Generic alignment stuff
I coded up, an Alignment is basically a collection of SeqRecords. So
the conversions here will be a little different, I guess:

A File of FASTA records (lots of SeqRecords) --> one Alignment
one Alignment --> a bunch of FASTA records
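
To make the two directions concrete, here is a minimal sketch. Record
and Alignment are stand-ins invented for this example, not the real
SeqRecord or Generic alignment classes, and the equal-length check is
my own assumption about what an alignment should enforce.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Stand-in for a SeqRecord: just an id and a sequence string."""
    id: str
    seq: str


@dataclass
class Alignment:
    """Stand-in for an alignment: an ordered collection of records."""
    records: list = field(default_factory=list)


def records_to_alignment(records):
    """Lots of records --> one Alignment."""
    records = list(records)
    # Assumption: aligned (gapped) sequences all share one length.
    if len({len(r.seq) for r in records}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return Alignment(records)


def alignment_to_fasta(alignment):
    """One Alignment --> a bunch of FASTA records, as a single string."""
    return "".join(">%s\n%s\n" % (r.id, r.seq) for r in alignment.records)


aln = records_to_alignment([Record("a", "AC-T"), Record("b", "ACGT")])
fasta_text = alignment_to_fasta(aln)
```

The point is only that the alignment-level conversion wraps the
record-level one rather than replacing it.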

Other than this, I think you're on target (at least with my
understanding of how conversions will work). If you can coerce
Andrew into commenting, he might have some opinions about how the
SeqIO stuff should work, since he wrote it.

May-the-force-be-with-you-on-sequence-conversions-ly yr's,
Brad


