[Biopython-dev] New Bio.SeqIO code

Michiel de Hoon mdehoon at c2b2.columbia.edu
Sat Oct 28 05:56:51 UTC 2006


Thanks, Peter!
It looks very nice. Actually, I have been using an earlier version of 
the new SeqIO module (from your code on Bugzilla) and found it to work 
quite well. A few short comments:

Parsing a Fasta file with the new SeqIO looks like this:

from Bio.SeqIO import File2SequenceIterator
for record in File2SequenceIterator("example.fasta") :
     print record.id
     print record.seq

I would rather have something like this:

from Bio.SeqIO import Fasta
for record in Fasta.parse(open("example.fasta")):
     print record.id
     print record.seq

where Fasta.parse returns a FastaIterator object and accepts either a 
file object or a file name. In addition, you could have a function 
Bio.SeqIO.parse that guesses the file type from the file name extension 
(as File2SequenceIterator does now), though that wouldn't work for file 
handles.
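
A rough sketch of how those two pieces could fit together, written in 
present-day Python without Biopython. Everything here is an assumption 
made for illustration: the names parse_fasta, guess_format and 
EXTENSION_TO_FORMAT are hypothetical, the extension table is made up, 
and records are plain (id, sequence) tuples rather than SeqRecord 
objects.

```python
import io
import os

# Hypothetical extension-to-format table; not taken from Bio.SeqIO.
EXTENSION_TO_FORMAT = {".fasta": "fasta", ".fa": "fasta", ".gbk": "genbank"}


def guess_format(filename):
    """Guess a format name from the file name extension, or return None."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_FORMAT.get(extension)


def parse_fasta(source):
    """Yield (id, sequence) pairs from a file name or an open handle."""
    if isinstance(source, str):
        # A file name was given: open it here and close it when exhausted.
        with open(source) as handle:
            for record in parse_fasta(handle):
                yield record
        return
    identifier, chunks = None, []
    for line in source:
        line = line.strip()
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # The id is the first word after ">", or "" if the title is empty.
            fields = line[1:].split(None, 1)
            identifier, chunks = (fields[0] if fields else ""), []
        elif line and identifier is not None:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


# Usage with a handle; an in-memory handle stands in for an open file.
example = io.StringIO(">seq1 first example\nACGT\nTTGA\n>seq2\nGGCC\n")
parsed = list(parse_fasta(example))
```

Guessing the format only makes sense when a file name is available, 
which is why guess_format takes a name and is kept separate from the 
parser itself.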

On a related note, I don't think we need the SequenceList and 
SequenceDict classes. To make a list, one can do

from Bio.SeqIO import Fasta
records = [record for record in Fasta.parse(open("example.fasta"))]

Converting an iterator to a dictionary takes one more line, and is 
probably more straightforward than SequenceDict.
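
For instance, something along these lines. This is only a sketch: the 
Record tuple and iter_records are hypothetical stand-ins for SeqRecord 
and Fasta.parse, so that the example runs on its own, and keying on 
record.id is an assumption about what one would want.

```python
from collections import namedtuple

# Stand-in for SeqRecord, only so the example runs without Biopython.
Record = namedtuple("Record", ["id", "seq"])


def iter_records():
    """Stand-in for an iterator such as the one Fasta.parse would return."""
    yield Record("alpha", "ACGT")
    yield Record("beta", "GGCC")


# Build the dictionary with a plain loop, keyed on each record's id.
record_dict = {}
for record in iter_records():
    record_dict[record.id] = record
```

Note that a later record silently replaces an earlier one with the same 
id; checking for duplicates would take an extra test inside the loop.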

--Michiel.



Peter (BioPython Dev) wrote:
> Hello list,
> 
> I've checked in a somewhat cleaned up (and more tested) version of the
> earlier attachments to bug 2059.
> 
> And I've updated the wiki page:
> http://biopython.org/wiki/SeqIO
> 
> Has anyone got any tips on formatting Python code on the wiki?  Maybe I
> should just write the docs in LaTeX like the cookbook etc.
> 
> Can I check in bug 2057 too?  Given the SeqIO system produces SeqRecord
> objects, it would be a good idea to make them slightly more user-friendly:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2057
> 
> (I would like to check this in before writing too much of the SeqIO
> documentation)
> 
> If any of you want to check this out and have a look, I'd be pleased to
> get some feedback.



