[BioPython] Should Bio.SeqIO.write(...) return the number of records?

Peter biopython at maubp.freeserve.co.uk
Tue Oct 28 17:17:36 UTC 2008


Dear all,

I wanted to get some feedback on a possible enhancement to the
Bio.SeqIO.write(...) and Bio.AlignIO.write(...) functions to make them
return the number of records/alignments written to the handle.  I've filed
enhancement Bug 2628 to track this idea.
http://bugzilla.open-bio.org/show_bug.cgi?id=2628

When creating a sequence (or alignment) file, it is sometimes useful
to know how many records (or alignments) were written out.  This is
easy if your records are in a list:

records = list(...)
SeqIO.write(records, handle, format)
print "Wrote %i records" % len(records)

If, however, your records come from a generator or other iterator
(e.g. a generator expression), you cannot use len(records).  You could
turn them into a list just to count them, but that holds every record
in memory at once.  It would therefore be useful to have the count
returned:

records = some_generator
count = SeqIO.write(records, handle, format)
print "Wrote %i records" % count
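
As a rough illustration of the proposed behaviour (not the actual
Bio.SeqIO code), here is a minimal sketch using only the standard
library.  The function write_fasta and the (identifier, sequence)
tuples are hypothetical stand-ins for the real writer and for
SeqRecord objects.  The point is only that the count falls out of the
same loop that writes the records, so the generator never has to be
turned into a list:

```python
from io import StringIO


def write_fasta(records, handle):
    """Hypothetical stand-in for a SeqIO-style writer.

    Writes (identifier, sequence) tuples in a FASTA-like layout and
    returns how many records were written.
    """
    count = 0
    for identifier, sequence in records:
        handle.write(">%s\n%s\n" % (identifier, sequence))
        count += 1
    return count


# A generator expression, so len(records) is not available.
records = (("seq%i" % i, "ACGT" * i) for i in range(1, 4))
handle = StringIO()
count = write_fasta(records, handle)
print("Wrote %i records" % count)
```

Each record is written and then discarded, so memory use does not
grow with the number of records.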

Currently Bio.SeqIO.write(...) and Bio.AlignIO.write(...) have no
return value, so adding one would be a backwards-compatible
enhancement.  As a precedent, the BioSQL loader returns the number of
records loaded into the database.
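
For comparison, a count can already be obtained without building a
list by wrapping the iterator in a small counting generator, although
this is clumsier than a return value.  This is only a sketch: counted
and write_without_count are made-up names, and the latter simply
stands in for a writer that returns nothing:

```python
from io import StringIO


def counted(iterable, tally):
    """Yield items unchanged, adding one to tally[0] for each item."""
    for item in iterable:
        tally[0] += 1
        yield item


def write_without_count(records, handle):
    """Hypothetical writer which has no return value (returns None)."""
    for identifier, sequence in records:
        handle.write(">%s\n%s\n" % (identifier, sequence))


tally = [0]  # a one-element list, so the generator can update it in place
records = (("seq%i" % i, "ACGT") for i in range(5))
handle = StringIO()
result = write_without_count(counted(records, tally), handle)
print("Wrote %i records" % tally[0])
```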

Peter


