[Biopython-dev] Giving the SeqRecord a length? Evaluating it as a boolean
Peter
biopython at maubp.freeserve.co.uk
Tue Jun 10 11:37:42 UTC 2008
Something we've discussed before is making the SeqRecord more like a
Seq object, perhaps even subclassing it. I've got a patch on Bug 2507
to make some small steps in this direction - accessing elements of the
sequence by indexing the SeqRecord, i.e. letter = my_seq_record[5], or
iterating over the letters in a SeqRecord's sequence.
http://bugzilla.open-bio.org/show_bug.cgi?id=2507
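To illustrate the kind of behaviour meant here, below is a minimal sketch using a toy class. ToySeqRecord is a hypothetical stand-in, not the real SeqRecord and not the patch attached to Bug 2507; it only assumes that indexing and iteration are delegated to the .seq attribute.

```python
class ToySeqRecord(object):
    """Hypothetical stand-in for SeqRecord; only .seq and .id are modelled."""

    def __init__(self, seq, id="<unknown id>"):
        self.seq = seq
        self.id = id

    def __getitem__(self, index):
        # Delegate indexing to the underlying sequence.
        return self.seq[index]

    def __iter__(self):
        # Iterate over the letters of the underlying sequence.
        return iter(self.seq)


my_seq_record = ToySeqRecord("ACGTTGCA", id="example")
letter = my_seq_record[5]                    # same as my_seq_record.seq[5]
letters = [base for base in my_seq_record]   # same as iterating over .seq
```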
In addition, I would like to give the SeqRecord a length, allowing
len(my_seq_record) rather than len(my_seq_record.seq). However, this
has a side effect on the evaluation of a SeqRecord as a boolean.
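Currently any SeqRecord evaluates as True, but if we add the __len__
method then any SeqRecord with a zero-length sequence will evaluate as
False, because Python falls back on __len__ for truth testing when no
__nonzero__ method is defined.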
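The side effect can be reproduced with two toy classes. Both are hypothetical stand-ins rather than the real SeqRecord; the only assumption is that the proposed __len__ would return the length of the .seq attribute.

```python
class RecordWithoutLen(object):
    """Hypothetical record with no __len__: instances are always true."""

    def __init__(self, seq):
        self.seq = seq


class RecordWithLen(RecordWithoutLen):
    """Same record plus __len__: truth testing now follows the length."""

    def __len__(self):
        return len(self.seq)


empty_old = RecordWithoutLen("")
empty_new = RecordWithLen("")
full_new = RecordWithLen("ACGT")

print(bool(empty_old))  # True: default object truthiness
print(bool(empty_new))  # False: len() is 0
print(bool(full_new))   # True: len() is 4
```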
This is a real issue: for example, you can have GenBank files without a
sequence (see our unit test cases). It matters if you are using an
iterator via the .next() method and checking for a returned None with
"if record:" (something some of the older unit tests were doing). You
would have to start using "if record is not None:" instead.
If the old behaviour is desirable (a SeqRecord always evaluates as
True), we could implement a __nonzero__ method to preserve it, see:
http://docs.python.org/ref/customization.html
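Here is a rough sketch of both remedies, the explicit None check and the __nonzero__ method, using toy classes. LenRecord, AlwaysTrueRecord and ToyIterator are hypothetical names made up for this illustration, and the iterator is only assumed to return None when exhausted. Note that Python 3 calls the same hook __bool__, so the sketch defines both names.

```python
class LenRecord(object):
    """Hypothetical record with __len__ only, so an empty one is false."""

    def __init__(self, seq):
        self.seq = seq

    def __len__(self):
        return len(self.seq)


class AlwaysTrueRecord(LenRecord):
    """Same record, but truth testing no longer depends on the length."""

    def __nonzero__(self):  # consulted by Python 2
        return True

    __bool__ = __nonzero__  # Python 3 spells the same hook __bool__


class ToyIterator(object):
    """Hypothetical old-style iterator whose next() returns None when done."""

    def __init__(self, records):
        self._records = list(records)

    def next(self):
        if self._records:
            return self._records.pop(0)
        return None


def count_with_truth_test(iterator):
    count = 0
    record = iterator.next()
    while record:  # stops at None, but also at any false record
        count += 1
        record = iterator.next()
    return count


def count_with_none_test(iterator):
    count = 0
    record = iterator.next()
    while record is not None:  # stops only at the end marker
        count += 1
        record = iterator.next()
    return count


seqs = ["ACGT", "", "GG"]  # the middle record has no sequence
plain = [LenRecord(s) for s in seqs]
patched = [AlwaysTrueRecord(s) for s in seqs]

stopped_early = count_with_truth_test(ToyIterator(plain))   # 1
explicit_check = count_with_none_test(ToyIterator(plain))   # 3
preserved = count_with_truth_test(ToyIterator(patched))     # 3
```

With __len__ alone, the "if record:" style loop stops at the empty record. Either the explicit None check or the __nonzero__ method lets it see all three records.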
What do people think? Would adding a __len__ method to the SeqRecord
cause trouble?
Peter