[Biopython-dev] SeqRecord comparison suggestion

Peter Cock p.j.a.cock at googlemail.com
Tue Nov 3 17:20:31 UTC 2015


On Thu, Oct 29, 2015 at 10:32 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Resurrecting an old thread,
>
> On Wed, Feb 5, 2014 at 6:15 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> On Tue, Feb 4, 2014 at 12:57 PM, Iddo Friedberg <idoerg at gmail.com> wrote:
>>
>>> Thanks!
>>>
>>> My initial thoughts are that SeqRecord instances should not have an __eq__
>>> operator. The equality operation here is somewhat meaningless when you
>>> consider the number of parameters that can constitute a SeqRecord,
>>> especially when dealing with a genomic record or a contig. This can lead
>>> to unexpected behaviour.
>>
>> Indeed, which is one reason why we never defined __eq__ etc. for the
>> SeqRecord (how equal is equal? Same ID? Same sequence? Same
>> annotations?).
>>
>> Therefore the SeqRecord gets the default Python object equality, which
>> only asks whether they are the same object in memory.
>>
>> Peter
>
> Since this discussion we have moved the Seq objects over to doing
> string equality (with the alphabets only used to try to issue a warning).
>
> Now, on to the SeqRecord. This now has a GitHub issue, #559:
>
> https://github.com/biopython/biopython/issues/559
>
> We seem to have a consensus that there is no nice way to define
> equality, and that the default id(...) based object equality is unhelpful.
> This PR makes trying to compare a SeqRecord give an exception:
>
> https://github.com/biopython/biopython/pull/560
>
> It still needs some unit tests, but with Biopython 1.66 out, I think
> we should make this change.
>
> Any comments? Anything specific to the proposed implementation
> can be done as a GitHub comment.
>
> Peter

Changes applied - thanks David & Lenna for the code contributions,
and everyone else who joined in the debate.
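
For anyone skimming the thread, here is a rough sketch of the behaviour
change. It does not import Biopython: Record, StrictRecord and
same_id_and_seq are hypothetical stand-ins made up for illustration, and
the exception type and message are assumptions, so check the merged code
for the real details.

```python
class Record:
    """Hypothetical stand-in for a SeqRecord-like object (not Biopython code)."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq


class StrictRecord(Record):
    """Same stand-in, but == is refused instead of falling back to identity."""

    def __eq__(self, other):
        # Assumed exception type and wording, for illustration only.
        raise NotImplementedError(
            "Record comparison is deliberately not implemented; "
            "compare the attributes of interest explicitly."
        )


def same_id_and_seq(a, b):
    """One possible explicit comparison: the caller picks what 'equal' means."""
    return a.id == b.id and a.seq == b.seq


# Default object equality: two records with identical content are still
# "different", because only identity in memory is checked.
a = Record("example", "ACGT")
b = Record("example", "ACGT")
default_result = (a == b)

# With comparison refused, the same test fails loudly instead.
sa = StrictRecord("example", "ACGT")
sb = StrictRecord("example", "ACGT")
try:
    sa == sb
    raised = False
except NotImplementedError:
    raised = True

# Comparing chosen attributes explicitly still works as expected.
explicit_result = same_id_and_seq(sa, sb)
```

The point of raising is that code which silently relied on identity
comparison gets an error, and the author then has to say which attributes
matter.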

Peter

