[Biopython-dev] Changing Seq equality

Fri Mar 12 13:32:32 UTC 2010

Hi all,

I'd like to proceed as outlined below for Biopython 1.54,
i.e. don't change the current Seq equality but add a warning
that we plan to change it.
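
As a rough illustration of that interim step, here is a minimal sketch. It assumes a hypothetical stand-in class (WarnOnlySeq, not the real Bio.Seq.Seq), and the warning category and message wording are my own placeholders rather than anything agreed:

```python
import warnings


class WarnOnlySeq:
    """Hypothetical stand-in for the Seq class (not the Biopython code)."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data

    def __eq__(self, other):
        # Warn about the planned change, then keep the historic
        # object-identity comparison so existing code behaves the same.
        warnings.warn(
            "Seq equality currently compares object identity but is "
            "planned to change to string comparison; please use "
            "id(seq1)==id(seq2) or str(seq1)==str(seq2) explicitly.",
            FutureWarning,  # assumed category, for illustration only
            stacklevel=2,
        )
        return id(self) == id(other)

    # Defining __eq__ clears the default hash in Python 3, so restore
    # the identity-based hash to match the identity-based equality.
    __hash__ = object.__hash__
```

With this sketch, two separate objects holding "ACGT" still compare as unequal, and each comparison emits the warning; comparing str(seq1)==str(seq2) bypasses it.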

Should we have a discussion on the main list first?

Peter

On Mon, Feb 22, 2010 at 2:48 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> Hi all,
>
> I've just got back from Japan - Brad and I were fortunate to be
> able to attend the DBCLS BioHackathon 2010 held in Tokyo,
> http://hackathon3.dbcls.jp/
>
> As Brad already mentioned in passing, we also managed to have
> dinner one evening with Michiel, and had an informal chat about
> Biopython plans. Expect a few more emails on other topics to
> follow.
>
> One of the short term aims we agreed on was to press ahead
> with the Seq equality changes outlined on this thread late last
> year. Mailing list archive link:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-November/007021.html
>
> To recap, the agreed best behaviour was to make Seq equality
> act like string equality, but to raise a Python warning when
> incompatible alphabets are compared (e.g. DNA to Protein).
> This also applies to all the other comparison operators:
> not equal, less than, greater than, less than or equal, and
> greater than or equal.
>
> This is my outline plan for the change:
>
> For Biopython up to 1.53, the Seq class uses object equality,
> i.e. seq1==seq2 acts as id(seq1)==id(seq2).
>
> For Biopython 1.54 (and perhaps a few more releases),
> the Seq classes will still use object equality but will trigger
> a warning suggesting explicit use of id(seq1)==id(seq2)
> or str(seq1)==str(seq2) as appropriate.
>
> For Biopython 1.xx (maybe 1.55 or 1.56?) the Seq classes
> will switch to using string equality (with an alphabet aware
> warning for comparing DNA to RNA etc), but will also trigger
> a warning that this is a change from previous releases, and
> suggest in the short term the continued explicit use of either
> id(seq1)==id(seq2) for object identity or str(seq1)==str(seq2)
> for string equality.
>
> For Biopython 1.yy (maybe 1.57?) the Seq classes will
> use string equality (with an alphabet aware warning for
> comparing DNA to RNA etc), without any warning about
> this being a change from historic behaviour.
>
> These warning messages could also point at a wiki page,
> and we'd need a FAQ entry in the tutorial as well. The
> aim of this slightly drawn out switch is to try and make
> sure all users are aware of the change, even if they
> only update their copy of Biopython every few releases.
>
> Does that all sound sensible? If so, we should probably
> have an announcement on the main mailing list, in case
> there are any other views.
>
> Other more complex options include a flag for switching
> between the modes - but that complexity doesn't seem
> such a good idea to me. All my own code and most of
> the unit tests use str(seq1)==str(seq2) explicitly anyway.
> The only exception is some of the genetic algorithm unit
> tests which do seem to want explicit object identity.
>
> Regards,
>
> Peter
>
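
For reference, the end state described in the quoted plan (string equality plus an alphabet-aware warning) might look roughly like the sketch below. Everything here is an assumption for illustration: StringEqSeq is a hypothetical class, the alphabet is reduced to a plain string label instead of a real Alphabet object, and the warning text is a placeholder. Only == and < are shown; the other comparison operators would follow the same pattern.

```python
import warnings


class StringEqSeq:
    """Hypothetical stand-in for the Seq class (not the Biopython code)."""

    def __init__(self, data, alphabet):
        self._data = data
        self.alphabet = alphabet  # plain label such as "DNA" (assumption)

    def __str__(self):
        return self._data

    def _warn_if_incompatible(self, other):
        # Only warn when the other object also carries an alphabet label
        # and it differs from ours; plain strings never trigger a warning.
        other_alphabet = getattr(other, "alphabet", None)
        if other_alphabet is not None and other_alphabet != self.alphabet:
            warnings.warn(
                "Comparing sequences with incompatible alphabets: "
                "%s vs %s" % (self.alphabet, other_alphabet),
                stacklevel=3,
            )

    def __eq__(self, other):
        self._warn_if_incompatible(other)
        return str(self) == str(other)

    def __lt__(self, other):
        self._warn_if_incompatible(other)
        return str(self) < str(other)

    def __hash__(self):
        # Keep hashing consistent with string-based equality.
        return hash(str(self))
```

In this sketch a DNA "ACGT" and a protein "ACGT" compare as equal but raise a warning, while two DNA sequences with the same letters compare as equal silently.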