[Biopython] Cannot make SeqFeature() comparable?

Chevreux, Bastien bastien.chevreux at dsm.com
Tue Jan 31 19:43:09 UTC 2017


Which leaves me no option but to explain the whole sorting logic, not just the subset I used to keep my basic problem easily solvable on this list :-)

For GenBank output, I want/need the usual “interleaved” sorting of features, i.e.,

1. First by start position

2. On equal start, sort by type (first “gene”, then “regulatory”, then “mRNA”, “CDS”, etc.)

3. Maybe on equal start and type, sort by feature attributes (locus_tag, name, etc.)

(Maybe 2 and 3 need to be swapped in the sorting logic, but that question is for another day.)

I had considered using key and attrgetter, but these are not flexible enough for the above I think, are they?

Currently I do not see a way other than temporary monkey patching for this but would be happy to hear about one.
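
A key function is not limited to a single attribute: it can return a tuple, and Python compares tuples element by element, which gives this kind of multi-level ordering. The following is only a minimal sketch. It assumes stand-in objects instead of real SeqFeature instances, and TYPE_RANK, make_feature and feature_key are hypothetical names; the rank order is simply the one listed above.

```python
from types import SimpleNamespace

# Hypothetical priority table: lower rank sorts first on equal start.
TYPE_RANK = {"gene": 0, "regulatory": 1, "mRNA": 2, "CDS": 3}


def make_feature(start, ftype, locus_tag):
    """Build a stand-in exposing location.start, type and qualifiers.

    This is NOT a real Biopython SeqFeature, only an object with the
    same attribute names, so the sketch runs without Biopython.
    """
    return SimpleNamespace(
        location=SimpleNamespace(start=start),
        type=ftype,
        qualifiers={"locus_tag": [locus_tag]},
    )


def feature_key(feature):
    """Sort key: start position, then type rank, then locus_tag."""
    return (
        int(feature.location.start),
        TYPE_RANK.get(feature.type, len(TYPE_RANK)),  # unknown types go last
        feature.qualifiers.get("locus_tag", [""])[0],
    )


features = [
    make_feature(100, "CDS", "b0002"),
    make_feature(10, "mRNA", "b0001"),
    make_feature(10, "gene", "b0001"),
    make_feature(100, "gene", "b0002"),
]
features.sort(key=feature_key)
ordered = [(f.location.start, f.type) for f in features]
print(ordered)
```

Swapping criteria 2 and 3 would then only mean reordering the elements of the returned tuple.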

Best,
  Bastien

--
DSM Nutritional Products Microbia Inc | Bioinformatics
60 Westview Street | Lexington, MA 02421 | United States
Phone +1 781 259 7613 | Fax +1 781 259 0615

From: lenna.peterson at gmail.com [mailto:lenna.peterson at gmail.com] On Behalf Of Lenna Peterson
Sent: Tuesday, January 31, 2017 2:13 PM
To: Chevreux, Bastien <bastien.chevreux at dsm.com>
Cc: Joshua Klein <mobiusklein at gmail.com>; Peter Cock <p.j.a.cock at googlemail.com>; biopython at biopython.org
Subject: Re: [Biopython] Cannot make SeqFeature() comparable?

re: Joshua's post about the key argument, here is an example of sorting SeqFeatures by location start without (potentially error-prone) monkey patching:

import operator
sorted_features = sorted([f1, f2], key=operator.attrgetter("location.start"))

https://docs.python.org/2/library/operator.html#operator.attrgetter
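
attrgetter accepts dotted names, so "location.start" is resolved as f.location.start, and passing several names makes it return a tuple. A self-contained sketch of both forms follows; the stand_in helper and the objects f1, f2, f3 are assumptions for illustration and are not real SeqFeature instances.

```python
import operator
from types import SimpleNamespace


def stand_in(start, ftype):
    """Hypothetical stand-in with the attribute names used in this thread."""
    return SimpleNamespace(location=SimpleNamespace(start=start), type=ftype)


f1 = stand_in(1000, "CDS")
f2 = stand_in(10, "gene")
f3 = stand_in(10, "CDS")

# Single dotted name: sort by location.start only.
by_start = sorted([f1, f2, f3], key=operator.attrgetter("location.start"))
starts = [f.location.start for f in by_start]

# Several names: the key becomes a tuple (start, type). Note that the type
# is compared alphabetically here, not in GenBank feature order.
by_start_type = sorted(
    [f1, f2, f3], key=operator.attrgetter("location.start", "type")
)
pairs = [(f.location.start, f.type) for f in by_start_type]
print(starts, pairs)
```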

Cheers,

Lenna

On Tue, Jan 31, 2017 at 9:31 AM, Chevreux, Bastien <bastien.chevreux at dsm.com> wrote:
> From: Joshua Klein [mailto:mobiusklein at gmail.com]
> […] When assigning to the class itself, not the module, the new
> comparator function is called

Yay, that worked, learning something new every day. Thanks a million.

Peter: the ultimate goal of that request was to be able to call sort() on features, with sometimes different and very custom sort criteria. Nothing that would really fit into Biopython itself.
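
When a criterion is easier to write as a pairwise comparison than as a key, functools.cmp_to_key can wrap a comparison function for sort(), so a different ordering can be chosen per call without touching the class. This is a sketch under assumptions: the objects are stand-ins rather than real SeqFeature instances, and compare_by_start and compare_by_length are hypothetical helpers.

```python
from functools import cmp_to_key
from types import SimpleNamespace


def stand_in(start, end):
    """Hypothetical stand-in exposing location.start and location.end."""
    return SimpleNamespace(location=SimpleNamespace(start=start, end=end))


def compare_by_start(a, b):
    """Return a negative, zero or positive number, like a classic cmp()."""
    return int(a.location.start) - int(b.location.start)


def compare_by_length(a, b):
    """Order by feature length (end - start), shortest first."""
    len_a = a.location.end - a.location.start
    len_b = b.location.end - b.location.start
    return len_a - len_b


features = [stand_in(1000, 1200), stand_in(10, 900), stand_in(500, 520)]

# Same list, two different orderings, selected at the call site.
features.sort(key=cmp_to_key(compare_by_start))
starts = [f.location.start for f in features]

features.sort(key=cmp_to_key(compare_by_length))
lengths = [f.location.end - f.location.start for f in features]
print(starts, lengths)
```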

Best,
  Bastien


From: Joshua Klein [mailto:mobiusklein at gmail.com]
Sent: Tuesday, January 31, 2017 7:48 AM
To: Peter Cock <p.j.a.cock at googlemail.com>
Cc: Chevreux, Bastien <bastien.chevreux at dsm.com>; biopython at biopython.org
Subject: Re: [Biopython] Cannot make SeqFeature() comparable?


The reason the original code snippet doesn’t seem to be working as expected is that the cmp1 function is assigned to the __lt__ attribute of the SeqFeature module, not the SeqFeature class, which is located at SeqFeature.SeqFeature. When assigning to the class itself, not the module, the new comparator function is called.
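
The module-versus-class difference can be reproduced without Biopython. In this sketch a throwaway module object stands in for the SeqFeature module and a small class for SeqFeature.SeqFeature; fake_module, Feature and by_start are made-up names used only for illustration.

```python
import types


class Feature:
    def __init__(self, start):
        self.start = start


def by_start(this, other):
    return this.start < other.start


# Stand-in for the module: a namespace that happens to contain the class.
fake_module = types.ModuleType("fake_module")
fake_module.Feature = Feature

f1, f2 = fake_module.Feature(10), fake_module.Feature(1000)

# Patching the module only adds an unrelated module-level attribute.
fake_module.__lt__ = by_start
try:
    f1 < f2
    module_patch_works = True
except TypeError:
    module_patch_works = False

# Patching the class is what the < operator actually consults.
fake_module.Feature.__lt__ = by_start
class_patch_works = f1 < f2
print(module_patch_works, class_patch_works)
```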

This sort of patching works differently for old-style and new-style classes because of how special methods are looked up: old-style classes look up special methods on the instance, while new-style classes look them up on the instance’s class.
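
In Python 3 every class is new-style, so attaching __lt__ to a single object has no effect on the < operator. A small sketch with a made-up Point class (Point and by_x are hypothetical names):

```python
class Point:
    def __init__(self, x):
        self.x = x


def by_x(this, other):
    return this.x < other.x


p1, p2 = Point(1), Point(2)

# Attribute on the instance: it is stored, but the < operator ignores it.
p1.__lt__ = lambda other: True
try:
    p1 < p2
    instance_patch_works = True
except TypeError:
    instance_patch_works = False

# Attribute on the class: found by the operator's special-method lookup.
Point.__lt__ = by_x
class_patch_works = p1 < p2
print(instance_patch_works, class_patch_works)
```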

On Tue, Jan 31, 2017 at 4:07 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
Hi Bastien,

I'm not immediately sure if "monkey patching" the class
methods at run time like that would work in principle.
If you insert a print into it, it does not seem to be invoked.

It might be worth trying a modified Biopython, or an
explicit subclass to narrow down where this breaks.

Or more simply, can you just do the start position
comparison explicitly if that's what you want to use?

f1.location.start < f2.location.start

Peter


On Mon, Jan 30, 2017 at 11:05 PM, Chevreux, Bastien
<bastien.chevreux at dsm.com> wrote:
> Hi there,
>
>
>
> I have a problem making the SeqFeature() class comparable by providing a
> __lt__ function. Consider the following:
>
>
>
> ------------------------------------------------------------------
> #!/usr/bin/env python3
>
> from Bio import SeqFeature
>
> def cmp1(this,other):
>     return int(this.location.start) < int(other.location.start);
>
> SeqFeature.__lt__=cmp1;
> f1 = SeqFeature.SeqFeature(SeqFeature.FeatureLocation(10, 200));
> f2 = SeqFeature.SeqFeature(SeqFeature.FeatureLocation(1000, 1200));
>
> if f1<f2:
>     print("f1<f2");
> else:
>     print("nope, f1>=f2");
> ------------------------------------------------------------------
>
>
>
> The code above fails with this error message:
>
>     if f1<f2:
> TypeError: unorderable types: SeqFeature() < SeqFeature()
>
>
>
> What I do not understand is that this should be the canonical recipe for
> making any class comparable via LT operator. Compare to the following code
> which runs without problems:
>
>
>
> ------------------------------------------------------------------
> #!/usr/bin/env python3
>
> class myclass():
>     def __init__(self, value):
>         self.bla=value;
>
> def cmp2(this,other):
>     return this.bla < other.bla;
>
> myclass.__lt__=cmp2;
> m1=myclass(1);
> m2=myclass(2);
>
> if m1<m2:
>     print("m1<m2");
> else:
>     print("nope, m1>=m2");
> ------------------------------------------------------------------
>
>
>
> What am I missing?
>
>
>
> Best,
>
>   Bastien
>
>
>
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython
