[Biopython-dev] SeqFeature start/end and making positions act like ints
Eric Talevich
eric.talevich at gmail.com
Sat Sep 17 09:44:21 EDT 2011
On Fri, Sep 16, 2011 at 7:01 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Fri, Sep 16, 2011 at 9:33 PM, Eric Talevich <eric.talevich at gmail.com>
> wrote:
> > On Fri, Sep 16, 2011 at 12:31 PM, Peter Cock <p.j.a.cock at googlemail.com>
> > wrote:
> >>
> >> Hi all,
> >>
> >> We've previously discussed adding start/end properties
> >> to the SeqFeature returning integers - which would be
> >> useful but inconsistent with the FeatureLocation which
> >> returns Position objects:
> >>
> >> https://redmine.open-bio.org/issues/2818
> >>
> >> After an interesting discussion with Leighton, I spent
> >> the afternoon making (most of the) Position objects
> >> subclass int - so that they can be used like integers
> >> (with the fuzzy information retained but generally
> >> ignored except for writing the features out again).
> >>
> >> This means we can have SeqFeature start/end
> >> properties which, like those of the FeatureLocation,
> >> return position objects - and they are actually easy
> >> to use (except for some very extreme cases).
> >> e.g. You can use them to slice a sequence.
> >>
> >> The code is on a branch here:
> >> https://github.com/peterjc/biopython/tree/int_pos
> >>
> >> It is almost 100% backwards compatible. Some
> >> of the arguments for creating a fuzzy position
> >> (and their __repr__) have changed, and some
> >> of their attributes, but we feel this is unlikely to
> >> actually affect anyone. We rather suspect only
> >> the SeqIO parsers actually create or use the
> >> fuzzy objects in the first place!
> >>
> >> In terms of usability I think this is a worthwhile
> >> improvement. The new class hierarchy is a bit
> >> more complex though - and I have not looked
> >> at the performance implications at all.
> >>
> >> Would anyone like to review this please?
> >>
> >
> > Here's another way to do it, maybe -- modify Seq.Seq.__getitem__ to also
> > check if it's been given a SeqFeature, and if so, handle the joins there.
> > The handling of fuzziness could happen in here or use the new .start and
> > .end properties.
> >
>
[...]
>
> >
> > Think that would work?
>
> Yes - in fact I've done that on another branch, but to avoid
> circular imports I used hasattr(index, "extract") instead. It solves
> a different problem from making start/end easier to use.
>
>
OK, you're way ahead of me. The new start/end properties you implemented
look good to me, and I doubt there would be a serious hit to performance --
plus, code that doesn't need these shortcuts doesn't have to use them. These
will be handy for writing code that visualizes SeqFeatures, too.
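
For anyone skimming the thread, here is a rough sketch of the two ideas
discussed above: a position that subclasses int but remembers its
fuzziness, and a __getitem__ that duck-types on an "extract" attribute.
This is not the code on the int_pos branch. The names FuzzyBefore, ToySeq
and ToyFeature are made up for illustration, and the real classes may well
differ in arguments and behaviour.

```python
class FuzzyBefore(int):
    """Hypothetical fuzzy position: acts like an int, prints as '<n'."""

    def __new__(cls, position):
        # int is immutable, so the value must be set in __new__.
        return int.__new__(cls, position)

    def __repr__(self):
        return "%s(%d)" % (self.__class__.__name__, int(self))

    def __str__(self):
        # The fuzzy marker only matters when writing the feature out again.
        return "<%d" % int(self)


class ToyFeature(object):
    """Hypothetical stand-in for a feature with start/end and extract()."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def extract(self, seq):
        # Positions are ints, so they can be used directly in a slice.
        return seq[self.start:self.end]


class ToySeq(object):
    """Hypothetical stand-in for a sequence class."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # Duck-typing on "extract" avoids importing the feature class here,
        # which is how a circular import can be sidestepped.
        if hasattr(index, "extract"):
            return index.extract(self)
        return self.data[index]


start = FuzzyBefore(2)
feature = ToyFeature(start, 6)
seq = ToySeq("ACGTACGT")

plain_slice = "ACGTACGT"[start:6]  # int subclass works as a slice index
via_feature = seq[feature]         # dispatches to feature.extract(seq)
```

Both plain_slice and via_feature come out as the same substring, while
str(start) still carries the "<" marker for output.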