[Biopython-dev] SeqFeature start/end and making positions act like ints

Peter Cock p.j.a.cock at googlemail.com
Mon Sep 19 09:03:59 UTC 2011

On Sat, Sep 17, 2011 at 8:38 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Sat, Sep 17, 2011 at 2:44 PM, Eric Talevich wrote:
>> On Fri, Sep 16, 2011 at 7:01 PM, Peter Cock wrote:
>>> On Fri, Sep 16, 2011 at 9:33 PM, Eric Talevich wrote:
>>> >
>>> > Think that would work?
>>> Yes - in fact I've done that on another branch but, to avoid
>>> circular imports, used hasattr(index, "extract") instead. It solves
>>> a different problem to making start/end easier to use.
>> OK, you're way ahead of me.

The actual commit wasn't that far ahead of you:

> Well, I've been thinking about this on and off for a while now.
> One issue with the __getitem__ trick is what would we do for
> the SeqRecord when sliced with a SeqFeature? Should it use
> the id and annotation from the SeqFeature or the SeqRecord?

This needs some thought.
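
For anyone following along, the __getitem__ trick can be sketched
roughly as below. This is only an illustration: ToySeq and ToyRegion
are made-up stand-ins, not the real Seq or SeqFeature classes, and
the sketch dodges the SeqRecord id/annotation question entirely by
returning a plain string.

```python
class ToyRegion(object):
    """Hypothetical stand-in for a feature with a simple start/end."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def extract(self, parent):
        # Slice the parent with an ordinary slice object.
        return parent[self.start:self.end]


class ToySeq(object):
    """Hypothetical stand-in for a sequence object."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # Duck-typing check instead of isinstance, so this module
        # never needs to import the feature class (no circular import).
        if hasattr(index, "extract"):
            return index.extract(self)
        return self.data[index]


seq = ToySeq("ACGTACGT")
print(seq[ToyRegion(2, 6)])
```

Ints and slices have no "extract" attribute, so ordinary indexing
falls through to the usual behaviour.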

>> The new start/end properties you implemented
>> look good to me, and I doubt there would be a serious hit
>> to performance -- plus, code that doesn't need these shortcuts
>> doesn't have to use them.
> Good. I've realised I need to double check the integer
> methods (equals, sorting, hashes etc), but they should
> be fine.

Thinking about this more, the current _shift method of
the position objects (used in SeqRecord slicing) would
make sense as the __add__ method, thus:

BeforePosition(5) + 10 --> BeforePosition(15)

rather than currently:

BeforePosition(5)._shift(10) --> BeforePosition(15)

However, perhaps that is just making work for ourselves,
since we'd have to implement code for all the mixture cases, e.g.

BeforePosition(5) + AfterPosition(10) --> UncertainPosition(15)
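
To make the idea concrete, here is a minimal sketch of int-like
positions with __add__. The ToyPosition, ToyBefore and ToyAfter
classes are hypothetical and are not the Bio.SeqFeature classes.
Since the mixture cases are undecided, this sketch simply refuses
them rather than guessing at a result type.

```python
class ToyPosition(int):
    """Hypothetical position that behaves like an int."""

    def __repr__(self):
        return "%s(%d)" % (type(self).__name__, int(self))

    def __add__(self, offset):
        if isinstance(offset, ToyPosition):
            # Mixture case (fuzzy + fuzzy): policy undecided, so refuse.
            raise TypeError("adding two positions is undefined in this sketch")
        if not isinstance(offset, int):
            return NotImplemented
        # Shift by a plain integer, keeping the same kind of position.
        return type(self)(int(self) + offset)


class ToyBefore(ToyPosition):
    """Hypothetical analogue of a 'before' position."""


class ToyAfter(ToyPosition):
    """Hypothetical analogue of an 'after' position."""


print(repr(ToyBefore(5) + 10))
```

Equality, sorting and hashing are inherited from int here, which is
the behaviour that would need double checking in the real classes.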

>> These will be handy for writing code that visualizes
>> SeqFeatures, too.
> Well, slightly easier - I have some more dramatic changes to
> the SeqFeature and FeatureLocation objects planned, but I'm
> still playing with this.

One of the key changes (which can be done without
really changing the API) is to move the database &
accession and the strand from the SeqFeature to the
FeatureLocation. These are intimately connected with
the location, as much as the start/end.
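
As a rough sketch of how this could be done without really changing
the API: the feature keeps a strand attribute, but as a property
that delegates to its location. ToyLocation and ToyFeature are
hypothetical names for illustration only, and only the strand is
shown (the database and accession would follow the same pattern).

```python
class ToyLocation(object):
    """Hypothetical location that owns start, end and strand."""

    def __init__(self, start, end, strand=None):
        self.start = start
        self.end = end
        self.strand = strand


class ToyFeature(object):
    """Hypothetical feature whose strand lives on its location."""

    def __init__(self, location, type=""):
        self.location = location
        self.type = type

    @property
    def strand(self):
        # Old-style access (feature.strand) still works.
        return self.location.strand

    @strand.setter
    def strand(self, value):
        self.location.strand = value


feature = ToyFeature(ToyLocation(5, 15, strand=-1), type="gene")
print(feature.strand)
```

Existing code reading or setting feature.strand would carry on
working, while new code can use feature.location.strand directly.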

This is one of the things I've been working on here:

The other key change on that experimental branch
is moving away from sub_features for join locations
(etc). Here I was trying a new CompoundLocation
object, but am still wondering if this should be done
in the SeqFeature or FeatureLocation object instead
(or if SeqFeature should subclass FeatureLocation).
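
For illustration, one possible shape for a join held as a single
location object rather than as sub_features is sketched below.
ToyPart and ToyJoin are made-up names, this is not the code on the
experimental branch, and it ignores strand and fuzzy positions. It
assumes the parent sequence is a plain string.

```python
class ToyPart(object):
    """Hypothetical simple location (one contiguous region)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def extract(self, seq):
        return seq[self.start:self.end]


class ToyJoin(object):
    """Hypothetical compound location: an ordered list of parts."""

    def __init__(self, parts):
        self.parts = list(parts)

    @property
    def start(self):
        # Overall span: leftmost start of any part.
        return min(part.start for part in self.parts)

    @property
    def end(self):
        # Overall span: rightmost end of any part.
        return max(part.end for part in self.parts)

    def extract(self, seq):
        # Concatenate the parts in the order given.
        return "".join(part.extract(seq) for part in self.parts)


join = ToyJoin([ToyPart(0, 3), ToyPart(6, 9)])
print(join.start, join.end, join.extract("AAACCCGGGTTT"))
```

Because ToyJoin offers the same start, end and extract as ToyPart,
a feature could hold either kind of object as its location.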

