[Biopython] sequence coordinate mapping

Chris Fields cjfields at illinois.edu
Fri Jun 18 14:08:21 UTC 2010


On Jun 18, 2010, at 8:39 AM, Peter wrote:

> On Fri, Jun 18, 2010 at 1:58 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> Reece and Peter;
>> 
>> Peter wrote:
>>> Something like this? This implements __contains__ on the SeqFeature
>>> so that you can check if a simple location (integer) is within a feature.
>>> http://github.com/peterjc/biopython/tree/feature-in
>>> 
>>> There is a docstring with examples, just look at the diff here:
>>> http://github.com/peterjc/biopython/commit/83c44e8f6ee62a9c5855b603cb3c080d367e23d6
>> 
>> That's nice.
> 
> Nice enough to be worth committing in its own right?
> 
>> The next part would be remapping the coordinates so
>> once you have the feature you can easily address the relative
>> position you are interested in.
> 
> Perhaps one approach would be to do this in the SeqFeature. If we
> define a SeqFeature's length in the natural way, then we have
> len(SeqFeature) == len(SeqFeature.extract(parent_seq)).
> Now we have two coordinate systems, 0 to len(SeqFeature) and
> the regions it describes on the parent sequence. Then we could
> discuss a pair of methods on the SeqFeature for converting
> between the two coordinate systems. Once you have that, the
> special case of amino acid coordinates is much easier to do
> (account for where the start codon is, divide by three).
> 
> I've made another commit on the __contains__ branch to
> also implement __len__ for the SeqFeature:
> http://github.com/peterjc/biopython/commit/74b264acacd228d64859d28d75e2c30a8030d03f
> 
> Peter

We essentially do this with BioPerl features, locations, and ranges (in fact, the coordinate system previously mentioned uses these). Basically, anything that is-a Range can be compared to anything else that is-a Range (this has bitten us as well :). Beyond the module documentation, the test suite has a bit more on it, and Aaron Mackey has a presentation up on SlideShare that touches upon the Bio::Coordinate implementation:

http://www.slideshare.net/bosc_2008/mackey-bio-perl-bosc2008
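
To make the two-coordinate-system idea Peter describes a bit more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not what is in Peter's branch or in Bio::Coordinate: the class SimpleFeature and the names to_feature, to_parent and codon_index are all made up for this example, the parts are half-open (start, end) intervals on the parent, and the codon helper assumes the feature begins at the start codon in frame.

```python
class SimpleFeature:
    """Hypothetical stand-in for a SeqFeature (not Biopython's actual API)."""

    def __init__(self, parts, strand=1):
        # parts: half-open (start, end) intervals on the parent sequence
        self.parts = sorted(parts)
        self.strand = strand

    def __len__(self):
        # Same length as the sequence you would get by extracting the feature
        return sum(end - start for start, end in self.parts)

    def __contains__(self, parent_pos):
        return any(start <= parent_pos < end for start, end in self.parts)

    def to_feature(self, parent_pos):
        """Map a parent coordinate to an offset in 0..len(self)-1."""
        offset = 0
        for start, end in self.parts:
            if start <= parent_pos < end:
                offset += parent_pos - start
                # On the reverse strand the extracted sequence runs backwards
                if self.strand == -1:
                    offset = len(self) - 1 - offset
                return offset
            offset += end - start
        raise ValueError("position %r is not within the feature" % parent_pos)

    def to_parent(self, feature_pos):
        """Map an offset within the feature back to a parent coordinate."""
        if not 0 <= feature_pos < len(self):
            raise IndexError("feature offset out of range")
        if self.strand == -1:
            feature_pos = len(self) - 1 - feature_pos
        for start, end in self.parts:
            size = end - start
            if feature_pos < size:
                return start + feature_pos
            feature_pos -= size


def codon_index(feature, parent_pos):
    """Zero-based codon (amino acid) index for a parent coordinate.

    Assumes the feature starts at the start codon, in frame.
    """
    return feature.to_feature(parent_pos) // 3
```

With a two-exon feature such as SimpleFeature([(10, 20), (30, 40)]), the gap between the exons is skipped when converting, so to_feature and to_parent are inverses of each other for any position inside the feature.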

chris


