[Biopython] Edit feature's location

Peter Cock p.j.a.cock at googlemail.com
Mon Jul 28 09:09:04 UTC 2014


On Fri, Jul 25, 2014 at 1:55 PM, Ilya Flyamer <flyamer at gmail.com> wrote:
> Hi,
>
> is there a way to either change SeqFeature's location in place or create a
> copy with a different location? Assigning to feature.location.start raises
> an AttributeError.

The FeatureLocation properties are (currently at least) read-only
attributes. You could replace the current SeqFeature's location
with a new FeatureLocation instead:

For example, first get a SeqFeature:

>>> from Bio import SeqIO
>>> record = SeqIO.read("NC_000932.gb", "genbank")
>>> f = record.features[10]
>>> f
SeqFeature(FeatureLocation(ExactPosition(2055), ExactPosition(3570), strand=-1), type='CDS')
>>> print(f)
type: CDS
location: [2055:3570](-)
qualifiers:
    Key: codon_start, Value: ['1']
    Key: db_xref, Value: ['GI:126022795', 'GeneID:844797']
    Key: gene, Value: ['matK']
    Key: locus_tag, Value: ['ArthCp003']
    Key: product, Value: ['maturase K']
    Key: protein_id, Value: ['NP_051040.2']
    Key: transl_table, Value: ['11']
    Key: translation, Value:
['MDKFQGYLEFDGARQQSFLYPLFFREYIYVLAYDHGLNRLNRNRYIFLENADYDKKYSSLITKRLILRMYEQNRLIIPTKDVNQNSFLGHTSLFYYQMISVLFAVIVEIPFSLRLGSSFQGKQLKKSYNLQSIHSIFPFLEDKLGHFNYVLDVLIPYPIHLEILVQTLRYRVKDASSLHFFRFCLYEYCNWKNFYIKKKSILNPRFFLFLYNSHVCEYESIFFFLRKRSSHLRSTSYEVLFERIVFYGKIHHFFKVFVNNFPAILGLLKDPFIHYVRYHGRCILATKDTPLLMNKWKYYFVNLWQCYFSVWFQSQKVNINQLSKDNLEFLGYLSSLRLNPLVVRSQMLENSFLIDNVRIKLDSKIPISSIIGSLAKDKFCNVLGHPISKATWTDSSDSDILNRFVRICRNISHYYSGSSKKKNLYRIKYILRLCCVKTLARKHKSTVRTFLKRLGSGLLEEFLTGEDQVLSLIFPRSYYASKRLYRVRIWYLDILYLNDLVNHE']
>>> f.location
FeatureLocation(ExactPosition(2055), ExactPosition(3570), strand=-1)
>>> print(f.location)
[2055:3570](-)

Now let's change the location:

>>> from Bio.SeqFeature import FeatureLocation
>>> f.location = FeatureLocation(2049, 3570, strand=-1)
>>> f.location
FeatureLocation(ExactPosition(2049), ExactPosition(3570), strand=-1)
>>> print(f.location)
[2049:3570](-)
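
If you would rather keep the original feature untouched and work on a
copy, one option is to deep-copy the SeqFeature and then replace the
location on the copy. This is only a minimal sketch: the feature is
built by hand here so that no GenBank file is needed, and the variable
names are made up for illustration.

```python
import copy

from Bio.SeqFeature import SeqFeature, FeatureLocation

# Toy feature built by hand (assumption: stands in for record.features[10])
original = SeqFeature(FeatureLocation(2055, 3570, strand=-1), type="CDS")

# Deep copy, so the copy does not share its location or qualifiers
# with the original feature
moved = copy.deepcopy(original)

# Replace the whole location object on the copy only
moved.location = FeatureLocation(2049, 3570, strand=-1)

print(original.location)  # still the old coordinates
print(moved.location)     # the new coordinates
```

A deep copy is used rather than copy.copy() because a shallow copy would
share the qualifiers dictionary with the original feature.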

> My ultimate goal is to move all features in a genbank file by some specific
> number of nucleotides (for example, add 1000 to all coordinates). If someone
> can help me and tell about an easier way, I will appreciate it.

The easy way is just to add the offset to the feature location, which will
give you a new FeatureLocation with shifted coordinates (shown here with an
offset of 6 rather than 1000):

>>> loc = FeatureLocation(2049, 3564, strand=-1)
>>> loc
FeatureLocation(ExactPosition(2049), ExactPosition(3564), strand=-1)
>>> loc + 6
FeatureLocation(ExactPosition(2055), ExactPosition(3570), strand=-1)

Or in situ, replacing the old FeatureLocation object:

>>> loc += 6
>>> loc
FeatureLocation(ExactPosition(2055), ExactPosition(3570), strand=-1)
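
Putting the two ideas together (replace each feature's location, and get
the new location by adding an integer), shifting every feature in a
record could look roughly like the sketch below. The helper name
shift_features and the toy record are assumptions for illustration only,
not something Biopython provides. Note that this only moves the feature
coordinates; the sequence itself is left unchanged.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.SeqFeature import SeqFeature, FeatureLocation


def shift_features(record, offset):
    """Hypothetical helper: shift every feature location by offset, in place."""
    for feature in record.features:
        # location + int gives a new, shifted location object,
        # which then replaces the feature's old location
        feature.location = feature.location + offset


# Toy record built by hand (assumption: stands in for a parsed GenBank file)
record = SeqRecord(Seq("ACGT" * 10), id="toy")
record.features.append(SeqFeature(FeatureLocation(2, 8, strand=1), type="gene"))
record.features.append(SeqFeature(FeatureLocation(10, 20, strand=-1), type="CDS"))

shift_features(record, 5)

for feature in record.features:
    print(feature.type, feature.location)
```

With a real file you would call the helper on the record returned by
SeqIO.read() and pass 1000 as the offset.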

Also, you might have missed this kind of thing in the tutorial - adding
to a SeqRecord will adjust the feature locations accordingly:

>>> from Bio import SeqIO
>>> record = SeqIO.read("NC_000932.gb", "genbank")
>>> record.features[10].location
FeatureLocation(ExactPosition(2055), ExactPosition(3570), strand=-1)
>>> new = "N"*1000 + record
>>> new.features[10].location
FeatureLocation(ExactPosition(3055), ExactPosition(4570), strand=-1)

See also "Adding SeqRecord objects" in the Tutorial:
http://biopython.org/DIST/docs/tutorial/Tutorial.html

Peter
