[Biopython] SeqRecord slicing bug and fix

Peter Cock p.j.a.cock at googlemail.com
Tue May 28 23:09:25 UTC 2013


On Tue, May 28, 2013 at 11:03 PM, Mark Budde <markbudde at gmail.com> wrote:
> There is a bug in the SeqRecord slicing behavior. The bug crops up on
> circular records with a feature spanning the beginning and end of the
> plasmid. Any slice outside of the feature will return the feature, and the
> feature.location.end is negative.
>
>>>> record = SeqIO.read('pUC19_mod.gb', 'genbank')
>>>> record
> SeqRecord(seq=Seq('TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGA...GTC',
> IUPACAmbiguousDNA()), id='pUC19', name='pUC19', description='', dbxrefs=[])
>>>> record.features
> [SeqFeature(FeatureLocation(ExactPosition(2299), ExactPosition(200),
> strand=1), type='misc_feature')]

The issue is that the feature has start > end (2299 > 200),
which arguably should raise an exception (there is a TODO in
the code for that).

Is this a circular record and a feature spanning the origin?
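
If so, here is a rough sketch of how a wrap-around feature could be
expressed as two ordinary start <= end pieces. This is plain Python,
not Biopython code: the function name split_origin_spanning is
hypothetical, and the sequence length of 2686 is an assumption, since
the length of pUC19_mod is not given above.

```python
def split_origin_spanning(start, end, seq_len):
    """Split a wrap-around interval into ordinary (start, end) pairs.

    Coordinates are 0-based and end-exclusive, like Python slices.
    Hypothetical helper for illustration only.
    """
    if not (0 <= start <= seq_len and 0 <= end <= seq_len):
        raise ValueError("coordinates must lie within the sequence")
    if start <= end:
        # Ordinary feature, nothing to split.
        return [(start, end)]
    # Feature wraps past the origin: tail of the sequence, then its head.
    return [(start, seq_len), (0, end)]


# Assumed length; substitute len(record) for a real record.
assumed_len = 2686
parts = split_origin_spanning(2299, 200, assumed_len)
print(parts)
```

Each piece then has start <= end, so slicing either piece on its own
does not involve a negative end position.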

> On a related note, is there an appropriate way to modify the position of a
> SeqFeature? I have been doing "feature.location._end =
> ExactPosition(newEnd)" , but I was under the impression that I shouldn't
> modify objects beginning with an underscore.

Yes, things starting with a single underscore should be
regarded as private and not used. Currently the end position
appears to be set up as a read-only property (although you can
still change it directly using feature.location._end = new_value).
Right now I'm not sure why that was done, but it has been
read-only since Bio.SeqFeature was first written 12
years ago. Maybe no one has asked until now?
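
To illustrate what a read-only property means here, a minimal sketch
follows. The Location class is a hypothetical stand-in written for
this example, not the actual Bio.SeqFeature code. It shows why
assigning to .end fails, and one way to avoid the private attribute:
build a new object with the new end position and use that instead.

```python
class Location:
    """Hypothetical stand-in for a location class (not Biopython's)."""

    def __init__(self, start, end):
        self._start = start
        self._end = end

    @property
    def start(self):
        # Getter only: there is no setter, so the property is read only.
        return self._start

    @property
    def end(self):
        return self._end


loc = Location(2299, 200)

try:
    loc.end = 250  # no setter defined, so this raises AttributeError
except AttributeError:
    print("end is read only")

# Rather than poking at loc._end, replace the whole object.
loc = Location(loc.start, 250)
print(loc.start, loc.end)
```

With a real feature the same idea would be to create a new location
with the desired positions and assign it to feature.location, assuming
that attribute can be reassigned in your version.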

Peter


