[Biopython] SeqRecord slicing bug and fix

Mark Budde markbudde at gmail.com
Wed May 29 01:22:54 UTC 2013


On Tue, May 28, 2013 at 4:09 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Tue, May 28, 2013 at 11:03 PM, Mark Budde <markbudde at gmail.com> wrote:
> > There is a bug in the SeqRecord slicing behavior. The bug crops up on
> > circular records with a feature spanning the beginning and end of the
> > plasmid. Any slice outside of the feature will return the feature, and the
> > feature.location.end is negative.
> >
> >>>> record = SeqIO.read('pUC19_mod.gb', 'genbank')
> >>>> record
> > SeqRecord(seq=Seq('TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGA...GTC',
> > IUPACAmbiguousDNA()), id='pUC19', name='pUC19', description='', dbxrefs=[])
> >>>> record.features
> > [SeqFeature(FeatureLocation(ExactPosition(2299), ExactPosition(200),
> > strand=1), type='misc_feature')]
>
> The issue is you've got start > end, which arguably should
> raise an exception (there is a TODO in the code for that)
>
> Is this a circular record and a feature spanning the origin?

Yes, it is a circular plasmid with a feature spanning the origin. There
are legitimate reasons to have features span the origin, so please do not
raise an exception. I think the provided code is the best solution to the
problem (and completely fixes the problems within my personal code when
this is an issue), but would be interested in hearing other suggestions.
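For anyone following along, here is a rough plain-Python model of why the slice misbehaves. It does not use Biopython objects, and it assumes (my reading, not confirmed in this thread) that slicing keeps a feature when slice_start <= feature_start and feature_end <= slice_stop. The function names are made up for illustration, and the wrap-aware variant is only one possible policy, not necessarily the fix proposed earlier.

```python
# Plain-Python model of the slicing problem; no Biopython objects are used.
# Assumption: a feature is kept when
#   slice_start <= feat_start and feat_end <= slice_stop.
# All names here are hypothetical and exist only for this sketch.


def naive_keeps_feature(feat_start, feat_end, slice_start, slice_stop):
    """Containment test that silently assumes feat_start <= feat_end."""
    return slice_start <= feat_start and feat_end <= slice_stop


def wrap_aware_keeps_feature(feat_start, feat_end, slice_start, slice_stop):
    """Same test, but a feature with start > end crosses the origin and
    can never lie wholly inside a linear slice, so it is dropped."""
    if feat_start > feat_end:
        return False
    return slice_start <= feat_start and feat_end <= slice_stop


def shift(feat_start, feat_end, offset):
    """Move feature coordinates into the sliced record's frame."""
    return feat_start - offset, feat_end - offset


# Feature from the example above: 2299..200, spanning the origin.
feat = (2299, 200)
# A slice that does not touch the feature at all.
window = (500, 1000)

naive = naive_keeps_feature(*feat, *window)       # feature wrongly kept
shifted = shift(*feat, window[0])                 # end goes negative
fixed = wrap_aware_keeps_feature(*feat, *window)  # feature dropped
print(naive, shifted, fixed)
```

Under that assumption the naive test keeps the feature for a slice that never touches it, and shifting by the slice start is what produces the negative end.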


>
> > On a related note, is there an appropriate way to modify the position of a
> > SeqFeature? I have been doing "feature.location._end =
> > ExactPosition(newEnd)", but I was under the impression that I shouldn't
> > modify objects beginning with an underscore.
>
> Yes, things starting with a single underscore should be
> regarded as private and not used. Currently that appears
> to be set up as a read-only property (which you can change
> directly using feature.location._end = new_value) and
> right now I'm not sure why that was done, but it has been
> read-only since Bio.SeqFeature was first written 12
> years ago. Maybe no one has asked till now?


Well, the alternative is to make a new feature and copy over all of the other
attributes, but that seems like a lot of work for no practical gain.
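To make the "new feature" route concrete, it could look roughly like the sketch below. Location and Feature are hypothetical stand-ins that I have made read-only on purpose; they are not the Bio.SeqFeature classes, and with_new_end is a made-up helper name.

```python
from dataclasses import dataclass, field, replace


# Hypothetical stand-ins for a read-only location and feature.
# These are NOT Biopython classes; they only mimic the situation.
@dataclass(frozen=True)
class Location:
    start: int
    end: int
    strand: int = 1


@dataclass(frozen=True)
class Feature:
    location: Location
    type: str = ""
    qualifiers: dict = field(default_factory=dict)


def with_new_end(feature, new_end):
    """Return a copy of feature whose location ends at new_end.

    Every other attribute is carried over unchanged; note that the
    qualifiers dict is shared with the original, not copied.
    """
    new_location = replace(feature.location, end=new_end)
    return replace(feature, location=new_location)


old = Feature(Location(2299, 200), type="misc_feature",
              qualifiers={"note": ["spans origin"]})
new = with_new_end(old, 250)
print(old.location, new.location)
```

The original object is left untouched, which is the main thing this buys over poking at a private attribute.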
Thanks,
Mark


