[Biopython] SeqRecord slicing bug and fix

Mark Budde markbudde at gmail.com
Tue May 28 22:03:06 UTC 2013


There is a bug in SeqRecord slicing. It shows up on circular records that
have a feature spanning the origin, i.e. a feature that starts near the end
of the plasmid sequence and ends near the beginning. Any slice taken from
the region outside the feature still returns the feature, and the returned
feature.location.end is negative.

>>> from Bio import SeqIO
>>> record = SeqIO.read('pUC19_mod.gb', 'genbank')
>>> record
SeqRecord(seq=Seq('TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGA...GTC',
IUPACAmbiguousDNA()), id='pUC19', name='pUC19', description='', dbxrefs=[])
>>> record.features
[SeqFeature(FeatureLocation(ExactPosition(2299), ExactPosition(200),
strand=1), type='misc_feature')]
>>> record[500:600].features  # This slice should contain no features
[SeqFeature(FeatureLocation(ExactPosition(1799), ExactPosition(-300),
strand=1), type='misc_feature')]

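This can be fixed by modifying line 453 of SeqRecord.py so that a feature
whose start lies after its end (one that wraps around the origin) is not
carried over into the slice. Change from: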
                    if start <= f.location.nofuzzy_start \
                    and f.location.nofuzzy_end <= stop:
to:
                    if start <= f.location.nofuzzy_start \
                    and f.location.nofuzzy_end <= stop \
                    and f.location.nofuzzy_start <= f.location.nofuzzy_end:
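
To show why the extra clause matters, here is a minimal standalone sketch of
the two conditions applied to plain integers. The function names are
hypothetical helpers written for this message, not part of Biopython, and
the positions are the ones from the session above.

```python
# Standalone sketch: these helpers are hypothetical, not Biopython API.
# They only reproduce the boolean tests quoted above on plain integers.

def kept_by_current_check(start, stop, f_start, f_end):
    """Mirror the existing condition: feature start/end inside the slice."""
    return start <= f_start and f_end <= stop


def kept_by_proposed_check(start, stop, f_start, f_end):
    """Same condition plus the clause that rejects start > end features."""
    return start <= f_start and f_end <= stop and f_start <= f_end


# Feature from pUC19_mod.gb: start 2299, end 200 (wraps past the origin).
print(kept_by_current_check(500, 600, 2299, 200))   # True  (the bug)
print(kept_by_proposed_check(500, 600, 2299, 200))  # False (feature dropped)

# An ordinary feature lying inside the slice is still kept.
print(kept_by_proposed_check(500, 600, 520, 580))   # True
```

The current check passes because 500 <= 2299 and 200 <= 600 are both true
even though the feature does not overlap the slice at all. The added clause
only changes the outcome for features whose start is greater than their end.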

On a related note, is there an appropriate way to modify the position of a
SeqFeature? I have been doing "feature.location._end =
ExactPosition(newEnd)", but I was under the impression that I shouldn't
modify attributes beginning with an underscore.

-Mark
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pUC19_mod.gb
Type: application/octet-stream
Size: 3674 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20130528/a3529382/attachment-0002.obj>

