[Biopython] How to use SeqRecord to get the subseq location information

Peter Cock p.j.a.cock at googlemail.com
Fri May 4 16:31:07 UTC 2012


On Fri, May 4, 2012 at 5:19 PM, Liu, XiaoChuan <xiaochuan.liu at mssm.edu> wrote:
> Hi Bow,
>
> Thank you very much for your help!
> But after following your suggestion, I still have the same problem. See below:
>
> >>> example_feature = SeqFeature(FeatureLocation(0, 88), type="mRNA", strand=-1)
> >>> simple_seq_r = SeqRecord(simple_seq, id="17_329.4", features=[example_feature])
> >>> simple_seq_r
> SeqRecord(seq=Seq('gugggaagagggguggggcccgggacuguacccaugugaggacuauucuugagu...aga', Alphabet()), id='17_329.4', name='<unknown name>', description='<unknown description>', dbxrefs=[])
> >>> simple_seq_r.features
> [SeqFeature(FeatureLocation(ExactPosition(0),ExactPosition(88)), type='mRNA', strand=-1)]
> >>> subseq = simple_seq_r[3:10]
> >>> subseq
> SeqRecord(seq=Seq('ggaagag', Alphabet()), id='17_329.4', name='<unknown name>', description='<unknown description>', dbxrefs=[])
> >>> subseq.features
> []
>
> I still could not get the location information for subseq. Why? Thank you very much!
>

What numbers are you trying to get?

In your example the parent sequence (simple_seq_r) has a feature from
0 to 88, but when you slice a SeqRecord, only features that lie fully
inside the slice are kept. Your slice [3:10] does not contain the whole
0 to 88 feature, so no features are kept for the child record (subseq).
We do not break up larger features which straddle the cut sites.
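
To make the rule concrete, here is a rough plain-Python sketch of it.
This is not Biopython's own code: features are reduced to simple
(start, end) tuples, and the helper names features_kept_by_slice and
clip_to_slice are made up for illustration. The second helper shows
one way you might work out the overlapping coordinates yourself if
that is the number you are after.

```python
def features_kept_by_slice(features, start, stop):
    """Mimic the slicing rule: keep only features fully inside [start, stop).

    `features` is a list of (start, end) tuples (a simplification of
    SeqFeature locations). Kept features are shifted so that their
    coordinates are relative to the start of the slice.
    """
    kept = []
    for f_start, f_end in features:
        if start <= f_start and f_end <= stop:
            kept.append((f_start - start, f_end - start))
    return kept


def clip_to_slice(feature, start, stop):
    """Hypothetical helper: clip a straddling feature to the slice.

    Returns the overlapping part in slice-relative coordinates,
    or None if the feature does not overlap the slice at all.
    """
    f_start, f_end = feature
    lo = max(f_start, start)
    hi = min(f_end, stop)
    if lo >= hi:
        return None
    return (lo - start, hi - start)


parent_features = [(0, 88)]  # the mRNA feature from the example

# The 0..88 feature is not fully inside 3..10, so nothing is kept.
kept = features_kept_by_slice(parent_features, 3, 10)

# Clipping by hand gives the part of the feature covered by the slice.
clipped = clip_to_slice(parent_features[0], 3, 10)

print(kept)     # []
print(clipped)  # (0, 7)
```

With a smaller feature such as (4, 8), the same slice would keep it and
report it at (1, 5) in the child's coordinates.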

Peter


