[Biopython] iterating over FeatureLocation
Michael Thon
mike.thon at gmail.com
Mon Jan 13 16:07:45 UTC 2014
Here are two examples from the GenBank format file (not from GenBank though):

     CDS             order(6621..6658,6739..6985)
                     /Source="maker"
                     /codon_start=1
                     /ID="CFIO01_14847-RA:cds"
                     /label="CDS"
     CDS             419..2374
                     /Source="maker"
                     /codon_start=1
                     /ID="CFIO01_05899-RA:cds"
                     /label="CDS"
If the feature is a simple feature, then I just need to access its start and end. If it's a compound feature, then I need to iterate over each segment, accessing the start and end of each one.
What I am doing at the moment is this:
    if feat._sub_features:
        for sf in feat.sub_features:
            start = sf.location.start
            …
    else:
        start = feat.location.start
        …
It works, I think. Is there a better way?
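One possibly simpler approach, assuming Biopython 1.62 or later: both FeatureLocation and CompoundLocation expose a `parts` attribute, so the same loop can cover simple and compound features without branching on sub-features. The sketch below builds two features by hand to mirror the CDS records above; the helper name `segment_bounds` is made up for illustration, and the coordinates are converted to Biopython's 0-based, end-exclusive convention.

```python
from Bio.SeqFeature import SeqFeature, FeatureLocation, CompoundLocation

# Hand-built stand-ins for the two CDS records (illustrative only).
# GenBank 6621..6658 becomes start=6620, end=6658 in Biopython.
compound = SeqFeature(
    CompoundLocation(
        [FeatureLocation(6620, 6658), FeatureLocation(6738, 6985)],
        operator="order",
    ),
    type="CDS",
)
simple = SeqFeature(FeatureLocation(418, 2374), type="CDS")


def segment_bounds(feature):
    """Return a (start, end) tuple for each segment of the feature's location."""
    # For a simple location, .parts is a one-element list holding the location itself.
    return [(int(part.start), int(part.end)) for part in feature.location.parts]


for feat in (compound, simple):
    for start, end in segment_bounds(feat):
        print(feat.type, start, end)
```

With a parsed record, the same loop would run over `record.features` instead of the two hand-built features.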
Also, is there an easy way to get the sequence represented by a SeqFeature whose location is a CompoundLocation? These features are CDSs where each sub-feature is an exon. I need to splice them all together and get the translation.
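For the splicing question, a minimal sketch using `SeqFeature.extract()`, which concatenates the parts of a compound location in the order they are listed, followed by `Seq.translate()`. The parent sequence and coordinates here are invented toy values, not taken from the file above. Whether straight concatenation is appropriate for an `order(...)` location, as opposed to `join(...)`, is a judgment call about the annotation.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import SeqFeature, FeatureLocation, CompoundLocation

# Toy parent sequence (invented): exon 1, an intron, then exon 2.
parent = Seq("ATGGCC" + "GTAAGT" + "AAATAA")

# CDS made of two exons on the forward strand (0-based, end-exclusive).
cds = SeqFeature(
    CompoundLocation(
        [FeatureLocation(0, 6, strand=1), FeatureLocation(12, 18, strand=1)]
    ),
    type="CDS",
)

# extract() pulls each part out of the parent and joins them in order.
spliced = cds.extract(parent)

# cds=True checks for a start codon, a terminal stop, and no internal stops,
# and raises an error if the sequence is not a complete CDS.
protein = spliced.translate(cds=True)

print(spliced)
print(protein)
```

For a parsed record, `feat.extract(record.seq)` would be the equivalent call. If real annotations are incomplete, `translate(to_stop=True)` is a more forgiving alternative to `cds=True`.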
Thanks
On Jan 13, 2014, at 4:38 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Mon, Jan 13, 2014 at 3:09 PM, Michael Thon <mike.thon at gmail.com> wrote:
>> I need to iterate over all the features of a sequence, and then
>> iterate over the locations/sublocations in each feature. I’m not
>> sure how to work with the sublocations though:
>>
>> I need to do something like this:
>>
>> for feat in seq.features:
>>     for loc in feat.locations:
>>         start = loc.start
>>         …
>>
>> which does not work but maybe shows what I need to do.
>> Can anyone help me out?
>
> Are you talking about join locations? Could you give an example
> (e.g. link to a GenBank file) and what you want to look at?
>
> Peter
>
> P.S. This changed a bit back in Biopython 1.62 with the introduction
> of the CompoundLocation object.