[Biopython-dev] [Biopython] Using SeqLocation to extract subsequence
Kyle Ellrott
kellrott at gmail.com
Tue Nov 3 11:46:57 EST 2009
(Moving this thread to Biopython-dev)
I've hacked together some code, and tested it against the bacterial genome
library I had on hand (of course, eukariotic features will be more
complicated, so will need to test against them next). Examples of 'exotic'
feature location would be helpful.
I've posted the code below. I'll be moving it into my git fork, and add
some testing. Any thoughts where it should go? It seems like it would best
work as a SeqRecord method.
def FeatureIDGuess( feature ):
id = "N/A"
try:
id = feature.qualifiers['locus_tag'][0]
except KeyError:
try:
id = feature.qualifiers['plasmid'][0]
except KeyError:
pass
return id
def FeatureDescGuess( feature ):
desc = ""
try:
desc=feature.qualifiers['product'][0]
except KeyError:
pass
return desc
def ExtractFeatureDNA( record, feature ):
dna = None
if len( feature.sub_features ):
dnaStr = ""
for subFeat in feature.sub_features:
if subFeat.location_operator=='join':
subSeq = ExtractFeatureDNA( record, subFeat )
dnaStr += subSeq.seq
dna = Seq( str(dnaStr), IUPAC.unambiguous_dna)
if ( feature.strand == -1 ):
dna = dna.reverse_complement()
else:
start_pos = feature.location.start.position
end_pos = feature.location.end.position
seqStr = record.seq[ start_pos:end_pos ]
dna = Seq( str(seqStr), IUPAC.unambiguous_dna)
if ( feature.strand == -1 and feature.location_operator != 'join' ):
dna = dna.reverse_complement()
outSeq = SeqRecord( dna, FeatureIDGuess( feature ) ,
description=FeatureDescGuess( feature ) )
return outSeq
On Mon, Nov 2, 2009 at 2:30 PM, Peter <biopython at maubp.freeserve.co.uk>wrote:
> On Mon, Nov 2, 2009 at 9:31 PM, Kyle Ellrott <kellrott at gmail.com> wrote:
> >>
> >> You missed this thread earlier this month:
> >> http://lists.open-bio.org/pipermail/biopython/2009-October/005695.html
> >>
> >> Are you on the dev mailing list? I was hoping to get a little discussion
> >> going there, before moving over to the discussion list for more general
> >> comment.
> >
> > I didn't need to do it when the original discussion came through, so it
> got
> > 'filtered' ;-) I guess if multiple people are asking the same question
> > independently, it's probably a timely issue.
> >
> > I'll probably go ahead and pull the SeqRecord fork into my git fork and
> > start playing around with it.
>
> Cool - sorry if the previous email was brusque - I was in the middle
> of dinner preparation and shouldn't have been checking emails.
>
> If you just want to try the sequence extraction for a SeqFeature,
> the code is on the trunk (as noted, as a function in a unit test).
> My SeqRecord github branch is looking at other issues.
>
> Peter
>
More information about the Biopython-dev
mailing list