[Biopython-dev] [Biopython] Using SeqLocation to extract subsequence

Kyle Ellrott kellrott at gmail.com
Tue Nov 3 16:46:57 UTC 2009


(Moving this thread to Biopython-dev)

I've hacked together some code, and tested it against the bacterial genome
library I had on hand (of course, eukariotic features will be more
complicated, so will need to test against them next).  Examples of 'exotic'
feature location would be helpful.
I've posted the code below.  I'll be moving it into my git fork, and add
some testing.  Any thoughts where it should go?  It seems like it would best
work as a SeqRecord method.

def FeatureIDGuess( feature ):
    id = "N/A"
    try:
        id = feature.qualifiers['locus_tag'][0]
    except KeyError:
        try:
            id = feature.qualifiers['plasmid'][0]
        except KeyError:
            pass
    return id

def FeatureDescGuess( feature ):
    desc = ""
    try:
        desc=feature.qualifiers['product'][0]
    except KeyError:
        pass
    return desc

def ExtractFeatureDNA( record, feature ):
    dna = None
    if len( feature.sub_features ):
        dnaStr = ""
        for subFeat in feature.sub_features:
            if subFeat.location_operator=='join':
                subSeq = ExtractFeatureDNA( record, subFeat )
                dnaStr += subSeq.seq
        dna = Seq( str(dnaStr), IUPAC.unambiguous_dna)
        if ( feature.strand == -1 ):
            dna = dna.reverse_complement()
    else:
        start_pos = feature.location.start.position
        end_pos   = feature.location.end.position
        seqStr = record.seq[ start_pos:end_pos ]
        dna = Seq( str(seqStr), IUPAC.unambiguous_dna)
        if ( feature.strand == -1 and feature.location_operator != 'join' ):
            dna = dna.reverse_complement()
    outSeq = SeqRecord( dna, FeatureIDGuess( feature ) ,
description=FeatureDescGuess( feature ) )
    return outSeq


On Mon, Nov 2, 2009 at 2:30 PM, Peter <biopython at maubp.freeserve.co.uk>wrote:

> On Mon, Nov 2, 2009 at 9:31 PM, Kyle Ellrott <kellrott at gmail.com> wrote:
> >>
> >> You missed this thread earlier this month:
> >> http://lists.open-bio.org/pipermail/biopython/2009-October/005695.html
> >>
> >> Are you on the dev mailing list? I was hoping to get a little discussion
> >> going there, before moving over to the discussion list for more general
> >> comment.
> >
> > I didn't need to do it when the original discussion came through, so it
> got
> > 'filtered' ;-)  I guess if multiple people are asking the same question
> > independently, it's probably a timely issue.
> >
> > I'll probably go ahead and pull the SeqRecord fork into my git fork and
> > start playing around with it.
>
> Cool - sorry if the previous email was brusque - I was in the middle
> of dinner preparation and shouldn't have been checking emails.
>
> If you just want to try the sequence extraction for a SeqFeature,
> the code is on the trunk (as noted, as a function in a unit test).
> My SeqRecord github branch is looking at other issues.
>
> Peter
>



More information about the Biopython-dev mailing list