[Biopython-dev] [Biopython] Using SeqLocation to extract subsequence

Peter biopython at maubp.freeserve.co.uk
Tue Nov 3 23:49:33 UTC 2009


On Tue, Nov 3, 2009 at 11:06 PM, Kyle Ellrott <kellrott at gmail.com> wrote:
> I've posted a branch on my git (
> http://github.com/kellrott/biopython/tree/FeatureExtract ).  The
> Name/Description guess functions need to be finalized.  I wrote a unit test
> that extracts CDS feature dna, and then runs translate on the dna and
> compares it to the translation stored in the feature.  It passes all the
> genbank files in the Test directory except for the ones that have 'N' in the
> DNA sequence (that causes a translation exception) and one_of.gb (it refers
> to sequence outside of the file).
> More test ideas would be appreciated.

There are several things I would have done differently there. Firstly,
and perhaps most importantly, you shouldn't assume the SeqRecord
is DNA. It could be RNA or protein, after all. Instead, reuse the
alphabet of the parent SeqRecord's seq.
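
To illustrate the idea, here is a rough sketch in plain Python. It does
not use the real Biopython classes: TaggedSeq, SimpleLocation, COMPLEMENT
and extract are hypothetical stand-ins made up for this example, and the
alphabet is just a string tag. The point is only that the extracted piece
inherits whatever alphabet the parent has, and that reverse
complementing is refused for anything that is not a nucleotide sequence.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only, NOT the Biopython API.
# Complement tables keyed by alphabet tag (unambiguous letters only).
COMPLEMENT = {
    "DNA": str.maketrans("ACGTacgt", "TGCAtgca"),
    "RNA": str.maketrans("ACGUacgu", "UGCAugca"),
}


@dataclass(frozen=True)
class TaggedSeq:
    """A sequence string plus an alphabet tag ("DNA", "RNA" or "protein")."""
    data: str
    alphabet: str


@dataclass(frozen=True)
class SimpleLocation:
    """A single span: zero-based start (inclusive), end (exclusive), strand."""
    start: int
    end: int
    strand: int = 1  # +1 forward, -1 reverse


def extract(parent: TaggedSeq, loc: SimpleLocation) -> TaggedSeq:
    """Slice the parent and keep its alphabet rather than assuming DNA."""
    piece = parent.data[loc.start:loc.end]
    if loc.strand == -1:
        table = COMPLEMENT.get(parent.alphabet)
        if table is None:
            # Protein (or anything unknown) has no reverse complement.
            raise ValueError(
                "cannot reverse complement a %s sequence" % parent.alphabet
            )
        piece = piece.translate(table)[::-1]
    return TaggedSeq(piece, parent.alphabet)
```

With this, extracting from an RNA parent gives RNA back, and a protein
parent gives protein back, without the caller having to say which it is.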

Perhaps you could comment on this other thread about the more
general problem of how to make getting the sequence (i.e. a Seq
object) for a SeqFeature available in Biopython?

http://lists.open-bio.org/pipermail/biopython-dev/2009-November/006958.html

Peter



