[Biopython-dev] [Bug 2381] translate and transcribe methods for the Seq object (in Bio.Seq)

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Wed Nov 5 23:09:01 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2381





------- Comment #39 from biopython-bugzilla at maubp.freeserve.co.uk  2008-11-05 18:09 EST -------
Created an attachment (id=1040)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1040&action=view)
Patch to Bio/Seq.py for complete CDS translation.

(In reply to comment #33)
> Instead of the "init" start codon option in attachment 1032,
> I'd also be happy with a single boolean argument which does
> start codon validation, treats this as a methionine, checks
> the sequence is a multiple of three in length, checks for a
> final stop codon, and checks for no additional stop codons.
> We'd ruled out calling this "complete", but maybe "cds"
> would be better?

This patch adds this functionality via a "complete_cds" boolean argument.

Here is how it could be applied to translate the CDS used as an example in my
comment 35, the yaaX gene in E. coli K12:

>>> from Bio.Seq import Seq
>>> my_cds = Seq("GTGAAAAAGATGCAATCTATCGTACTCGCACTTTCCCTGGTTCTGGTCGCTCCCATGGCAGCACAGGCTGCGGAAATTACGTTAGTCCCGTCAGTAAAATTACAGATAGGCGATCGTGATAATCGTGGCTATTACTGGGATGGAGGTCACTGGCGCGACCACGGCTGGTGGAAACAACATTATGAATGGCGAGGCAATCGCTGGCACCTACACGGACCGCCGCCACCGCCGCGCCACCATAAGAAAGCTCCTCATGATCATCACGGCGGTCATGGTCCAGGCAAACATCACCGCTAA")
>>> my_cds.translate(table=11)
Seq('VKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDH...HR*',
HasStopCodon(ExtendedIUPACProtein(), '*'))
>>> my_cds.translate(table=11, to_stop=True)
Seq('VKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDH...HHR',
ExtendedIUPACProtein())
>>> my_cds.translate(table=11, complete_cds=True)
Seq('MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDH...HHR',
ExtendedIUPACProtein())

I would be happy with EITHER of these options, as both can be used to translate
a complete coding sequence:

(1) the "init" argument (under another name, maybe "cds_start"?) illustrated in
attachment 1032.  This would check the start codon is valid AND translate it as
a methionine.

(2) the "complete_cds" argument (perhaps under another name, maybe "cds"?)
illustrated in this patch.  This would check the start codon is valid AND
translate it as a methionine AND check the length is a whole number of codons
AND check it ends with a stop codon AND check there are no extra in-frame stop
codons.
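
To make the difference between the two options concrete, here is a rough
standalone sketch of the checks each would perform.  It is NOT the code in
either attachment and does not use Bio.Seq: the function names
(translate_checking_start, translate_complete_cds) are made up for this
illustration, the codon table is built by hand for NCBI table 11, and
ambiguous bases are not handled.

```python
from itertools import product

# NCBI genetic code table 11 (bacterial); codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}


def _split_codons(dna):
    """Upper-case the sequence and cut it into consecutive triplets."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate_checking_start(dna):
    """Option (1): validate the start codon and translate it as M.

    Everything after the start codon is translated as usual, stop
    codons included.  Any trailing partial codon is ignored here
    (an arbitrary choice for this sketch).
    """
    codons = [c for c in _split_codons(dna) if len(c) == 3]
    if not codons or codons[0] not in START_CODONS:
        raise ValueError("First codon is not a valid start codon")
    return "M" + "".join(CODON_TABLE[c] for c in codons[1:])


def translate_complete_cds(dna):
    """Option (2): all the checks, returning the protein without the stop."""
    if len(dna) % 3 != 0:
        raise ValueError("Sequence length is not a multiple of three")
    codons = _split_codons(dna)
    if not codons or codons[0] not in START_CODONS:
        raise ValueError("First codon is not a valid start codon")
    if CODON_TABLE[codons[-1]] != "*":
        raise ValueError("Final codon is not a stop codon")
    body = "".join(CODON_TABLE[c] for c in codons[1:-1])
    if "*" in body:
        raise ValueError("Extra in-frame stop codon found")
    return "M" + body
```

With the toy sequence "GTGAAATAA", option (1) would give "MK*" while option
(2) would give "MK"; option (2) raises ValueError for something like
"GTGTAAAAATAA" because of the extra in-frame stop.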




