[Biopython] Biopython Digest, Vol 152, Issue 2

Sampson, Jared Jared.Sampson at nyumc.org
Mon Aug 3 18:17:35 UTC 2015


Hi Carlos -

Thanks for sharing your code.

"YTN" would also match both of Phe's codons ("TTT" and "TTC")
I just ran the original Perl Degen_v1.4.pl script and both TTT and TTC
get degenerated to "TTY" using both translation tables 1 and 5.

Yes.  What I had thought was a problem in the Perl Degen script was not with how it handles Phe's codons, but rather with how it handles the two Leu codons that are in the same block as those of Phe, namely TTA and TTG.  Currently, those both result in "YTN" as output, which matches all Leu and Phe codons, and is therefore not a valid degenerate codon for Leu.  Handling (TTA, TTG) and (CTT, CTC, CTA, CTG) separately for Leucine (resulting in TTR and CTN, respectively), and similar handling of Ser and Arg, would "solve" this.
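
To make the block-wise idea concrete, here is a minimal Python sketch. It is not the Degen script's logic or anyone's posted code: the helper names collapse() and collapse_by_block() are made up for this example, and grouping codons by their first two bases is only one assumed way to define a "block".

```python
from collections import defaultdict

# IUPAC nucleotide ambiguity codes, keyed by the set of bases each one stands for.
IUPAC = {
    frozenset(bases): code
    for code, bases in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}


def collapse(codons):
    """Collapse a list of codons into one degenerate codon, position by position."""
    return "".join(IUPAC[frozenset(c[i] for c in codons)] for i in range(3))


def collapse_by_block(codons):
    """Group codons by their first two bases (assumed block definition),
    then collapse each group separately."""
    blocks = defaultdict(list)
    for codon in codons:
        blocks[codon[:2]].append(codon)
    return {prefix: collapse(group) for prefix, group in blocks.items()}


LEU = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]
print(collapse(LEU))           # YTN (all six codons merged)
print(collapse_by_block(LEU))  # {'TT': 'TTR', 'CT': 'CTN'}
```

Merging all six Leu codons gives YTN, while collapsing each block on its own gives TTR and CTN, neither of which matches a Phe codon.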

However, after reading some of the background on the PhyloTools website (which I now realize is maintained by Andreas Zwick, whose degeneracy tables your code uses--thank you for including the reference), I hereby retract my assertion that the degenerate codons for Leu/Arg/Ser were produced incorrectly.  I wasn't aware that this is actually a standard published method (by Zwick et al.) for creating degenerate sequences.  My previous idea of degeneracy was that creating a degenerate codon shouldn't allow for a mutation to be introduced, but clearly there are reasons to do it this way.

Cheers,
Jared

--
Jared Sampson
Xiangpeng Kong Lab
NYU Langone Medical Center
http://kong.med.nyu.edu/






On Aug 2, 2015, at 8:24 AM, Carlos Pena <mycalesis at gmail.com> wrote:

Thanks Jeremy, Jared,


I will take a closer look at your script. I had already come up with a
Python package to do the trick: https://github.com/carlosp420/degenerate-dna

I tried to make it easy to use:

>>> from degenerate_dna import Degenera
>>> dna = 'AGTTCT'
>>> res = Degenera(dna=dna, table=1, method='S')
>>> res.degenerate()
>>> res.degenerated
'AGYAGY'


So far, though, I have only implemented degeneration for the Standard
Code and the Invertebrate Mitochondrial Code (translation tables 1 and 5).
In our lab, we don't urgently need the other translation codes yet.

"YTN" would also match both of Phe's codons ("TTT" and "TTC")
I just ran the original Perl Degen_v1.4.pl script and both TTT and TTC
get degenerated to "TTY" using both translation tables 1 and 5.


cheers,


carlos


On 02.08.2015 15:00, biopython-request at mailman.open-bio.org wrote:

Today's Topics:

  1. Re: Is there any Biopython tool to degenerate a nucleotide
     sequence (Jeremy)


----------------------------------------------------------------------

Message: 1
Date: Sat, 1 Aug 2015 23:23:23 +0000 (UTC)
From: Jeremy <Jeremy.molbio at gmail.com>
To: biopython at biopython.org
Subject: Re: [Biopython] Is there any Biopython tool to degenerate a
nucleotide sequence
Message-ID: <loom.20150802T011012-309 at post.gmane.org>
Content-Type: text/plain; charset=utf-8

Sampson, Jared <Jared.Sampson <at> nyumc.org> writes:



Hi Jeremy -

Nice work, thanks for sharing.


However (and someone please correct me if I'm wrong here!), it looks like
the current Leucine substitution, "YTN" would also match both of Phe's
codons ("TTT" and "TTC"), and the current Arginine ("MGN") also matches two
of Serine's codons ("AGT" and "AGC").
FWIW, the PhyloTools script also produces the same erroneous degenerate
codons.  I've sent the contact address on that site a bug report.
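
The overlap is easy to check by expanding a degenerate codon into the concrete codons it matches and translating each one. The sketch below assumes the standard genetic code (NCBI translation table 1), written in the usual compact TCAG order; the function name encoded_amino_acids() is invented for this example and is not part of Biopython or of either script.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Bases represented by each IUPAC nucleotide code.
EXPANSION = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encoded_amino_acids(degenerate_codon):
    """Return the set of amino acids encoded by any codon matching the pattern."""
    codons = (
        "".join(bases)
        for bases in product(*(EXPANSION[code] for code in degenerate_codon))
    )
    return {STANDARD_TABLE[codon] for codon in codons}


print(sorted(encoded_amino_acids("YTN")))  # ['F', 'L']
print(sorted(encoded_amino_acids("MGN")))  # ['R', 'S']
print(sorted(encoded_amino_acids("TTY")))  # ['F']
```

A degenerate codon that maps to more than one amino acid here is one that permits a non-synonymous change.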

I've updated a fork of your original gist to implement fixes for these
residues, along with a couple of stylistic changes (hope you don't mind).
Please feel free to incorporate them into your original.  If you want to
double check the rest of the degen_dict, there's a nice table on Wikipedia.

Also, if you're looking to make other improvements, it might be nice to
add a "frame=1" argument to degenerate_sequence() to optionally accommodate
the other two reading frames rather than chopping leftover bases.
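
As a rough sketch of what such a "frame" argument might look like: the function below is hypothetical and is not the code from the gist. It assumes frame is 1, 2 or 3, that the degeneration itself is a plain dictionary lookup, and that bases outside complete codons are copied through unchanged. DEMO_MAP is a two-entry stand-in for a full table such as degen_dict.

```python
def degenerate_sequence(seq, codon_map, frame=1):
    """Degenerate seq codon by codon, starting at reading frame 1, 2 or 3.

    Bases before the first complete codon and after the last one are
    kept as they are instead of being dropped (an assumption of this sketch).
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    start = frame - 1
    # End of the last complete codon in this frame.
    end = start + max(0, len(seq) - start) // 3 * 3
    codons = (seq[i:i + 3] for i in range(start, end, 3))
    # Codons missing from the map are left untouched.
    body = "".join(codon_map.get(codon, codon) for codon in codons)
    return seq[:start] + body + seq[end:]


# Hypothetical two-entry map, for illustration only.
DEMO_MAP = {"AAC": "AAY", "AAT": "AAY"}

print(degenerate_sequence("AACAATG", DEMO_MAP))           # AAYAAYG
print(degenerate_sequence("GAACAAT", DEMO_MAP, frame=2))  # GAAYAAY
```

Whether leftover bases should be kept, masked or reported is a design choice; keeping them preserves the sequence length, which matters if the output is used in an alignment.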

Cheers,
Jared




--
Jared Sampson
Xiangpeng Kong Lab
NYU Langone Medical Center
http://kong.med.nyu.edu/



On Jul 31, 2015, at 1:39 AM, Jeremy <Jeremy.molbio <at> gmail.com> wrote:


Carlos Pena <mycalesis <at> gmail.com> writes:

Dear Biopython members,
I want to take a nucleotide string and degenerate those bases that can
undergo synonymous change.
For example, a string of just one codon.
* Input:  AAC
* Output: AAY
Since both AAC and AAT are translated to Asparagine (N) we can
degenerate this codon to AAY (because the third position could produce a
synonymous change).
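
The AAC -> AAY example can be reproduced with a few lines of Python. This is only an illustrative sketch under stated assumptions: it uses the standard genetic code (NCBI table 1), simply merges every synonymous codon of the input codon position by position, and the name degenerate_codon() is invented here. It is not the Degen algorithm.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC ambiguity codes, keyed by the set of bases each one stands for.
IUPAC = {
    frozenset(bases): code
    for code, bases in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}


def degenerate_codon(codon):
    """Merge all codons synonymous with codon into one IUPAC pattern."""
    amino_acid = STANDARD_TABLE[codon]
    synonyms = [c for c, aa in STANDARD_TABLE.items() if aa == amino_acid]
    return "".join(IUPAC[frozenset(s[i] for s in synonyms)] for i in range(3))


print(degenerate_codon("AAC"))  # AAY
print(degenerate_codon("ATG"))  # ATG (Met has a single codon)
```

For amino acids whose codons span more than one block (Leu, Ser, Arg), this naive merge produces patterns that also match codons of other amino acids, so a real implementation needs an explicit policy for those cases.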
This is already solved in the Perl library Degen:
http://www.phylotools.com/ptdegendocumentation.htm
I could use some glue to execute this Perl code from Python, but
I cannot include this library in my project because it uses the
GPL license while I use BSD.
So I thought I would ask around before writing a Python script to do this
for me.

thanks for any pointers,
carlos


Hi Carlos,
I hacked up something that should return the same output as the Degen 1.4
Perl web tool.
The gist can be found here:
https://gist.github.com/biojerm/6242381eb4ad3ef18ac6
I am pretty new to both Python and Biopython, so please let me know if
you have any feedback on form, styling, and/or function.
I know the method is currently quite fragile. Below are a few thoughts on
the method's weaknesses:
1) The method does not handle sequences whose length is not evenly
divisible by 3.
2) I think the method would be a lot more useful if you could call it on a
single FASTA or GenBank (GB) file, or on a set of them.  But I have not
learned how to program that yet.
3) I probably should return the degenerate sequences as Seq objects, but at
the moment they are simple strings.
4) Tests... I need to figure those out too.
Please let me know if you find this useful and if there are any must-have
features for your purposes.
Thanks,
Jeremy
_______________________________________________
Biopython mailing list  -  Biopython <at> mailman.open-bio.org
http://mailman.open-bio.org/mailman/listinfo/biopython









------------------------------------------------------------
This email message, including any attachments, is for the sole use of the
intended recipient(s) and may contain information that is proprietary,
confidential, and exempt from disclosure under applicable law. Any
unauthorized review, use, disclosure, or distribution is prohibited. If you
have received this email in error please notify the sender by return email
and delete the original message. Please note, the recipient should check
this email and any attachments for the presence of viruses. The
organization accepts no liability for any damage caused by any virus
transmitted by this email.
=================================




Hi Jared,
Thanks for editing the code.  Your improvements in both style and function
are greatly appreciated.  I was originally trying to mimic the output of
the bioPerl function.  However, I think your improvements help to maintain
the accuracy of the original sequence.  I will incorporate your method into
the final function.  I will also try to introduce different frames and
possibly different codon tables.



------------------------------


End of Biopython Digest, Vol 152, Issue 2
*****************************************


--
Dr. Carlos Peña
Laboratory of Genetics
Department of Biology
University of Turku
20014 Turku
FINLAND

* Associated Editor, Revista Peruana de Biología
http://revistasinvestigacion.unmsm.edu.pe/index.php/rpb
* The Nymphalidae Systematics Group
http://nymphalidae.net

