[BioPython] Translation of ambiguous codons like NNN and TAN

Bruce Southey bsouthey at gmail.com
Tue Jul 22 13:20:37 UTC 2008


Peter wrote:
> On Mon, Jul 21, 2008 at 10:04 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>   
>>> The relevant document here is the IUPAC 'Nomenclature for Incompletely
>>> Specified Bases in Nucleic Acid Sequences'
>>>  (http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html).
>>>
>>> Table 4 ...
>>>       
>> I'd seen that table, but hadn't interpreted it in quite the same way
>> as you seem to have.  ...
>> So on the basis of Table 4, there is no reason to infer that IUPAC
>> explicitly suggest NNN should be translated as X.
>>     
>
> Re-reading that, I can be pedantic sometimes, can't I?
>   
Sure, why not? Someone must be pedantic; otherwise all these corner 
cases get missed and lead to bigger problems.

> Putting aside our differences of opinion about exactly what the IUPAC
> Table 4 is intended to convey, do you agree that Biopython's
> Bio.Seq.translate() function should turn NNN and TAN into X?
>
> Thanks
>
> Peter
>
>   
Yes, because if you accept that NNN should be translated as X then, by 
the same reasoning, you must also accept translating TAN into X, along 
with similar cases.

For Biopython, there could be a flag or similar that causes translate() 
to either raise an error for any ambiguous codon or just continue, 
using 'X' for that codon.
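
To make the idea concrete, here is a minimal stand-alone sketch of the two
behaviours. It is not Biopython's actual code or API: the function name
translate_sketch and the strict flag are hypothetical, and it assumes the
standard genetic code only. It also treats every codon containing a letter
other than A, C, G or T as ambiguous, even one such as GGN that happens to
encode a single amino acid.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_sketch(seq, strict=False):
    """Translate a DNA string codon by codon (hypothetical helper).

    strict=False: any codon not in the unambiguous table becomes 'X'.
    strict=True:  such a codon raises ValueError instead.
    """
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None:
            if strict:
                raise ValueError(
                    f"ambiguous codon {codon!r} at position {i}"
                )
            aa = "X"  # lenient mode: carry on with an unknown residue
        protein.append(aa)
    return "".join(protein)
```

With the default, translate_sketch("ATGNNNTANTAA") gives "MXX*"; with
strict=True the same call raises ValueError at the NNN codon.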

Bruce


