[Biopython] IUPAC code contribution

Peter Cock p.j.a.cock at googlemail.com
Mon Jun 6 22:18:29 UTC 2011


On Mon, Jun 6, 2011 at 10:52 PM, Michal <mictadlo at gmail.com> wrote:
> Hello,
> I would like to contribute the following IUPAC function to Biopython:
>
> def iupac_base(alignment):
>    IUPAC = {
>      ord('N'): 'N',
>      ord('G'): 'G',
>      ord('A'): 'A',
>      ord('T'): 'T',
>      ord('C'): 'C',
>      ord('G') + ord('A'): 'R',
>      ord('T') + ord('C'): 'Y',
>      ord('A') + ord('C'): 'M',
>      ord('G') + ord('T'): 'K',
>      ord('G') + ord('C'): 'S',
>      ord('A') + ord('T'): 'W',
>      ord('A') + ord('C') + ord('T'): 'H',
>      ord('G') + ord('T') + ord('C'): 'B',
>      ord('G') + ord('C') + ord('A'): 'V',
>      ord('G') + ord('A') + ord('T'): 'D',
>      ord('G') + ord('A') + ord('T') + ord('C'): 'N'}
>
>    return IUPAC[sum(map(ord, {}.fromkeys(alignment).keys()))]
>
>
> a = iupac_base(['A','A','T','T','T'])
>
> Cheers,
> Michal

Well, it would need some documentation at least (e.g. a docstring).
What is it meant to do? It looks like you just want a reverse lookup
of Bio.Data.IUPACData.ambiguous_dna_values - e.g.

from Bio.Data.IUPACData import ambiguous_dna_values
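# Map each set of unambiguous bases back to its IUPAC letter.
# .items() works on both Python 2 and Python 3 (iteritems() is Python 2 only).
rev_map = dict((frozenset(v), k) for (k, v) in ambiguous_dna_values.items())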
assert rev_map[frozenset('CG')] == rev_map[frozenset('GC')]
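
As a rough sketch of how that reverse lookup could replace the sum-of-ord
table, a documented version of the function might look something like the
following. The table here is a hard-coded copy of the standard IUPAC
nucleotide codes so that the snippet runs without Biopython installed (the
library table may carry extra synonyms), and the names AMBIGUOUS_DNA_VALUES
and REV_MAP are only illustrative. Note one deliberate difference from the
posted function: ambiguity codes in the input are expanded first, so a column
containing 'N' gives 'N' rather than a KeyError.

```python
# Hard-coded copy of the standard IUPAC nucleotide ambiguity codes, so this
# sketch is self-contained (assumption: stands in for
# Bio.Data.IUPACData.ambiguous_dna_values, which may list extra synonyms).
AMBIGUOUS_DNA_VALUES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",
}

# Reverse lookup: set of unambiguous bases -> single-letter IUPAC code.
REV_MAP = {frozenset(v): k for k, v in AMBIGUOUS_DNA_VALUES.items()}


def iupac_base(column):
    """Return the IUPAC code covering every base in an alignment column.

    ``column`` is any iterable of single-letter nucleotide codes.
    Ambiguity codes in the input are expanded to their underlying bases
    before the lookup. An empty column or an unknown letter raises KeyError.
    """
    bases = set()
    for letter in column:
        bases.update(AMBIGUOUS_DNA_VALUES[letter.upper()])
    return REV_MAP[frozenset(bases)]


print(iupac_base(["A", "A", "T", "T", "T"]))  # W
```

Using frozenset keys means the order and multiplicity of the bases do not
matter, which is what the dict.fromkeys() trick in the original was doing.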

Peter



