[Biopython] Alphabet question

Wed May 12 09:58:11 UTC 2010

On Wed, May 12, 2010 at 12:34 AM, Eric Talevich <eric.talevich at gmail.com> wrote:
> On Tue, May 11, 2010 at 7:11 PM, Sebastian Bassi <sbassi at gmail.com> wrote:
>
>> I tried this:
>>
>> >>> from Bio.Seq import Seq
>> >>> from Bio.Alphabet import IUPAC
>> >>> seq_1 = Seq('GATCGATGGGCCTATATAGGA', IUPAC.unambiguous_dna)
>> >>> seq_1
>> Seq('GATCGATGGGCCTATATAGGA', IUPACUnambiguousDNA())
>>
>> I wonder why the alphabet argument is entered as IUPAC.unambiguous_dna
>> but when I see the object, this argument is printed as
>> IUPACUnambiguousDNA().
>>
>
> Hi Sebastian,
>
> The IUPAC.unambiguous_dna object is an instance of IUPACUnambiguousDNA, already
> instantiated. It is defined in the source code of Bio/Alphabet/IUPAC.py as:
>
> unambiguous_dna = IUPACUnambiguousDNA()
>
> So you could do:
>
> >>> from Bio.Seq import Seq
> >>> from Bio.Alphabet import IUPAC
> >>> seq_1 = Seq('GATCGATGGGCCTATATAGGA', IUPAC.IUPACUnambiguousDNA())
> >>> seq_1
> Seq('GATCGATGGGCCTATATAGGA', IUPACUnambiguousDNA())
>
> Looking at it that way, the repr() is kind of deceptive. Its output isn't
> valid input unless you've imported the IUPACUnambiguousDNA class directly.

Eric is right that evaluating the repr should work if you import the class
first, but please note that the repr of a Seq object truncates the sequence
for longer examples. The aim isn't to support eval(repr(obj)), but to be
useful for debugging or working at the Python prompt.
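
To make both points concrete, here is a minimal sketch using stand-in
classes. These are not Biopython's real classes: the names are borrowed for
illustration only, and the 60-character cutoff and the truncation format are
assumptions rather than what Bio.Seq necessarily does.

```python
# Hypothetical stand-ins that mimic the pattern; NOT Biopython's source code.
class IUPACUnambiguousDNA:
    letters = "GATC"

    def __repr__(self):
        # Print the class name plus "()", like the output quoted above.
        return f"{self.__class__.__name__}()"


# Module-level, ready-made instance, mirroring the line quoted from IUPAC.py.
unambiguous_dna = IUPACUnambiguousDNA()


class Seq:
    def __init__(self, data, alphabet):
        self.data = data
        self.alphabet = alphabet

    def __repr__(self):
        # Shorten long sequences for display (cutoff of 60 is an assumption).
        if len(self.data) > 60:
            shown = self.data[:54] + "..." + self.data[-3:]
        else:
            shown = self.data
        return f"{self.__class__.__name__}({shown!r}, {self.alphabet!r})"


short = Seq("GATCGATGGGCCTATATAGGA", unambiguous_dna)
long_seq = Seq("GATC" * 30, unambiguous_dna)
print(repr(short))
print(repr(long_seq))

# eval() of the repr fails when the class name is not in scope...
try:
    eval(repr(short), {"Seq": Seq})
except NameError as err:
    print("eval failed:", err)

# ...and works once the class itself is available under that name.
scope = {"Seq": Seq, "IUPACUnambiguousDNA": IUPACUnambiguousDNA}
rebuilt = eval(repr(short), scope)
print(rebuilt.data == short.data)

# For a truncated repr, eval() succeeds but the sequence is not the original.
rebuilt_long = eval(repr(long_seq), scope)
print(rebuilt_long.data == long_seq.data)
```

The instance name (unambiguous_dna) never appears in the output because the
repr is built from the class name, so the printed form only round-trips when
that class is importable by name and the sequence was short enough not to be
truncated.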

Peter