[BioPython] question regarding unicode, biopython Seq object, DAS

Ann Loraine aloraine at gmail.com
Sat Dec 9 22:38:59 UTC 2006


Thanks very much!

Also, I found out that strings have an encode method I never noticed before:

>>> foo = u'foo'
>>> foo
u'foo'
>>> foo.encode('ascii')
'foo'
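
A minimal sketch of the same idea wrapped in a helper, for sequence data that may contain stray non-ASCII characters. The name to_ascii is hypothetical and not part of Biopython or the standard library. Under Python 2, both str(s) and s.encode('ascii') raise UnicodeEncodeError on such input; the helper below only reports which character caused the failure. Under Python 3 the result is a bytes object rather than a str.

```python
def to_ascii(text):
    """Encode a unicode string as ASCII, failing loudly on non-ASCII input.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode('ascii')
    except UnicodeEncodeError as err:
        # err.start is the index of the first character that could not be encoded
        raise ValueError("non-ASCII character at position %d: %r"
                         % (err.start, text[err.start]))


# Plain sequence data converts cleanly
print(to_ascii(u'atcg'))
```

Calling to_ascii on a string such as u'atc\xe9' raises a ValueError that names position 3, which can make a bad character in parser output easier to track down.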

Yours,

Ann

On 12/9/06, Michiel de Hoon <mdehoon at c2b2.columbia.edu> wrote:
> Ann Loraine wrote:
> > My sax parser delivers character (sequence) data as unicode, but when
> > I make a Seq object from the unicode string and then try to reverse
> > complement the sequence, I get an exception:
>
> Can you convert the unicode string to a regular string before creating
> the Seq object? As in
>
>  >>> from Bio.Alphabet import IUPAC
>  >>> from Bio.Seq import Seq
>  >>> s = u'atcg'
>  >>> s = str(s)
>  >>> s = Seq(s, IUPAC.unambiguous_dna)
>  >>> s.reverse_complement()
> Seq('cgat', IUPACUnambiguousDNA())
>  >>>
>
> By the way, you can also use reverse_complement on a string directly:
>
>  >>> from Bio.Seq import reverse_complement
>  >>> s = 'atcg'
>  >>> reverse_complement(s)
> 'cgat'
>  >>>
>
>
> --Michiel.
>


-- 
Ann Loraine
Assistant Professor
Departments of Genetics, Biostatistics, and
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org


