[BioPython] question regarding unicode, biopython Seq object, DAS
Ann Loraine
aloraine at gmail.com
Sat Dec 9 22:38:59 UTC 2006
Thanks very much!
Also, I found out that unicode strings have an encode method I had never noticed before:
>>> foo = u'foo'
>>> foo
u'foo'
>>> foo.encode('ascii')
'foo'
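A minimal sketch of how that encode call might be wrapped so that stray non-ASCII characters from the parser are reported clearly instead of surfacing later as an obscure exception. The helper name ascii_sequence is made up for illustration and is not part of Biopython. Note that on Python 2 the result is a regular str, while on Python 3 encode returns a bytes object.

```python
def ascii_sequence(text):
    """Encode text as ASCII, raising ValueError on any non-ASCII character.

    Hypothetical helper for illustration only; not a Biopython function.
    """
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as err:
        # err.start and err.end delimit the offending slice of the input.
        bad = text[err.start:err.end]
        raise ValueError(
            "non-ASCII character %r at position %d" % (bad, err.start)
        )


# Plain sequence data encodes without complaint.
print(ascii_sequence(u"atcg"))
```

The error message includes the position, which should make it easier to trace a bad character back to the XML the sax parser was reading.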
Yours,
Ann
On 12/9/06, Michiel de Hoon <mdehoon at c2b2.columbia.edu> wrote:
> Ann Loraine wrote:
> > My sax parser delivers character (sequence) data as unicode, but when
> > I make a Seq object from the unicode string and then try to reverse
> > complement the sequence, I get an exception:
>
> Can you convert the unicode string to a regular string before creating
> the Seq object? As in
>
> >>> from Bio.Alphabet import IUPAC
> >>> from Bio.Seq import Seq
> >>> s = u'atcg'
> >>> s = str(s)
> >>> s = Seq(s, IUPAC.unambiguous_dna)
> >>> s.reverse_complement()
> Seq('cgat', IUPACUnambiguousDNA())
> >>>
>
> By the way, you can also use reverse_complement on a string directly:
>
> >>> from Bio.Seq import reverse_complement
> >>> s = 'atcg'
> >>> reverse_complement(s)
> 'cgat'
> >>>
>
>
> --Michiel.
>
--
Ann Loraine
Assistant Professor
Departments of Genetics, Biostatistics, and
Section on Statistical Genetics
University of Alabama at Birmingham
http://www.ssg.uab.edu
http://www.transvar.org