[BioPython] Alphabets

Iddo Friedberg idoerg@cc.huji.ac.il
Mon, 11 Jun 2001 18:06:47 +0300 (GMT+0300)


Hi all,

I ran into a problem while working with alphabets in the Align module.
This module requires the user to use a gapped alphabet, since alignments
contain gap characters. Now, I would like to check whether a given
alignment's alphabet is a protein or a nucleic acid alphabet. For an
ungapped alphabet, I would simply use the contains method, which is a
wrapper for isinstance(other, self.__class__). However, the gapped
alphabet class (Gapped) does not inherit from any of the "master"
alphabet classes, but rather from AlphabetEncoder. In Gapped, the contains
method compares the gap_char of the two alphabets in addition to doing the
isinstance check. This, quite rightly, raises an exception when doing the
following:

from Bio import Alphabet
from Bio.Alphabet import IUPAC
gapped_prot = Alphabet.Gapped(IUPAC.IUPACProtein(), '-')
gapped_prot.contains(IUPAC.IUPACProtein())  # raises: argument has no gap_char

The exception is raised because IUPAC.IUPACProtein() does not have a
gap_char attribute.
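
To make the failure easy to reproduce without the library installed, here is
a minimal sketch using stand-in classes. These are simplified assumptions
based on the description above, not the actual Bio.Alphabet source; the
ProteinAlphabet class and the try_contains helper are hypothetical names
used only for illustration.

```python
# Stand-in classes: a simplified sketch of the hierarchy described above,
# NOT the actual Bio.Alphabet source.
class Alphabet:
    def contains(self, other):
        # Ungapped check: a plain isinstance test.
        return isinstance(other, self.__class__)


class ProteinAlphabet(Alphabet):
    pass


class AlphabetEncoder:
    def __init__(self, alphabet):
        self.alphabet = alphabet


class Gapped(AlphabetEncoder):
    def __init__(self, alphabet, gap_char="-"):
        AlphabetEncoder.__init__(self, alphabet)
        self.gap_char = gap_char

    def contains(self, other):
        # Compares gap characters first, so `other` must itself be gapped.
        return (other.gap_char == self.gap_char
                and self.alphabet.contains(other.alphabet))


def try_contains(container, other):
    """Return the result of contains(), or the exception name if it fails."""
    try:
        return container.contains(other)
    except AttributeError as exc:
        return type(exc).__name__


protein = ProteinAlphabet()
gapped_prot = Gapped(protein, "-")
```

In this sketch, try_contains(gapped_prot, protein) hits the missing gap_char
attribute, while comparing two gapped protein alphabets with the same gap
character succeeds.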

1) Do we want to implement this type checking, that is, checking whether a
gapped alphabet is a protein alphabet, a nucleic acid alphabet, or
something else?

2) If so, how should this be implemented? I have a couple of crude ideas,
but I'd like to hear people's views on (1) before going on with this. Or
maybe I'm missing something here.
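
For the sake of discussion, one possible shape for such a check is to peel
off the encoder wrappers and test the underlying alphabet. This is only a
sketch under assumptions: the classes below are simplified stand-ins rather
than the real Bio.Alphabet code, and base_alphabet and is_protein are
hypothetical helper names, not existing functions.

```python
# Hypothetical stand-ins, simplified from the hierarchy described above.
class Alphabet:
    pass


class ProteinAlphabet(Alphabet):
    pass


class NucleotideAlphabet(Alphabet):
    pass


class AlphabetEncoder:
    def __init__(self, alphabet):
        self.alphabet = alphabet


class Gapped(AlphabetEncoder):
    def __init__(self, alphabet, gap_char="-"):
        AlphabetEncoder.__init__(self, alphabet)
        self.gap_char = gap_char


def base_alphabet(alphabet):
    """Peel off encoder wrappers until an unwrapped alphabet remains."""
    while isinstance(alphabet, AlphabetEncoder):
        alphabet = alphabet.alphabet
    return alphabet


def is_protein(alphabet):
    """True if the alphabet, gapped or not, wraps a protein alphabet."""
    return isinstance(base_alphabet(alphabet), ProteinAlphabet)
```

With this, is_protein(Gapped(ProteinAlphabet(), "-")) and
is_protein(ProteinAlphabet()) give the same answer, and no gap_char lookup
is needed on the ungapped alphabet.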

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/