[Biopython-dev] New Biopython release coming up / Alphabets
Marc Colosimo
mcolosimo at mitre.org
Tue Jul 11 12:01:15 EDT 2006
On Jul 6, 2006, at 12:39 PM, Michiel Jan Laurens de Hoon wrote:
> Michael Hoffman wrote:
>> [Peter]
>>> But to be honest, I have generally used plain strings in my own
>>> programs, and meddled with alphabets only when needed (e.g. for
>>> translating from DNA to protein sequences).
>
> Note that there is a function "translate" in Bio.Seq that translates DNA
> to protein using plain strings.
>>
>> I agree. In general, I think that the alphabet stuff adds unnecessary
>> complexity to perhaps 95 % of the sort of things I would do with
>> Biopython. But as it stands I usually use strs myself instead.
>
> It appears that most people (myself included) use plain strings instead
> of Seq objects (= string + Alphabet). We should check on the biopython
> mailing list if anybody really needs alphabets, and if not get rid of
> them (after the upcoming Brooklyn-release (1.42) though).
>
> --Michiel.
I strongly argue against removing the alphabets. You would lose
all of the cool features of Seq objects (complement,
reverse_complement). There are similar functions under Bio.SeqUtils,
but those are marked "Deprecated". From just looking around, I think
removing the alphabets would break many things.
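To make the trade-off concrete, here is a minimal sketch of a reverse
complement on a plain string. The helper name reverse_complement_dna is
made up for illustration; it is not an existing Biopython function, and
it only handles the four unambiguous DNA bases.

```python
# Hypothetical helper (not part of Biopython): reverse complement of a
# plain DNA string, covering only the unambiguous bases A, C, G, T.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_dna(sequence):
    """Complement each base, then reverse the string."""
    return sequence.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement_dna("ATGC"))
```

Note that nothing in the string itself says it is DNA. Passing an RNA or
protein string to this helper silently returns a wrong answer, and that
is the kind of check an alphabet attached to the sequence can make.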
Having said that, I do find the alphabets a pain to deal with, but that
might have more to do with the structure/layout of the classes. My simple
suggestion is to fix/change the base Alphabet classes in
Bio.Alphabet.__init__. I am trying to think of a way that we can have
a "true" GenericAlphabet class (not generic_alphabet = Alphabet() )
and use just strings. The problem is that I don't know if just
using letters = None (or letters = []) will cause problems down the
road (checks like "if x in alphabet.letters" are used in many classes).
Also, I'm really confused as to what is going on in IUPAC.py with the
default_manager stuff and _bootstrap.
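The worry about letters = None versus letters = [] can be shown with a
small sketch. The classes and functions below are stand-ins written for
this example, not the real Bio.Alphabet code, and the "None means any
letter" convention in safe_contains_letter is only an assumption about
how a generic alphabet might be treated.

```python
class Alphabet:
    """Stand-in for a generic base alphabet (not the real Bio.Alphabet)."""
    letters = None


class EmptyListAlphabet(Alphabet):
    """Same idea, but with an empty list instead of None."""
    letters = []


class DNAAlphabet(Alphabet):
    """A concrete alphabet with a defined set of letters."""
    letters = "ACGT"


def contains_letter(alphabet, x):
    # The membership test used directly on the letters attribute.
    return x in alphabet.letters


def safe_contains_letter(alphabet, x):
    # Assumed convention: letters = None means "no restriction".
    if alphabet.letters is None:
        return True
    return x in alphabet.letters
```

With letters = None, contains_letter raises a TypeError, because None
does not support the "in" test. With letters = [], it runs but returns
False for every letter, so a generic alphabet would reject everything.
Either way, each class doing this test would need a guard like the one
in safe_contains_letter.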
Marc