[Biopython] Biopython Enhancement Proposal (BEP): Alphabets

Peter Cock p.j.a.cock at googlemail.com
Sat Aug 11 13:39:51 UTC 2018


I've never used three-letter alphabets in sequence work. They
are defined in Bio.Alphabet and can be used with the array-based
MutableSeq, but not everything works properly.

I have used three-letter residue codes with PDB files. These
are particularly important with modified residues: there are
more than 26 of these, so a single letter does not work.
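
To illustrate the point about modified residues, here is a minimal
sketch in plain Python (not Biopython API). The names THREE_TO_ONE and
to_one_letter are hypothetical and exist only for this example; SEP
(phosphoserine) and MSE (selenomethionine) are PDB residue names for
modified residues. Collapsing to one letter loses them, which is why
the three-letter form is needed.

```python
# Hypothetical sketch, not Biopython code: the 20 standard amino acids
# keyed by their three-letter PDB residue names.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(residues, unknown="X"):
    """Collapse three-letter residue names to a one-letter string.

    Any name not in THREE_TO_ONE (for example a modified residue)
    becomes `unknown`, so the modification is lost.
    """
    return "".join(THREE_TO_ONE.get(name.upper(), unknown) for name in residues)


# A short peptide stored as three-letter codes, with two modified residues.
peptide = ["ALA", "SEP", "GLY", "MSE"]
print(to_one_letter(peptide))  # AXGX
```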

Peter


On Fri, Aug 10, 2018 at 9:35 AM, T. A. Wemyss <taw50 at cam.ac.uk> wrote:
> Dear all,
>
> Further to this email from Michiel on a Biopython Enhancement Proposal for
> Alphabets, I am writing to ask for anyone willing to share their uses of
> Alphabets. For example, are people using three-letter codes for storing
> proteins with modified residues, and has the current implementation
> of Alphabets caused them problems in any way? This will be useful for
> developing the best way that Alphabets can be implemented.
>
> Additionally, if anyone else would like to be more heavily involved in the
> BEP for Alphabets, please do get in touch.
>
> All the best,
> Thomas
>
> On Sat, Aug 4, 2018 at 3:04 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>>
>> Dear all,
>>
>> While sequence objects in Biopython have an associated alphabet, the
>> purpose of alphabets in Biopython is currently not well-defined.
>> I can imagine these three interpretations of their purpose:
>>
>> 1. To define how the sequence data is stored internally in a Seq object
>>    (i.e. what kind of objects are in seq.data);
>> 2. To define conceptually what the Seq object contains (e.g. this is a
>>    protein, or this is DNA, or this is DNA with or without methylation);
>> 3. To define how a Seq object should be presented to the user (e.g. as a
>>    single-letter string, a three-letter string, or something else).
>>
>> (and there may be others that I have overlooked).
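
The three interpretations listed above can be kept apart in code. The
following is only a sketch under stated assumptions: SketchSeq, tokens,
molecule_type and render are hypothetical names made up for this
example and are not part of Biopython or of any proposal in this
thread.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchSeq:
    """Hypothetical class separating the three roles an alphabet could play."""

    # (1) Storage: what kind of objects are held internally.
    tokens: Tuple[str, ...]
    # (2) Concept: what the sequence represents (e.g. "DNA", "protein").
    molecule_type: str

    # (3) Presentation: how the sequence is shown to the user.
    def render(self, sep: str = "") -> str:
        return sep.join(self.tokens)


dna = SketchSeq(tuple("GATTACA"), "DNA")
protein = SketchSeq(("Ala", "Gly", "Ser"), "protein")

print(dna.render())         # GATTACA
print(protein.render("-"))  # Ala-Gly-Ser
```

In this sketch the storage (a tuple of tokens) is the same for both
objects, while the concept and the presentation vary independently.
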
>>
>> To justify having alphabets as a part of Biopython, their purpose should
>> be clearly defined.
>>
>> Because of the complexity of alphabets and their use in Biopython, we felt
>> that it may be a good idea to have a PEP (Python Enhancement Proposal)-like
>> discussion to define the purpose of alphabets and their technical
>> implementation in Biopython. This would mean that somebody who is in favor
>> of having alphabets in Biopython would work out a proposal with all the
>> details to allow developers and users to think through the implications.
>>
>> Here you can find a description of PEPs and what should go in them:
>>
>> https://www.python.org/dev/peps/pep-0001/
>>
>> Not all of it is applicable to Biopython, but it may serve as a general
>> guideline.
>>
>> The Alphabet BEP (Biopython Enhancement Proposal) could be hosted on the
>> Biopython website so that everybody can follow the discussion.
>>
>> Since alphabets have been under discussion for more than 10 years, we are
>> thinking of putting a time limit on the proposal (e.g., until January 1st,
>> 2020), meaning that if no agreement on the proposal is reached by then,
>> alphabets would be removed from Biopython. This would give people who are in
>> favor of alphabets time to make their case, while guaranteeing that a
>> conclusion will be reached (either a well-defined and usable alphabet, or no
>> alphabet) within the next ~1.5 years.
>>
>> Any volunteers? Seq objects and therefore their alphabets are a key
>> feature of Biopython, and working through a BEP can give you the opportunity
>> to help design a major part of Biopython.
>>
>>
>>
>> Best,
>> -Michiel
>>
>> _______________________________________________
>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>> http://mailman.open-bio.org/mailman/listinfo/biopython
