[Biopython-dev] RNA Alphabet: request for comments
Kristian Rother
krother at rubor.de
Wed Jun 16 09:03:37 UTC 2010
Hi Peter,
> Why do you need the _set_sequence method? Why not just put that
> small piece of code inside the __init__ method?
In _set_sequence there'll be a small parser taking care of modifications
where the one-letter abbreviations do not suffice. E.g. the sequence
"CCC022UCCC"
(22U is a 5-hydroxyuridine) would be parsed into a list of
RNAAlphabetEntries:
['C','C','C','22U','C','C','C']
So the code will grow a little, but the basic idea stays the same.
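
A minimal sketch of what such a parser could look like. It is not the code
from the commit: it assumes, based only on the example above, that a '0'
in the string introduces a three-character modification code. The function
name parse_modified_rna and its parameters are hypothetical, and the entries
are plain strings here rather than RNAAlphabetEntries.

```python
def parse_modified_rna(text, marker="0", code_length=3):
    """Split an RNA string into one entry per nucleotide.

    ASSUMPTION: `marker` ('0') announces a modification code of exactly
    `code_length` characters. This rule is inferred from the single
    example "CCC022UCCC" and may not match the real nomenclature.
    """
    entries = []
    i = 0
    while i < len(text):
        if text[i] == marker:
            # Take the characters right after the marker as one entry.
            code = text[i + 1 : i + 1 + code_length]
            if len(code) < code_length:
                raise ValueError(f"truncated modification code at position {i}")
            entries.append(code)
            i += 1 + code_length
        else:
            # Ordinary one-letter nucleotide.
            entries.append(text[i])
            i += 1
    return entries


print(parse_modified_rna("CCC022UCCC"))
# ['C', 'C', 'C', '22U', 'C', 'C', 'C']
```

Keeping this in a helper separate from __init__ means the tokenizing rule
can grow without touching the constructor.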
If someone wants a one-letter representation, it could be "CCCxCCC", but
this is degenerate because 'x' is used for several modifications.
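
To illustrate why the one-letter form loses information, here is a small
sketch that collapses a parsed entry list back to one letter per position.
The function name to_one_letter is hypothetical, and 'XYZ' is a made-up
placeholder code, not a real modification.

```python
def to_one_letter(entries, placeholder="x"):
    """Join entries into a string, replacing every multi-character
    modification code with the same placeholder letter."""
    return "".join(e if len(e) == 1 else placeholder for e in entries)


with_22u = to_one_letter(['C', 'C', 'C', '22U', 'C', 'C', 'C'])
with_xyz = to_one_letter(['C', 'C', 'C', 'XYZ', 'C', 'C', 'C'])  # 'XYZ' is made up

print(with_22u)              # CCCxCCC
print(with_22u == with_xyz)  # True: the two modifications are indistinguishable
```

Since different modifications map to the same string, the one-letter form
cannot be converted back to the original entry list.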
Best Regards,
Kristian
>>> Why not create a Seq subclass instead of your class
>>> ModifiedRNAString(str)?
>>
>> This turned out to be a lot simpler. Worked right away. New commit at:
>>
>> http://github.com/krother/biopython/commit/b0a6071f2b08a4f9bfee33a8d675c0e21b60ba70
>>
>> more comments welcome.
>
> Why do you need the _set_sequence method? Why not just put that
> small piece of code inside the __init__ method?
>
>> Next steps from my side would be:
>>
>> 1) add all modifications to the Alphabet.
>> 2) add some RNA-specific methods.
>> 3) add more tests.
>> 4) sync with latest master branch.
>> 5) request code merge.
>>
>> Best regards,
>> Kristian
>
> If this works out we should look at doing a Protein 3-letter code version
> for use with PDB sequences (I'm thinking about the modified amino acids).
>
> Peter
>
>