[Biopython-dev] Project ideas for GSoC (or other student projects)
Peter Cock
p.j.a.cock at googlemail.com
Thu Mar 21 12:55:30 EDT 2013
On Wed, Mar 13, 2013 at 6:32 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
> I like Michiel's idea, and I'll suggest two more:
>
> 1. Codon alignment & analysis:
> - PAL2NAL-style conversion of unaligned nucleic acid sequences and a protein
> sequence alignment to a codon alignment. (Previously discussed)
e.g. https://github.com/peterjc/picobio/blob/master/align/align_back_trans.py
> - dN/dS and the related functions needed to calculate it.
> - Possible AlignIO or MultipleSeqAlignment tweaks to take full advantage of
> codon alignments, including validation (testing for frame shifts etc.)
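To make the PAL2NAL-style conversion concrete, here is a minimal sketch of threading one unaligned nucleotide sequence onto its gapped protein row. The function name thread_codons is hypothetical and not part of Biopython. It assumes the nucleotide sequence is in frame, has no trailing stop codon, and uses "-" as the gap character. The only validation is a length check, which catches simple frame shifts but does not confirm that the codons actually translate to the protein.

```python
def thread_codons(gapped_protein, nucleotides, gap="-"):
    """Thread an unaligned nucleotide sequence onto a gapped protein row.

    Hypothetical helper for illustration only (not a Biopython API).
    Each residue consumes one codon; each protein gap becomes three
    nucleotide gap characters.
    """
    ungapped_length = len(gapped_protein.replace(gap, ""))
    # Basic validation: a frame shift or a stray stop codon changes the length.
    if len(nucleotides) != 3 * ungapped_length:
        raise ValueError(
            "Expected %i nucleotides for %i residues, got %i"
            % (3 * ungapped_length, ungapped_length, len(nucleotides))
        )
    codons = []
    pos = 0
    for residue in gapped_protein:
        if residue == gap:
            codons.append(gap * 3)
        else:
            codons.append(nucleotides[pos:pos + 3])
            pos += 3
    return "".join(codons)


print(thread_codons("M-K", "ATGAAA"))
```

A real implementation would also want to check each codon against the residue using the appropriate genetic code, and decide how to handle stop codons and ambiguous bases.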
http://biopython.org/wiki/Google_Summer_of_Code#Codon_alignment_and_analysis
I see you've started fleshing this idea out on the wiki, which is great.
Right now it seems a little on the lightweight side - or is that deliberate
(to see if a student can take this idea and come up with a solid
project proposal in this area)? Things like model selection might
be a fun extension - I can think of a local expert who would be
great to get involved on the science side if he's interested.
Alternatively this could include doing some more general work
on the alignment object - for instance per-column annotation
for things like a consensus sequence - or an array-of-char
implementation as an alternative to the list-of-SeqRecords
we have now (with its poor column access speed).
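
As a rough illustration of the array-of-char idea, the sketch below stores the alignment in a single bytearray in column-major order, so fetching a column is one contiguous slice rather than a loop over every record. The class name CharArrayAlignment and its methods are hypothetical and not an existing Biopython API. It assumes ASCII sequences of equal length and ignores the per-record metadata that SeqRecord carries. The consensus method is a simple majority vote, shown only as an example of a per-column annotation.

```python
from collections import Counter


class CharArrayAlignment:
    """Hypothetical column-major alignment store (illustration only)."""

    def __init__(self, rows):
        rows = list(rows)
        if not rows:
            raise ValueError("Need at least one sequence")
        self.n_rows = len(rows)
        self.n_cols = len(rows[0])
        if any(len(row) != self.n_cols for row in rows):
            raise ValueError("Sequences must all be the same length")
        # Layout: all characters of column 0, then column 1, and so on.
        self._data = bytearray(self.n_rows * self.n_cols)
        for i, row in enumerate(rows):
            self._data[i::self.n_rows] = row.encode("ascii")

    def column(self, j):
        """Return column j as a string (a single contiguous slice)."""
        if not 0 <= j < self.n_cols:
            raise IndexError("Column index out of range")
        start = j * self.n_rows
        return self._data[start:start + self.n_rows].decode("ascii")

    def row(self, i):
        """Return row i as a string (a strided slice, so slower)."""
        if not 0 <= i < self.n_rows:
            raise IndexError("Row index out of range")
        return self._data[i::self.n_rows].decode("ascii")

    def consensus(self):
        """Per-column annotation: the most common character per column."""
        return "".join(
            Counter(self.column(j)).most_common(1)[0][0]
            for j in range(self.n_cols)
        )


aln = CharArrayAlignment(["ACGT", "ACGA", "TCGA"])
print(aln.column(0), aln.row(2), aln.consensus())
```

The trade-off is that row access becomes the strided operation instead, so which layout is better would depend on whether the workload is dominated by column or row access.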
Peter