[GSoC] GSoC 2013 is ON
Pjotr Prins
pjotr2010 at thebird.nl
Tue Apr 2 04:36:52 EDT 2013
Great idea. Can you format it as a project idea on your Wiki - take
the other ideas as an example. You can leave the two options open and
see which one students react to.
BTW We only have the 'right' students :). The selection process is
pretty strong.
There is some ongoing discussion about core and non-core OBF projects,
but I think since BioHaskell is going strong there will be little
against adding one project idea. If it brings a really competent
student we all gain.
There is a meeting between GSoC and the OBF board on the 9th, right
after we get or do not get accepted as a mentoring organisation. I'll
make a case for inclusion of your project idea into the program.
Adding it to the wiki will help.
Pj.
On Tue, Apr 02, 2013 at 10:03:58AM +0200, Ketil Malde wrote:
>
> [CC everybody including the biohaskell list. Let me know if any of you
> want off. :-) ]
>
> Pjotr Prins <pjotr2010 at thebird.nl> writes:
>
> > http://www.open-bio.org/wiki/Google_Summer_of_Code
>
> > For Biopython (3x), BioRuby (5x) and BioJava (4x) I found project ideas.
>
> > The others are missing.
>
> > There is still a (rather small) window of opportunity for adding
> > ideas.
>
> I have one thing that might work well as a SOC project, if the right
> student could be found.
>
> Basically, I and a colleague recently developed and published a method
> and implementation for more sensitive pairwise alignments. The paper is
> here, I think (PLoS ONE seems to be down atm):
> http://dx.plos.org/10.1371/journal.pone.0054422
>
> I'm really happy about the results, if nothing else, check the SCOP
> benchmark. Although it's difficult to construct a good test case using
> more complex methods (training sets for HMMs and whatnot) I don't know
> anything that is as good as this. We're using it for annotation of
> genes.
>
> The current implementation is in Haskell, and although it works
> correctly, it is a bit slow, and more problematic, it consumes too much
> memory (so going multi-threaded, although pretty easy, won't be of any
> help).
>
> I would like to make this into a less resource intensive (and thus more
> practical) tool, and there are two ways I can think of to go about this:
>
> 1) Optimize the Haskell program
>
> 2) Reimplement the algorithm (or parts of it) in a different language
>
> Advantages of 1:
>
> * Already have a working program, and the type system makes it easy to
> refactor without introducing errors.
> * Haskell supports lots of good multi-threading programming models (like
> STM)
> * I know Haskell pretty well, and will hopefully be able to mentor.
>
> Disadvantages:
>
> * Haskell has some good debugging tools, but they tend to work really
> poorly for programs that use a lot of memory (i.e. it takes a long
> time to generate profiles)
> * Needs somebody with a bit (or a lot) of experience optimizing Haskell,
> and good knowledge of high-perf libraries (like vector)
>
> Advantages of 2:
>
> * Easier to get a student with adequate skills.
> * More predictable performance models in other languages.
> * Easier to compile and install for many users.
>
> Disadvantages:
>
> * Ideally, should know enough Haskell to read and understand the code.
> * Likely needs a co-mentor with knowledge of the language in question.
>
> Is this something I could or should submit as a task?
>
> -k
> --
> If I haven't seen further, it is by standing in the footprints of giants