[Biojava-l] GSoC 2012 - how to get started with your proposals

Andreas Prlic andreas at sdsc.edu
Sun Mar 18 16:37:06 UTC 2012


Hi,

It is great to see so much interest in GSoC again this year.  To get
started with a proposal, I recommend looking at the BioJava
project proposals from the last two years (they are on the wiki) to
see what kind of projects got funded and how those proposals were
written. Think about what you would like to work on. Get a copy of
BioJava and see how related features work. Come up with a plan
for how to extend them.

We are fairly flexible about what kind of projects we will run
this summer; this really depends on the submitted project
proposals. All proposals will be compared and ranked together with
those from the other Bio* projects. As such, a good proposal is key
to getting funded.

A good proposal shows

- the motivation of the student
- that the candidate is qualified to do what he or she is proposing
- that the project adds useful new functionality to BioJava
- what the possible risks are and what to do about them

It is difficult to answer questions like "how should I perform this or
that project?" There is more than one possible path, and the best
answer depends on your skills and interests.
Overall I recommend picking a project on a topic that is close to your
(future?) thesis, or that is of particular interest to you.

Here are a couple more thoughts which are project specific:

-  The best projects are those that you come up with yourself. If you
want to distinguish yours from every other proposal, suggest something
we have not thought of yet.

- File parsers:

If you want to work on file parsers, take a look at the existing ones.
What features do they provide? How can they be extended? For example,
if you want to work on the CATH parser, take a look at how the SCOP
parser works. What features are available around it (e.g. access to
domains), and how could something similar be set up for CATH? Look at
how the CATH website provides its files.
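
As a rough illustration of the kind of building block to look for when
reading an existing parser, here is a minimal sketch of a line-based
parser for a domain classification file. This is not BioJava API: the
class names, the record type, and the line layout (a domain identifier
followed by four classification numbers, with '#' comment lines) are
assumptions made up for this example. Check the real file formats
before relying on any of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not BioJava API. The line layout is an assumption.
class DomainLineParserSketch {

    /** One classified domain: an identifier plus a dotted classification code. */
    static final class DomainRecord {
        final String domainId;
        final String classification;

        DomainRecord(String domainId, String classification) {
            this.domainId = domainId;
            this.classification = classification;
        }
    }

    /**
     * Parses lines of the assumed form "id n1 n2 n3 n4 [more fields]".
     * Blank lines and lines starting with '#' are skipped.
     */
    static List<DomainRecord> parse(List<String> lines) {
        List<DomainRecord> records = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] f = trimmed.split("\\s+");
            if (f.length < 5) {
                throw new IllegalArgumentException("Expected at least 5 fields: " + line);
            }
            String code = f[1] + "." + f[2] + "." + f[3] + "." + f[4];
            records.add(new DomainRecord(f[0], code));
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up input lines, for illustration only.
        List<String> lines = Arrays.asList("# example comment", "1abcA01 1 10 8 10");
        for (DomainRecord r : parse(lines)) {
            System.out.println(r.domainId + " -> " + r.classification);
        }
    }
}
```

Keeping the parsing of a single line separate from file or network
access makes it easier to test, and easier to compare with how an
existing parser is organized.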

- Porting of algorithms:

There are several possible approaches to this. I recommend that you
have some background in both C and Java. Get a copy of the algorithm
you want to port, compile it, and take a look at the source. There are
several ways to proceed with the actual port, and having a good
strategy is key for this proposal. Perhaps try your strategy on a
simple test case to see how it might work.
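
To make the "simple test case" idea concrete, here is a minimal sketch
of porting one tiny C function to Java and checking it against
reference values. The C function is an illustrative string hash chosen
for this example, not code from any algorithm discussed here, and the
class and method names are hypothetical. It shows one typical pitfall:
C unsigned types have no direct Java equivalent. The sketch assumes a
32-bit unsigned int on the C side.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a C-to-Java port of a tiny function.
class PortingSketch {

    // C original (illustrative only):
    //   unsigned int hash(const unsigned char *s) {
    //       unsigned int h = 5381;
    //       while (*s) h = h * 33 + *s++;
    //       return h;
    //   }

    /** Java port. Unlike the C version, it hashes the whole array rather than stopping at a NUL byte. */
    static long hash(byte[] s) {
        int h = 5381;                 // Java int arithmetic wraps modulo 2^32
        for (byte b : s) {
            h = h * 33 + (b & 0xFF);  // mask, because Java bytes are signed
        }
        return h & 0xFFFFFFFFL;       // reinterpret the bits as an unsigned 32-bit value
    }

    public static void main(String[] args) {
        // Reference values worked out by hand from the C code above:
        // "" -> 5381, "a" -> 5381 * 33 + 97 = 177670
        System.out.println(hash("".getBytes(StandardCharsets.US_ASCII)) == 5381L);
        System.out.println(hash("a".getBytes(StandardCharsets.US_ASCII)) == 177670L);
    }
}
```

For a real port, the reference values would come from running the
compiled C program on the same inputs, which is one reason to compile
the original first.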

- BioJava in the cloud:

The goal here is parallelization of existing code. What parts of
BioJava are suitable for this? How can they be parallelized and moved
to current cloud infrastructure? There is a lot of material available
online that will be helpful here.
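
As a starting point for thinking about what is "suitable", work that
splits into independent per-item tasks is usually the easiest to
parallelize. The sketch below runs a per-sequence computation on a
thread pool using only the standard java.util.concurrent classes. The
GC-content function and all names are stand-ins invented for this
example, not BioJava code, and moving such tasks onto cloud
infrastructure would be a further step this sketch does not cover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: independent per-sequence tasks on a thread pool.
class ParallelSketch {

    /** Fraction of G and C characters in the sequence; 0.0 for an empty sequence. */
    static double gcContent(String seq) {
        if (seq.isEmpty()) {
            return 0.0;
        }
        int gc = 0;
        for (int i = 0; i < seq.length(); i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (c == 'G' || c == 'C') {
                gc++;
            }
        }
        return (double) gc / seq.length();
    }

    /** Computes gcContent for every sequence in parallel, returning results in input order. */
    static List<Double> gcContentParallel(List<String> seqs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (final String s : seqs) {
                Callable<Double> task = () -> gcContent(s);
                tasks.add(task);
            }
            List<Double> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and keeps the input order
            for (Future<Double> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> seqs = Arrays.asList("ATGC", "GGCC", "ATAT");
        System.out.println(gcContentParallel(seqs, 2));
    }
}
```

Code that shares mutable state between items is much harder to treat
this way, so checking for such state is a reasonable first step when
deciding which parts to parallelize.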

Andreas


