[BioRuby] Google Summer of Code BioRuby project: Looking for co-mentors

Wed Apr 1 01:16:01 UTC 2009

Hi:

As you have probably read in previous messages, NESCent is again 
participating in the Google Summer of Code program 
(http://hackathon.nescent.org/Phyloinformatics_Summer_of_Code_2009).

I am serving as the mentor of a project entitled "phyloXML support in 
BioRuby" (see: 
https://www.nescent.org/wg/phyloinformatics/index.php?title=Phyloinformatics_Summer_of_Code_2009#phyloXML_support_in_BioRuby).

To improve the project's chances of being accepted, and to ensure a 
successful outcome if it is, I am looking for people willing to serve 
as co-mentors.

Christian

PS: Here is the full description of the project:

      phyloXML support in BioRuby

Rationale 
    Evolutionary trees are central to comparative genomics studies.
    Trees used in this context are usually annotated with a variety of
    data elements, such as taxonomic information, genome-related data
    (gene names, functional annotations) and gene duplication events, as
    well as information related to the evolutionary tree itself (branch
    lengths, support values). phyloXML is an XML data exchange standard
    that can represent this data. Trees in phyloXML format can be
    displayed and analyzed with Archaeopteryx
    <http://www.phylosoft.org/archaeopteryx/> (the successor to ATV
    <http://bioinformatics.oxfordjournals.org/cgi/content/abstract/17/4/383>),
    which also allows manipulation and navigation of the tree. While
    tools exist to convert other formats (such as the widely used Newick
    and Nexus formats) to phyloXML, there is currently support for
    phyloXML in only one of the open source Bio* projects (in BioPerl
    <http://www.bioperl.org/wiki/Phyloxml_Project_Demo>, as a result of
    Google's Summer of Code 2008). 
Approach 
    Build phyloXML support in the increasingly popular, dynamic, and
    fully object-oriented language Ruby. More specifically, extend the
    open source BioRuby project to support phyloXML (BioRuby 1.3.0 has
    just been released). This will entail (i) the development of objects
    to represent all the elements of phyloXML (sequences, taxonomic
    data, annotations, etc.), (ii) the development of a parser to read in
    phyloXML, and (iii) a phyloXML writer. 
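
    As a rough illustration of the three parts listed above, here is a
    minimal sketch using REXML from the Ruby standard library. All names
    (SketchClade, parse_phyloxml, write_phyloxml) are hypothetical and
    are not part of BioRuby. The sketch covers only a small subset of
    phyloXML elements and assumes input without an XML namespace
    declaration.

```ruby
require 'rexml/document'

# (i) Hypothetical, simplified node object; NOT a BioRuby class.
# Covers only name, branch length, scientific name, and child clades.
SketchClade = Struct.new(:name, :branch_length, :scientific_name, :children)

# (ii) Parser: convert a <clade> element into a SketchClade, recursively.
def parse_clade(element)
  name   = element.elements['name']
  length = element.elements['branch_length']
  sci    = element.elements['taxonomy/scientific_name']
  SketchClade.new(
    name && name.text,
    length && length.text.to_f,
    sci && sci.text,
    element.get_elements('clade').map { |child| parse_clade(child) }
  )
end

# Reads the root clade of the first <phylogeny> in the document.
def parse_phyloxml(xml)
  doc = REXML::Document.new(xml)
  parse_clade(doc.root.elements['phylogeny/clade'])
end

# (iii) Writer: convert a SketchClade back into a <clade> element.
def clade_to_element(clade)
  element = REXML::Element.new('clade')
  element.add_element('name').text = clade.name if clade.name
  if clade.branch_length
    element.add_element('branch_length').text = clade.branch_length.to_s
  end
  if clade.scientific_name
    taxonomy = element.add_element('taxonomy')
    taxonomy.add_element('scientific_name').text = clade.scientific_name
  end
  clade.children.each { |child| element.add_element(clade_to_element(child)) }
  element
end

# Wraps the tree in <phyloxml><phylogeny> and serializes it to a string.
def write_phyloxml(root_clade)
  doc = REXML::Document.new
  phylogeny = doc.add_element('phyloxml').add_element('phylogeny', 'rooted' => 'true')
  phylogeny.add_element(clade_to_element(root_clade))
  doc.to_s
end
```

    Parsing the output of write_phyloxml with parse_phyloxml should give
    back an equal tree, which makes a convenient round-trip check. A
    real implementation would also need to handle the phyloXML namespace
    and the many element types omitted here.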
Challenges 
    Relating the data elements specific to phyloXML to the tree classes
    already in BioRuby while maintaining the standards of the BioRuby
    project. Development of a time- and memory-efficient phyloXML parser
    (the parser has to be able to process trees with at least thousands
    of external nodes). 
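
    One common way to keep memory use low on large trees is to stream
    the XML instead of building a full document tree. The sketch below
    illustrates the idea with REXML's stream parser; it only counts
    external nodes (clades without child clades). The class and method
    names are hypothetical, and unprefixed <clade> element names are
    assumed. This is not a statement of how the project's parser would
    be built.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: counts <clade> elements that contain no child
# <clade>. Memory use grows with tree depth, not with the number of nodes.
class ExternalNodeCounter
  include REXML::StreamListener

  attr_reader :external_nodes

  def initialize
    @external_nodes = 0
    @child_counts = [] # one child counter per currently open <clade>
  end

  def tag_start(name, _attributes)
    return unless name == 'clade'
    # Register this clade as a child of the enclosing clade, if any.
    @child_counts[-1] += 1 unless @child_counts.empty?
    @child_counts.push(0)
  end

  def tag_end(name)
    return unless name == 'clade'
    # A clade that closes without having seen child clades is external.
    @external_nodes += 1 if @child_counts.pop.zero?
  end
end

# source may be a String or an IO object such as an open File.
def count_external_nodes(source)
  listener = ExternalNodeCounter.new
  REXML::Document.parse_stream(source, listener)
  listener.external_nodes
end
```

    Passing an open File rather than a String avoids holding the whole
    document in memory as a single string.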
Involved toolkits or projects 
    BioRuby <http://www.bioruby.org/>, phyloXML <http://www.phyloxml.org> 
Degree of difficulty and needed skills 
    Medium. Requires experience in an object-oriented programming
    language (such as C++, Java, or, ideally, Ruby). Experience in
    genomics or a related biological field is also critical. Knowledge
    of BioRuby and familiarity with XML will also help. 
Mentors 
    Christian Zmasek