[Biopython] GSoC Ortholog Module Proposal

Brad Chapman chapmanb at 50mail.com
Mon Apr 5 12:05:54 UTC 2010


Matthew;
Thanks for the introduction and pointers to your work. Your
http://ortholog.us interface looks like a useful resource; it's
really nice to see web interfaces being developed with programmable
JSON APIs. Out of curiosity, is the code available for what you've
done so far?

> For GSoC I would like to write a module to abstract finding orthologs as
> much as possible. This would greatly simplify creating custom evolutionary
> trees for biologists. The module could fetch orthologs from TreeFam,
> InParanoid, Harvard's Roundup, and Princeton's BLASTO. The module could also
> provide support for producing alignments, concatenating alignments, removing
> sections of gaps, and constructing trees. Ortholog identification could be
> done with no dependency other than an internet connection. Alignments and
> trees would require the user to have the appropriate tools installed.
[...]
> Is there any interest in having such a project? I'd be grateful to get some
> feedback either on or off list.

This is a good project idea and nicely spec'ed out. One additional
direction that might also be worth exploring is using BioMart to
retrieve orthologs from the Ensembl Compara work. Here's a recent
thread on BioStar with the queries to use:

http://biostar.stackexchange.com/questions/569/how-do-i-match-orthologues-in-one-species-to-another-genome-scale

I don't know of any Python programming interfaces to BioMart, but there
is a nice R Bioconductor library, biomaRt, that can be called from
Python through rpy2:

http://www.bioconductor.org/packages/bioc/html/biomaRt.html
http://rpy.sourceforge.net/rpy2.html
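
Besides going through R, BioMart servers also accept queries written
as XML and sent over HTTP. Below is a minimal sketch of building such a
query from Python using only the standard library. The endpoint URL,
the dataset name, and the ortholog attribute name in the usage note are
assumptions that vary between Ensembl releases, so check them against
the live mart first. The helper names build_ortholog_query and
build_request_url are hypothetical. Nothing here contacts the network;
it only assembles the request.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint of the public Ensembl BioMart service.
MART_URL = "http://www.ensembl.org/biomart/martservice"


def build_ortholog_query(dataset, gene_ids, ortholog_attributes):
    """Return a BioMart XML query (as a string) asking for ortholog columns.

    dataset             -- mart dataset name (assumed, e.g. a species gene set)
    gene_ids            -- Ensembl gene IDs used as the filter
    ortholog_attributes -- attribute names for the ortholog columns (assumed)
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated output
        "header": "0",           # no header row in the result
        "uniqueRows": "1",       # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Restrict the query to the requested genes (comma-separated list).
    ET.SubElement(ds, "Filter",
                  {"name": "ensembl_gene_id", "value": ",".join(gene_ids)})
    # First column is the source gene, followed by the ortholog columns.
    for name in ["ensembl_gene_id"] + list(ortholog_attributes):
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def build_request_url(xml_query):
    """Return the GET URL that would submit the XML query to the mart."""
    return MART_URL + "?" + urlencode({"query": xml_query})
```

For example, build_ortholog_query("hsapiens_gene_ensembl",
["ENSG00000139618"], ["mmusculus_homolog_ensembl_gene"]) would ask for
the mouse ortholog of one human gene, assuming those dataset and
attribute names exist on the server being queried.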

On the practical GSoC side, project proposals are due this
Friday, April 9th, so time is running short. I'm unfortunately a bit
over-committed at this point to mentor, but hopefully someone will
be available to step into that role. I'm happy to make suggestions on
the proposal as it comes together.

Thanks,
Brad


