[Biopython-dev] Ortholog module (InParanoid and RoundUp)

Peter biopython at maubp.freeserve.co.uk
Mon Dec 6 11:00:18 UTC 2010


On Sun, Dec 5, 2010 at 7:42 PM, Andrew Gallant <Andrew.Gallant at tufts.edu> wrote:
> Hello,
>
> I am a graduate student, and for a course project, I wrote an Ortholog
> module for Biopython. It currently provides two orthology database wrappers
> (InParanoid and RoundUp) along with a class hierarchy to contain the data.
>
> It completely implements InParanoid's "gene search," and RoundUp's "browse"
> (gene search) and "retrieve" (clustering) functions, with some rudimentary
> error detection. RoundUp's clustering makes finding all orthologs between a
> set of species very easy.
>
> I haven't contributed to Biopython before, but assuming this module is
> desirable, how might I start that process? I have the changes in a forked
> git repository (which is updated with upstream changes) here [1]. I followed
> the style guide and included doc strings for all functions/modules/classes.
> However, I have *not* written any unit tests yet, but certainly will.
>
> Please let me know if I've missed anything!
>
> Thanks!
> - Andrew Gallant
>
> [1] - https://github.com/BurntSushi/biopython/tree/me

Hi Andrew,

I would suggest adding some very high-level introductory text,
perhaps in Bio/Ortholog/__init__.py, about what the Bio.Ortholog
module does (e.g. it offers access to a number of websites to do X).
It should make sense to someone like me who is unfamiliar with
InParanoid and RoundUp ;)

Do all these services encourage/condone programmatic access?
If they offer XML then I guess they do, but it is worth checking. If
they have any usage guidelines, these should also be highlighted in
your documentation. From a quick look at your code I don't think you
are scraping HTML, but from past experience HTML scrapers are a bad
idea (a long-term maintenance headache for one thing).

Unit tests would be a very good idea. Try to make the tests general
enough to cope with changes in the online datasets (e.g. the addition
of more search results). Use the requires_internet hook as in
test_SeqIO_online.py so the tests can be skipped gracefully if the
user is offline, or has requested to run the tests offline. It is
very easy:

import requires_internet
requires_internet.check()
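
As a rough illustration of "general enough", a test can assert a lower
bound and the presence of a known entry rather than an exact result
list. This is only a sketch: search_orthologs() is a hypothetical
stand-in that returns canned records, not part of the proposed
Bio.Ortholog API, and the gene and species names are made up.

```python
import unittest


def search_orthologs(gene_id):
    """Hypothetical stand-in for an online query (an assumption).

    A real test would call the Bio.Ortholog wrapper here; canned
    records are returned so that this sketch runs offline.
    """
    return [
        {"gene": gene_id, "species": "Homo sapiens"},
        {"gene": "ortholog-1", "species": "Mus musculus"},
    ]


class OrthologSearchTests(unittest.TestCase):
    def test_search_tolerates_growth(self):
        results = search_orthologs("example-gene")
        # Lower bound, not an exact count: the database may gain entries.
        self.assertGreaterEqual(len(results), 2)
        # Look for one known member instead of comparing the whole list.
        species = set(record["species"] for record in results)
        self.assertIn("Mus musculus", species)


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(OrthologSearchTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real Biopython test module, the requires_internet check shown
above would sit at the top of the file, before any test that contacts
the remote service.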

Peter


