[Biopython-dev] Phylogeny modules for BioPython

Brad Chapman chapmanb at 50mail.com
Wed Apr 8 08:32:26 EDT 2009


Jacob;
Thanks much for your interest in Biopython for Summer of Code; glad
to see a discussion here about your proposal.

Peter's comments are great; I will add to them from the SoC
perspective.

> > I have applied to the Google Summer of Code (12 weeks of
> > working part-time on a programming assignment)

SoC is a full-time commitment for the summer. Your proposal also
lists some conflicts (classes, other research) for the summer
months. In your updated proposal you should be explicit about these
and describe how you plan to make up the time you miss during the
first two weeks of the quarter.

More generally, your proposal needs a detailed plan of deliverables
on a week-by-week basis over the project timeline, starting with
coding on May 23rd:

http://socghop.appspot.com/document/show/program/google/gsoc2009/timeline

This is the last hour for refining proposals, so you will need to
update your proposal quickly for us to still have time to consider
it. I would recommend copying your current proposal to a Google Doc,
adding all of the specifics needed, and then submitting a link to
the open document as a comment to your initial proposal.

> Brad Chapman may be willing to mentor a GSoC student, have a look back
> of the recent email discussions here.  In particular, Nick Matzke has
> already expressed some interest in Biogeographical and community
> phylogenetics for Biopython (there is a wiki page on open-bio.org on
> this).

I am definitely willing to help, but spots will be very competitive
throughout the program.

Echoing Peter's comments, I would put together a project proposal
that tackles:

- Improving parsing support in Bio.Nexus, based on existing code and
  bug reports, and other suggestions you might have.

- Providing wrapper code for other phylogeny software. Since the
  usefulness of an algorithm depends heavily on the context in
  which it is used, you will not find a consensus about which
  program is most useful. My suggestion is to propose wrappers for
  several useful programs covering the spectrum of possibilities.
  In addition to the ones you listed, a couple of others are:

  RAxML http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm
  FastTree http://www.microbesonline.org/fasttree/index.html

- A higher level API over the parsing and command line program support
  that helps users with specific phylogenetic tasks. Based on your
  experience and input from the Biopython community of users, this
  would have the goal of providing a simple way to do common tasks.
  This should be a combination of code that wraps up repetitive
  steps, and cookbook-style documentation to help people with
  specific phylogenetic problems.
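
To make the wrapper item above more concrete, here is a minimal
sketch of what a thin command line wrapper could look like. It is
not existing Biopython code: the class name PhyloCommandline, its
methods, and the file name alignment.fasta are all hypothetical, and
the options shown are only illustrative, so check each program's own
documentation for the flags it really accepts.

```python
import shlex
import subprocess


class PhyloCommandline:
    """Hypothetical thin wrapper around an external phylogeny program.

    Keyword options become "-name" (for True) or "-name value";
    options set to False or None are skipped. Positional inputs
    (for example an alignment file) are placed after the options.
    """

    def __init__(self, program, *inputs, **options):
        self.program = program
        self.inputs = list(inputs)
        self.options = dict(options)

    def to_args(self):
        """Return the command as an argument list (no shell involved)."""
        args = [self.program]
        for name, value in self.options.items():
            if value is True:
                args.append("-" + name)
            elif value is not False and value is not None:
                args.extend(["-" + name, str(value)])
        return args + self.inputs

    def __str__(self):
        """Return a shell-quoted string, useful for logging."""
        return " ".join(shlex.quote(arg) for arg in self.to_args())

    def run(self):
        """Run the program and return its standard output as text."""
        result = subprocess.run(
            self.to_args(), capture_output=True, text=True, check=True
        )
        return result.stdout


# Build (but do not run) a command; the options here are assumptions.
cmd = PhyloCommandline("FastTree", "alignment.fasta", nt=True, seed=42)
print(cmd)
```

Keeping the command building separate from run() means the
wrapper can be unit tested without the external program installed.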

Other general suggestions:

- Tests. Please describe your plans to write unit tests for all the
  code you write.

- Documentation. Please do leave time in your project plan to fully
  document how to use your proposed code.

- Projects 3 and 4, as Peter suggests, are out of the scope of GSoC.
  3, specifically, is more of a research project.

Finally, a few meta-items from your e-mail meant as helpful advice:

> It appears to me that BioPython doesn't have much support for
> phylogeny inference and tools related to phylogeny inference.

I understand this is an attempt to provide motivation for your
proposal, but you should do so in a way that does not disparage the
work of the people you are soliciting advice from. Your request
would be better received if you described it in the context of
improving existing phylogenetic support in Biopython.

> I need three things from the community ASAP:
[...]
> I would like a response quickly

No one likes to be told what to do, much less a group you are
requesting help and hopefully a job from. Again, you should think
about how your phrasing will be interpreted by those reading it.

> Nascent

You twice misspelled this: NESCent. Mistakes happen, but it reflects
badly on your commitment to the project to not be able to spell the
name of the organization you would like to work with. These are the
small things you should be careful about and double-check.

Thanks again for your interest and looking forward to seeing your
revised project plan,
Brad

