[BioPython] Molecular phylogeny

Rick Ree rree@oeb.harvard.edu
Tue, 17 Apr 2001 14:33:27 -0400 (EDT)


On Tue, 17 Apr 2001, Andersson, Claes wrote:

> ... I'd like to write some code for UPGMA, neighbours relation etc
> which would fit seamlessly with the biopython suite (are wrappers
> preferred?). Does this sound like a good idea to anybody? Or has it
> already been done? If so, where can I find it?

This is a great idea!

I would think that for non-trivial phylogenetic estimation, pure Python
code will be too slow.  Instead, a Python wrapper around a C extension
module would be the way to go.
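
As an illustration of the trade-off, a minimal pure-Python sketch of UPGMA
(one of the methods asked about above) might look like the following.  The
function name upgma, the list-of-lists distance matrix, and the nested-tuple
tree format are assumptions made for this example, not part of Biopython.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Return a nested-tuple tree built by UPGMA (illustrative sketch).

    labels: list of taxon names
    matrix: square, symmetric list-of-lists of pairwise distances
    A leaf is its label; an internal node is (left, right, height).
    """
    # Active clusters, keyed by an integer id: id -> (subtree, leaf count)
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by (smaller id, larger id)
    dist = {(i, j): matrix[i][j]
            for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)

    while len(clusters) > 1:
        # Pick the closest pair of clusters (a linear scan over all pairs)
        i, j = min(dist, key=dist.get)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        height = dist.pop((i, j)) / 2.0

        # The distance from the merged cluster to each remaining cluster
        # is the size-weighted average of the two old distances
        for k in clusters:
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)

        clusters[next_id] = ((tree_i, tree_j, height), n_i + n_j)
        next_id += 1

    (tree, _), = clusters.values()
    return tree


if __name__ == "__main__":
    print(upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]]))
```

Each merge rescans every remaining pair, so this naive version does roughly
cubic work in the number of taxa.  That is fine for a handful of sequences,
but it is the kind of inner loop that would likely be pushed down into C for
larger data sets.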

At one point I played around with the idea of converting the routines in
the PHYLIP package into a shared library that could be wrapped in Python,
and Felsenstein seemed amenable to the idea.  I even managed to get
DNAPARS running in this way, sort of.

The problem I encountered is that PHYLIP's code makes heavy use of global
variables, so the modifications needed to turn it into a shared library
(making it thread-safe, etc.) would be extensive.  Of course, someone more
proficient than me at C might find it easy :)  Another problem is that the
PHYLIP source is a moving target (albeit a slow-moving one).

Of course, there's a whole whack of source-available phylogeny programs
out there to learn from.  The one I've been interested in lately is
MrBayes, which uses Markov-chain Monte Carlo methods for Bayesian
phylogeny inference.  Another is Poy, a direct-optimization (i.e.,
simultaneous sequence alignment and tree-building using parsimony)
program, but Poy is mostly written in OCaml -- hard to grok!

Rick