[BioPython] Molecular phylogeny
Rick Ree
rree@oeb.harvard.edu
Tue, 17 Apr 2001 14:33:27 -0400 (EDT)
On Tue, 17 Apr 2001, Andersson, Claes wrote:
> ... I'd like to write some code for UPGMA, neighbours relation etc
> which would fit seamlessly with the biopython suite (are wrappers
> preferred?). Does this sound like a good idea to anybody? Or has it
> already been done? If so, where can I find it?
This is a great idea!
I would think that for non-trivial phylogenetic estimation, pure Python
code will be too slow. Instead, a Python wrapper around a C extension
module would be the way to go.
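For a sense of scale, here is a minimal pure-Python sketch of UPGMA, the
simplest of the methods mentioned in the question. It is an illustration
only, not Biopython code: the function name upgma, the frozenset-keyed
distance dict, and the nested-tuple tree are all assumptions made for this
example, and branch lengths are left out. Something this small is probably
fine in pure Python; it is the heavier tree searches where a C extension
would likely pay off.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    names: list of taxon labels (hypothetical input format)
    dist:  dict mapping frozenset({a, b}) -> distance between a and b
    """
    # Active clusters, each mapped to the number of leaves it contains.
    sizes = {name: 1 for name in names}
    d = dict(dist)  # work on a copy so the caller's dict is untouched
    while len(sizes) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to every other cluster is the
        # leaf-count-weighted average of the two old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    # Only the root cluster is left.
    return next(iter(sizes))


# Usage with a made-up four-taxon distance matrix.
example = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], example)
```

The pair search is repeated from scratch at every merge, so this is roughly
cubic in the number of taxa; fine for a sketch, not for large data sets.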
At one point I played around with the idea of converting the routines in
the PHYLIP package into a shared library that could be wrapped in Python,
and Felsenstein seemed amenable to the idea. I even managed to get
DNAPARS running in this way, sort of.
The problem I encountered is that PHYLIP's code makes heavy use of global
variables, and modifications to make it a shared library (thread-safe,
etc.) would be extensive. Of course, someone more proficient than me at C
might find it easy :) Another problem is that the PHYLIP source is a
moving target (albeit a slow-moving one).
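To make the global-variable problem concrete, here is a small Python
analogy (PHYLIP itself is C, and none of these names come from its source;
they are invented for illustration). With module-level state, every caller
shares one set of variables, so two analyses in the same process interfere
with each other. Moving the state into an object that each caller owns is
the kind of change a library version would need throughout.

```python
# Style 1: module-level state, analogous to C file-scope globals.
_steps = 0


def add_steps_global(n):
    """Accumulate steps in a single shared counter."""
    global _steps
    _steps += n
    return _steps


# Style 2: the same state held per run, so runs cannot interfere.
class ParsimonyRun:
    """Hypothetical container for the state of one analysis."""

    def __init__(self):
        self.steps = 0

    def add_steps(self, n):
        self.steps += n
        return self.steps


# Two "analyses" using the global version see each other's updates...
first_global = add_steps_global(3)
second_global = add_steps_global(5)   # includes the 3 from the first call

# ...while two ParsimonyRun objects stay independent.
run_a, run_b = ParsimonyRun(), ParsimonyRun()
run_a.add_steps(3)
run_b.add_steps(5)
```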
Of course, there's a whole whack of source-available phylogeny programs
out there to learn from. The one I've been interested in lately is
MrBayes, which uses Markov-chain Monte Carlo methods for Bayesian
phylogeny inference. Another is Poy, a direct-optimization (i.e.,
simultaneous sequence alignment and tree-building using parsimony)
program, but Poy is mostly written in OCaml -- hard to grok!
Rick