[Biopython-dev] [Wg-phyloinformatics] BioGeography update

Brad Chapman chapmanb at 50mail.com
Wed Jul 8 08:48:41 EDT 2009


Hi all;

> > 1. Let the Bio.PhyloXML.Tree objects be a superset of everything needed
> > by any phylogenetic tree representation, ever. (It's already pretty close.)
> > Refactor Nexus and Newick to use these objects; merge the features of
> > lagrange so the rest of the Biopython environment can benefit.

I am for this approach. It sounds like what people want is a tree
that does everything, and reimplementations happen because each
existing representation lacks something.

It would be nice to design this as modularly as possible, with mixin
classes for related add-on functionality. This would allow
lighter-weight implementations in the future if that were desired.
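
A minimal sketch of the mixin idea, to make it concrete. All class and
method names here (Clade, TraversalMixin, NewickMixin, FullClade) are
hypothetical illustrations, not existing Biopython API:

```python
class Clade:
    """Minimal tree node: a name, a branch length, and child clades."""

    def __init__(self, name=None, branch_length=None, clades=None):
        self.name = name
        self.branch_length = branch_length
        self.clades = list(clades or [])


class TraversalMixin:
    """Add-on (hypothetical): collect terminal (leaf) nodes depth-first."""

    def get_terminals(self):
        if not self.clades:
            return [self]
        terminals = []
        for child in self.clades:
            terminals.extend(child.get_terminals())
        return terminals


class NewickMixin:
    """Add-on (hypothetical): serialize the subtree as a Newick fragment."""

    def to_newick(self):
        label = self.name or ""
        if self.branch_length is not None:
            label += ":%g" % self.branch_length
        if not self.clades:
            return label
        children = ",".join(child.to_newick() for child in self.clades)
        return "(" + children + ")" + label


class FullClade(TraversalMixin, NewickMixin, Clade):
    """The 'does everything' node: core data plus all add-ons."""


tree = FullClade(clades=[FullClade("A", 1.0), FullClade("B", 2.5)])
print(tree.to_newick() + ";")
print([leaf.name for leaf in tree.get_terminals()])
```

A lighter-weight node would inherit from Clade and only the mixins it
needs, leaving the core data structure untouched.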

> The benefit of letting the tree object structures diverge is procrastination
> -- we could reconcile the two modules after GSoC is over, with stable
> features and test suites in place. But I could justifiably focus on
> integration for the remaining weeks if that's best for Biopython, since
> otherwise I'd probably be reimplementing a number of features already
> present in other modules.

My vote is for the integration work. Refactoring is hard work and
best done early. It will be easier to add functionality later to a
fully integrated PhyloXML parser.

> I bet this could be done without different objects. Bio.PhyloXML.Tree could
> be moved to Bio.Tree or Bio.Tree.Elements; the base class PhyloElement could
> be renamed to TreeElement; and the Nexus and Newick parsers could reuse
> PhyloXML's Phylogeny and Clade elements, where Clade merges with the
> existing Node class(es). Even Clade by itself might be enough. For
> organizational purposes, format-specific tree elements could move to their
> own files (Bio.Tree.PhyloElement.py, Bio.Tree.NexusElement.py), or some
> multiple-inheritance tricks could be used to smooth things over.

Yes, this sounds exactly right. Great stuff.
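
One way the shared-object layout described above might look, as a
rough sketch only. TreeElement, Clade and Phylogeny follow the names
proposed in the quoted text; PhyloXMLClade, the taxonomy attribute and
count_terminals are assumptions made up for illustration, not the
actual Bio.PhyloXML code:

```python
class TreeElement:
    """Hypothetical common base class (PhyloElement renamed)."""

    def __repr__(self):
        return "%s(name=%r)" % (type(self).__name__,
                                getattr(self, "name", None))


class Clade(TreeElement):
    """Generic node that any parser (Newick, Nexus, PhyloXML) could build."""

    def __init__(self, name=None, branch_length=None, clades=None):
        self.name = name
        self.branch_length = branch_length
        self.clades = list(clades or [])

    def count_terminals(self):
        # A clade without children is itself a terminal node.
        if not self.clades:
            return 1
        return sum(child.count_terminals() for child in self.clades)


class Phylogeny(TreeElement):
    """Generic tree container holding the root clade."""

    def __init__(self, root, name=None, rooted=False):
        self.root = root
        self.name = name
        self.rooted = rooted


class PhyloXMLClade(Clade):
    """Format-specific element: adds an annotation only PhyloXML carries."""

    def __init__(self, taxonomy=None, **kwargs):
        super().__init__(**kwargs)
        self.taxonomy = taxonomy  # illustrative attribute, an assumption


# Generic and format-specific nodes can live in the same tree.
tree = Phylogeny(Clade(clades=[
    Clade("A"),
    PhyloXMLClade(name="B", taxonomy="E. coli"),
]))
print(tree.root.count_terminals())
```

Code written against Clade then works regardless of which parser
produced the tree, which is the point of sharing the objects.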

> (I know nothing
> about NeXML; should we keep an eye on that too? Glancing at the homepage, I
> don't see much about complex annotation types, which is probably good if we
> want to fit that format into this framework eventually.)

PhyloXML plus Nexus/Newick is probably enough to stay reasonably
general and keep our sanity. NeXML support would be great, but in
practice it is a separate project. The refactoring you've described
is a good chunk to run with.

Brad

