[Biopython-dev] [Wg-phyloinformatics] BioGeography update

Eric Talevich eric.talevich at gmail.com
Thu Jul 9 19:46:53 UTC 2009


On Wed, Jul 8, 2009 at 8:48 AM, Brad Chapman <chapmanb at 50mail.com> wrote:

> Hi all;
>
> > > 1. Let the Bio.PhyloXML.Tree objects be a superset of everything needed
> > > by any phylogenetic tree representation, ever. (It's already pretty
> > > close.)
> > > Refactor Nexus and Newick to use these objects; merge the features of
> > > lagrange so the rest of the Biopython environment can benefit.
>
> I am for this approach. It sounds like what people want is a tree
> that does everything, and re-implementations occur because
> representations are lacking in something.
>
> It would be nice to design this modularly -- with mixin classes for
> related add-on functionality -- as much as possible. This would
> allow lighter weight implementations in the future if that were
> desired.
>

OK. To illustrate, here's the current file layout that needs merging:

Bio/
    PhyloXML/
        __init__.py -- flat public API
        Tree.py
        Parser.py
        Writer.py
        Utils.py
        Exceptions.py
    Nexus/
        Nexus.py
        Nodes.py
        Trees.py
        cnexus.c

The proposal is to extract the Tree class hierarchy so that other modules
can share it, and Biopython users can do I/O with trees as easily as they
currently can with sequences ("from Bio import TreeIO; for tree in
TreeIO.parse('example.xml', 'phyloxml'): ...").

Bio/
    Tree/
        Elements.py
    TreeIO.py   -- read, write wrappers
    PhyloXML/
        Parser.py
        Writer.py
        Utils.py
    Nexus/
        Nexus.py
        cnexus.c

In the above layout, TreeIO.py is a new file containing wrappers for the read
and parse functions in my PhyloXML module and, pending integration, for Nexus
and Newick as well. The modules implementing each specific format remain
where they are, under Bio/, but end users aren't expected to import them
directly.
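
As a rough sketch of what those wrappers might look like: TreeIO could
dispatch on a format name, much as SeqIO does for sequences. Everything below
is hypothetical and only for illustration. The _FORMATS registry, the toy
Newick stand-in parser, and the function signatures are assumptions, not
existing Biopython code, and a "tree" here is just its Newick string rather
than a shared Tree object.

```python
"""Hypothetical sketch of TreeIO.py: dispatch on a format name."""
import io


def _toy_newick_parse(handle):
    """Stand-in parser (assumption): yield one 'tree' per ';'-terminated record.

    A real implementation would call into the Nexus/Newick or PhyloXML parser
    and yield shared Tree objects; here each tree is just its Newick string.
    """
    for record in handle.read().split(";"):
        record = record.strip()
        if record:
            yield record + ";"


# Map format names to parser functions; other formats would register here.
_FORMATS = {"newick": _toy_newick_parse}


def parse(handle, fmt):
    """Iterate over every tree in the handle, in the given format."""
    try:
        parser = _FORMATS[fmt]
    except KeyError:
        raise ValueError("Unknown tree format: %r" % fmt)
    return parser(handle)


def read(handle, fmt):
    """Return the single tree in the handle; raise if there is not exactly one."""
    trees = list(parse(handle, fmt))
    if len(trees) != 1:
        raise ValueError("Expected exactly one tree, found %d" % len(trees))
    return trees[0]


trees = list(parse(io.StringIO("(A,B);\n((C,D),E);\n"), "newick"))
single = read(io.StringIO("(A,B);"), "newick")
```

With a registry like this, adding PhyloXML or Nexus support would mean adding
one entry to _FORMATS, and callers would never import the format modules
themselves.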

Alternatively, the individual modules that implement I/O for each format can
be collected under a new TreeIO directory, with __init__.py implementing the
wrappers:

Bio/
    Tree/
        Elements.py
        Utils.py?
    TreeIO/
        __init__.py -- read, write wrappers
        PhyloXML.py -- Parser + Writer combined
        Nexus.py
        cnexus.c
        ...

What do you think? Should I start writing a generalized Bio/Tree/Elements.py
for PhyloXML to depend on?
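
To make the question concrete, here is a minimal sketch of what a generalized
Elements.py might contain, with add-on functionality split into a mixin as
Brad suggested. None of these names exist yet: Node, NewickMixin, and
NewickNode are hypothetical, and the attributes (name, branch_length,
children) are only an assumed common denominator, not the actual PhyloXML
class hierarchy.

```python
"""Hypothetical sketch of Bio/Tree/Elements.py with a mixin for add-ons."""


class Node(object):
    """Minimal shared tree node: a name, a branch length, and children."""

    def __init__(self, name=None, branch_length=None, children=None):
        self.name = name
        self.branch_length = branch_length
        self.children = list(children or [])

    def is_terminal(self):
        """Return True if this node has no children."""
        return not self.children

    def find_terminals(self):
        """Yield the leaf nodes below (or at) this node, depth-first."""
        if self.is_terminal():
            yield self
        for child in self.children:
            for leaf in child.find_terminals():
                yield leaf


class NewickMixin(object):
    """Add-on serialization kept separate from the core node (assumption)."""

    def to_newick(self):
        """Return this subtree as a Newick string, without the trailing ';'."""
        label = self.name or ""
        if self.branch_length is not None:
            label += ":%g" % self.branch_length
        if self.is_terminal():
            return label
        inner = ",".join(child.to_newick() for child in self.children)
        return "(%s)%s" % (inner, label)


class NewickNode(NewickMixin, Node):
    """A node that opts in to Newick serialization."""


root = NewickNode(children=[
    NewickNode("A", 1.0),
    NewickNode(children=[NewickNode("B", 0.5), NewickNode("C", 0.5)],
               branch_length=2.0),
])
newick = root.to_newick() + ";"
leaf_names = [leaf.name for leaf in root.find_terminals()]
```

A lighter-weight implementation could then use Node alone, while format
modules mix in only the pieces they need.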

-Eric
