[Biopython] getting the parent of a Clade

Eric Talevich eric.talevich at gmail.com
Tue Nov 2 15:44:36 UTC 2010


On Tue, Nov 2, 2010 at 5:58 AM, Michael Thon <mike.thon at gmail.com> wrote:

> Hi Eric
> >
> > Do you or anyone else want to try plugging that all_parents function into
> > your code to see if it helps significantly? If it does, I could add it as a
> > Tree/Clade method in the next Biopython release.
> >
>
>
> I can try it. I have a few thousand trees to parse, so any differences in
> performance should be more obvious.
>
> But first, I realized that I should have explained the problem I'm solving
> in more detail, to see if I'm approaching it the right way.  I need to visit
> every node in the tree, compare each node to its parent, and do some
> calculations.  I'm doing this with a recursive function that starts at
> tree.clade and then calls itself twice, with clade.clades[0] and
> clade.clades[1].  Within the function I need to get the parent clade
> and do the calculations.
>
> def crunch_clade(tree, clade):
>     compute_data(clade, get_parent(tree, clade))
>     crunch_clade(tree, clade.clades[0])
>     crunch_clade(tree, clade.clades[1])
>
> Is there a better way to do it?  Like maybe starting with the terminal
> clades?
>
> Mike
>
>
The tree traversal functions in Bio.Phylo are fairly efficient and flexible.
I'm not sure if the traversal order matters for your function, but you could
try something like:

parent_lookup = all_parents(tree)  # from the cookbook
for clade in tree.find_clades():
    if clade in parent_lookup:  # the root has no parent, so skip it
        compute_data(clade, parent_lookup[clade])
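For reference, here is a sketch of the all_parents helper, written from memory
of the Bio.Phylo cookbook, so please check the wiki for the current version.
The Newick string is a made-up example tree, not your data.

```python
from io import StringIO

from Bio import Phylo


def all_parents(tree):
    """Map every non-root clade in the tree to its parent clade."""
    parents = {}
    for clade in tree.find_clades(order="level"):
        for child in clade:  # iterating a clade yields its direct children
            parents[child] = clade
    return parents


# Hypothetical example tree, for illustration only.
tree = Phylo.read(StringIO("((A,B),(C,D));"), "newick")
parent_lookup = all_parents(tree)

# Look up the parent of tip "A"; its children are the tips A and B.
tip_a = next(tree.find_clades(name="A"))
siblings = [child.name for child in parent_lookup[tip_a]]
```

The dictionary is built in a single pass over the tree, so each later lookup
is a plain dict access. The root never appears as a key, since nothing is its
parent.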

Or, possibly:

for parent in tree.get_nonterminals():
    for child in parent:
        compute_data(child, parent)
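As a quick sanity check, the nested loop should visit each non-root clade
exactly once, paired with its parent. A small sketch, assuming a made-up
Newick tree and a stand-in compute_data that only records the pairs it sees:

```python
from io import StringIO

from Bio import Phylo

# Hypothetical tree with branch lengths, for illustration only.
tree = Phylo.read(StringIO("((A:1,B:2):0.5,(C:3,D:4):0.5);"), "newick")

visited = []


def compute_data(child, parent):
    # Stand-in for the real calculation: just record the (child, parent) pair.
    visited.append((child, parent))


for parent in tree.get_nonterminals():
    for child in parent:
        compute_data(child, parent)

# Every clade except the root should have been visited as a child.
total_clades = len(list(tree.find_clades()))
```

No parent lookup table is needed here, because the parent is already in hand
when each child is reached.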

Note that get_terminals() and get_nonterminals() are simplified versions
of find_clades(). They return plain lists rather than lazy iterators, but
their filtering arguments aren't as flexible. Also, see the tutorial
section 12.4.1 on traversal:
http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc167
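To make the difference concrete, here is a minimal sketch on a made-up tree.
It compares get_terminals() with the equivalent find_clades(terminal=True)
call, and shows one of the extra filters that only find_clades() accepts:

```python
from io import StringIO

from Bio import Phylo

# Hypothetical example tree, for illustration only.
tree = Phylo.read(StringIO("((A,B),(C,D));"), "newick")

# get_terminals() hands back a list that can be indexed and reused.
leaves = tree.get_terminals()

# find_clades() hands back an iterator that is consumed as you loop over it.
leaf_iter = tree.find_clades(terminal=True)

names_from_list = [leaf.name for leaf in leaves]
names_from_iter = [leaf.name for leaf in leaf_iter]

# find_clades() also takes attribute filters, e.g. matching on the name.
only_a = list(tree.find_clades(name="A"))
```

Both calls walk the tree in preorder by default, so the two name lists come
out in the same order.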

In particular, if you need a level-order (breadth-first) traversal, it looks
like this:

tree.find_clades(order='level')
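Here is a small sketch of how the visiting order changes, using a made-up
unbalanced tree so that the difference shows up in the tip names:

```python
from io import StringIO

from Bio import Phylo

# Hypothetical example: tip C sits one level above tips A and B.
tree = Phylo.read(StringIO("((A,B),C);"), "newick")

# Default preorder: descend into the first child before moving on.
preorder_tips = [c.name for c in tree.find_clades(terminal=True)]

# Level order: visit everything at one depth before going deeper.
level_tips = [c.name for c in tree.find_clades(terminal=True, order="level")]

print(preorder_tips)  # ['A', 'B', 'C']
print(level_tips)     # ['C', 'A', 'B']
```

In level order a parent is always visited before its children, which is handy
if a node's calculation depends on a result already stored for its parent.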


Hope that helps,
Eric


