[Biopython-dev] [Bug 3045] TreeMixin, please define enumerator and other convenience methods
bugzilla-daemon at portal.open-bio.org
Sat Apr 10 04:10:39 UTC 2010
http://bugzilla.open-bio.org/show_bug.cgi?id=3045
------- Comment #5 from eric.talevich at gmail.com 2010-04-10 00:10 EST -------
(In reply to comment #4, myself)
> (In reply to comment #0, Joel)
> > (1) internal nodes, terminal nodes, and all nodes are not currently
> > on an equal footing with respect to methods
>
> We could also have 'get_nonterminal' and 'get_all_clades' -- I'm not so sure
> that the last one is useful enough to justify cluttering the API further; what
> do you think? (I actually balked at adding get_terminals() originally, since it's
> so simple.)
I added get_nonterminals() to TreeMixin:
http://github.com/biopython/biopython/commit/de024f7d700a8ce83a64bc9f8cfd6273cefe95bc
Do we need a get_all_clades method? Is that a good name?
> > Here I give some convenience methods that I wish were defined in
> > TreeMixin. I have tested them as standalone methods. I hope you'll
> > see fit to include them at some point.
> >
> > def count_internals(self):
> >     """Counts the number of non-terminal (internal) nodes within this tree."""
> >     return [i for i,e in enumerate_internals(self)][-1] + 1
>
> I can add a convenience function that would help:
>
> def iterlen(items):
>     for i, x in enumerate(items):
>         count = i
>     return count + 1
>
> Then count_internals(tree) is the same as:
> iterlen(tree.find_clades(terminal=False))
>
> Or, if we add get_nonterminals() it's easy:
> len(tree.get_nonterminals())
Both of these can be done now, but len(tree.get_nonterminals()) is easiest.
iterlen() is hidden in _sugar.py for now:
http://github.com/biopython/biopython/commit/c8ce7f7b0314b54084b62759b1f82488374cae28
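One caveat on the iterlen() quoted above: if the iterable is empty, count is never assigned and the return line raises UnboundLocalError. A minimal sketch of a variant that returns 0 in that case follows. The name iterlen_safe is hypothetical, chosen for illustration only; it is not what the commit above contains.

```python
def iterlen_safe(items):
    """Count the items in any iterable; an empty iterable gives 0.

    Hypothetical helper for illustration, not part of Bio.Phylo.
    """
    # sum() consumes the iterable one item at a time, so no list is built.
    return sum(1 for _ in items)
```

Assuming the find_clades(terminal=False) call quoted above, counting internal nodes would then read iterlen_safe(tree.find_clades(terminal=False)).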
> > Less critical but still useful are the following two methods (and one private
> > utility) that I find useful for operations on trees:
> >
> > def is_semipreterminal(self):
> >     """True if any direct descendent is terminal."""
> >     if self.root.is_terminal():
> >         return False
> >     for clade in self.clades:
> >         if clade.is_terminal():
> >             return True
> >     return False
>
> Is semipreterminal a standard name for nodes like this?
>
> In Python 2.5 and later, you could also do:
> any(clade.is_terminal() for clade in self)
>
>
> > def terminal_neighbor_dists(self):
> >     """Return a list of distances between adjacent terminals"""
> >     return [self.distance(*i) for i in
> >             _generate_pairs(self.find_clades(terminal=True))]
> >
> > def _generate_pairs(self):
> >     import itertools
> >     pairs = itertools.tee(self)
> >     pairs[1].next()
> >     return itertools.izip(pairs[0], pairs[1])
I'll add these to the wiki as cookbook entries.
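For reference, the _generate_pairs helper quoted above relies on Python 2 idioms (the .next() method and itertools.izip). A rough Python 3 sketch of the same tee-based pairing might look like the following. The name generate_pairs and the choice to tolerate empty input are assumptions made for illustration, not part of Joel's submission.

```python
import itertools


def generate_pairs(items):
    """Yield consecutive overlapping pairs: (a, b), (b, c), (c, d), ...

    Hypothetical Python 3 rewrite of the quoted _generate_pairs helper.
    """
    first, second = itertools.tee(items)
    # Advance the second iterator by one item; the default argument
    # keeps next() from raising StopIteration on an empty iterable.
    next(second, None)
    # zip() stops as soon as the shorter (second) iterator is exhausted.
    return zip(first, second)
```

Under that assumption, the list of adjacent-terminal distances could be written as [tree.distance(a, b) for a, b in generate_pairs(tree.find_clades(terminal=True))].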
One more thing -- should we rename the find_all and find_clades methods? I'm
leaving this bug open as a reminder to decide that (and the get_all_clades
question above) before the 1.54 release.