[Biopython-dev] PhyloXML helper functions

Brad Chapman chapmanb at 50mail.com
Tue Jul 7 12:51:52 UTC 2009


Hi Eric;

> I've been mulling a couple of methods for PhyloXML objects that I thought
> could deserve some discussion.
> 
> 1. Singular properties for some plural attributes
> 
> This goes back to the "confidences" issue: When I'm drilling down through a
> phyloXML-derived tree, I keep expecting certain attributes to be singular
> values when they're actually plural. Auto-completion catches it, of course,
> but the resulting code would seem more obvious if I used the singular name
> when I know the attribute consists of a list of one element.

I like the idea and implementation for cases where an attribute can
hold multiple items but usually holds just one. Very nice.
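
To make that concrete, here is a minimal sketch of such a singular
property. The Clade class below is a hypothetical stand-in, not the
actual PhyloXML code, and raising ValueError in the ambiguous case is
an assumption about how it might behave:

```python
class Clade:
    """Hypothetical stand-in for a PhyloXML clade (not the real class)."""

    def __init__(self, confidences=None):
        # The plural attribute always holds a list, possibly empty.
        self.confidences = list(confidences or [])

    @property
    def confidence(self):
        """Return the single confidence value, or None if there is none."""
        if not self.confidences:
            return None
        if len(self.confidences) > 1:
            # Assumption: refuse to guess when several values are present.
            raise ValueError("multiple confidence values; use .confidences")
        return self.confidences[0]


clade = Clade(confidences=[0.95])
print(clade.confidence)  # 0.95
```

The plural list stays the real storage, so code that needs all the
values is unaffected.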

> 2. A find() method on Clade and maybe Phylogeny objects
[...]
> Enhancements:
> - The keyword argument could be a regular expression. Would that be useful?

This seems useful. Often people use crazy naming convention hacks,
and might want to pull out something like all proteins from a
particular organism based on a common prefix in the name.

> To handle numbers, I'd have to convert every sub-node attribute value to a
> string, and that would be weird -- or else find() would have to skip
> numerical attributes.

Is this only a problem if you support regular expressions, or either
way? For find(), I think it's sufficient to define what you support
and leave it at that: any subset of search functionality will help
people get their work done.

> - If no regular arguments are needed, cls could default to PhyloElement or
> even "object" to match everything.

I like the object default here. This fits with a simple use case of:
find everything that matches this string of interest.
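
A rough sketch of how that could look, under some assumptions: the Node
class and the find() signature here are hypothetical, keyword values
are treated as regular expressions matched from the start of the
string, and non-string (e.g. numerical) attributes are simply skipped:

```python
import re


class Node:
    """Hypothetical tree element standing in for PhyloXML objects."""

    def __init__(self, name=None, branch_length=None, children=()):
        self.name = name
        self.branch_length = branch_length
        self.children = list(children)


def find(root, cls=object, **kwargs):
    """Yield nodes (pre-order) that are instances of cls and whose
    string attributes match every keyword pattern."""
    patterns = {attr: re.compile(pat) for attr, pat in kwargs.items()}
    stack = [root]
    while stack:
        node = stack.pop()
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(node.children))
        if not isinstance(node, cls):
            continue
        for attr, pattern in patterns.items():
            value = getattr(node, attr, None)
            # Only strings are searched; numbers never match.
            if not isinstance(value, str) or not pattern.match(value):
                break
        else:
            yield node


tree = Node("root", children=[
    Node("HUMAN_p53", 0.1),
    Node("MOUSE_p53", 0.2),
    Node("HUMAN_brca1", 0.3),
])
# All nodes whose name starts with a common organism prefix.
print([n.name for n in find(tree, name="HUMAN_")])
# ['HUMAN_p53', 'HUMAN_brca1']
```

With cls defaulting to object, calling find(tree) with no keywords
returns every node.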

> - To enable arbitrary hairiness, this function could accept a function as
> the value of the keyword argument and return anything truthy. But at that
> point, the user could probably just roll their own find_node() function.
> However, it could still be useful to filter for numerical values.

This is probably more than you need. For complicated cases I'd
assume people are sophisticated enough to roll their own.

Nice ideas,
Brad


