[Biopython-dev] [Bug 3046] PhyloXML, please define get/set methods

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu Apr 8 22:10:37 UTC 2010


http://bugzilla.open-bio.org/show_bug.cgi?id=3046





------- Comment #6 from eric.talevich at gmail.com  2010-04-07 01:05 EST -------
(In reply to comment #0)
> It would be nice if there were get/set properties for phyloXML objects that
> were easier and more concise to use.  Right now, to set, say, a phyloXML
> property, one has to read the code to learn the names and arguments of the
> Property class and also to learn that properties are added by appending to a
> list.

Yes, it's easier to tweak the class definitions if there's not much syntactic
sugar to get in the way. This is still pretty new code ;) but of course I'm
open to suggestions.

> Besides the matter of convenience, there is also a question about how the
> properties and taxonomies objects behave.   I will take the matter up with the
> phyloXML mailing list, but I believe that these objects should be
> dictionary-like rather than list-like.  That is, duplicate ref values should
> not be allowed because the question of how to handle duplicates would have to
> get pushed down to the user level and will be inconsistent.

The Events class (clade.events attribute) mimics a dictionary. Have you used
that yet?

About clade.properties:
If ordering of properties doesn't matter, 'ref' is guaranteed to be unique at a
node, and it seems to be the right way to index the other associated data, then
I can make clade.properties act like a dictionary. Can we confirm all of these?

And for the implementation, can you provide a sketch of what you'd like the
final structure to look like, and maybe a contrived doctest-like code example
showing what you'd like to be able to do?
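
As a contrived sketch of the kind of example being asked for here, the following shows one way a dictionary-like view keyed by 'ref' might behave, assuming the three conditions above hold. The Property stand-in (two fields only) and the PropertyDict name are hypothetical and are not part of Bio.Phylo; the 'NOAA:depth' style refs are made-up illustration values.

```python
from collections.abc import MutableMapping
from dataclasses import dataclass


@dataclass
class Property:
    """Hypothetical stand-in for PhyloXML.Property (two fields only)."""
    ref: str
    value: str


class PropertyDict(MutableMapping):
    """Dictionary-like view of properties, keyed by 'ref' (assumed unique)."""

    def __init__(self, properties=()):
        self._by_ref = {}
        for prop in properties:
            # Refuse duplicates up front instead of leaving them to the user.
            if prop.ref in self._by_ref:
                raise ValueError("duplicate ref: %r" % prop.ref)
            self._by_ref[prop.ref] = prop

    def __getitem__(self, ref):
        return self._by_ref[ref].value

    def __setitem__(self, ref, value):
        self._by_ref[ref] = Property(ref, value)

    def __delitem__(self, ref):
        del self._by_ref[ref]

    def __iter__(self):
        return iter(self._by_ref)

    def __len__(self):
        return len(self._by_ref)


props = PropertyDict([Property("NOAA:depth", "1200")])
props["NOAA:depth"] = "1500"  # replaces the existing entry
props["NOAA:temp"] = "4"      # adds a new entry
```

If ordering turned out to matter after all, the same interface could still be kept, since a plain dict preserves insertion order in current Python versions.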

In many cases, the phyloXML spec doesn't currently promise enough to make nice
shortcuts work without the possibility of breaking in the future. For example,
check out this new demo with *two* bootstrap values for every clade:
http://www.phylosoft.org/archaeopteryx/examples/data/multiple_supports.xml

I was tempted to make confidences act like a dictionary indexed by support
type, but clearly now that wouldn't have worked. A list of Confidence objects
lets us stay faithful to the raw XML representation.
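
To make the problem concrete, here is a small sketch of what happens when two supports of the same type are keyed by type. The Confidence class is a hypothetical stand-in with only two fields, and the numeric values are invented for illustration rather than taken from the demo file.

```python
from dataclasses import dataclass


@dataclass
class Confidence:
    """Hypothetical stand-in for a phyloXML confidence element."""
    type: str
    value: float


# Two bootstrap supports on the same clade (values are made up).
confidences = [Confidence("bootstrap", 89.0), Confidence("bootstrap", 71.0)]

# Keying by support type silently keeps only the last value.
by_type = {c.type: c.value for c in confidences}

# The list keeps both, and filtering by type is still a one-liner.
bootstraps = [c.value for c in confidences if c.type == "bootstrap"]
```

Here by_type ends up with a single entry while bootstraps retains both values, which is the information loss a list of Confidence objects avoids.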


> def set_property(self,  *propArgs,  **propkwArgs):
>     for property in self.properties:
>         if property.ref == propArgs[1]:
>             property = PhyloXML.Property(propArgs)
>             return
>     self.properties.append(PhyloXML.Property(*propArgs,  **propkwArgs))
> 
> def get_property(self,  key):
>     for property in self.properties:
>         if property.ref == key:
>             return property.value
>     raise KeyError

It's possible that Bio.Phylo will pick up the convention of "add_foo/get_foo"
methods where a property would be overly magical, and something noteworthy is
going on internally. Alignment objects have "add_sequence", and Phylogeny
objects have "get_alignment". Would you use a Phylogeny method called
add_alignment, taking something like a Phylip character matrix?

We can figure out a sugared interface for clade.properties once we know
which of the requirements stated above will actually be guaranteed.
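
One detail in the quoted set_property is worth noting in the meantime: assigning to the loop variable only rebinds a local name, so the list in self.properties is left unchanged. A minimal corrected sketch, assuming 'ref' is unique per clade, could look like the following. Property and Clade are simplified hypothetical stand-ins here, not the real Bio.Phylo.PhyloXML classes, and the two-argument signature is an assumption made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    """Hypothetical stand-in for PhyloXML.Property (two fields only)."""
    ref: str
    value: str


@dataclass
class Clade:
    """Hypothetical stand-in holding a plain list of properties."""
    properties: list = field(default_factory=list)

    def set_property(self, ref, value):
        new_prop = Property(ref, value)
        for index, prop in enumerate(self.properties):
            if prop.ref == ref:
                # Assign through the index; rebinding the loop variable
                # would leave the list unchanged.
                self.properties[index] = new_prop
                return
        self.properties.append(new_prop)

    def get_property(self, ref):
        for prop in self.properties:
            if prop.ref == ref:
                return prop.value
        # Include the missing key in the error for easier debugging.
        raise KeyError(ref)


clade = Clade()
clade.set_property("NOAA:depth", "1200")
clade.set_property("NOAA:depth", "1500")  # replaces, does not append
```

Keeping the underlying list means the raw XML ordering is preserved, at the cost of a linear scan on each lookup.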


> def set_ID(self,  *idArgs,  **idkwArgs):
>     self.node_id = PhyloXML.Id(*idArgs,  **idkwArgs)

If you do "from Bio.Phylo import PhyloXML as PX" it really doesn't save any
typing, and the **kwargs magic is even less suitable for introspection.

It's not possible to take advantage of all the PhyloXML annotations available
without learning about the annotation classes the


------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk  2010-04-08 18:10 EST -------
(In reply to comment #6)
> 
> It's possible that Bio.Phylo will pick up the convention of "add_foo/get_foo"
> methods where a property would be overly magical, and something noteworthy is
> going on internally. Alignment objects have "add_sequence", and Phylogeny
> objects have "get_alignment". Would you use a Phylogeny method called
> add_alignment, taking something like a Phylip character matrix?
> 

Note that while the "old" Alignment object has an add_sequence method, that
method is tagged as obsolete now that Biopython 1.54 has the "new" Alignment
object (to which you append a SeqRecord instead).

Regarding PhyloXML, would it fit to rename "get_alignment" as "to_alignment"?
That is a fairly common naming convention.


-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


