[Biopython-dev] [Wg-phyloinformatics] BioGeography update/BioPython tree module discussion

Brad Chapman chapmanb at 50mail.com
Tue Aug 4 18:27:31 EDT 2009


Hi Nick;
Thanks for the update -- great to see things moving along.

> - removed any reliance on lagrange tree module, refactored all phylogeny 
> code to use the revised Bio.Nexus.Tree module

Awesome -- glad this worked for you. Are the lagrange_* files in
Bio.Geography still necessary? If not, we should remove them from
the repository to clean things up.

More generally, it would be really helpful if we could do a bit of
housekeeping on the repository. The Geography namespace has a lot of
things in it which belong in different parts of the tree:

- The test code should move to the 'Tests' directory as a set of
  test_Geography* files that we can use for unit testing the code.

- Similarly, there are a lot of data files in there which
  appear to be test related; these could move to Tests/Geography.

- What is happening with the Nodes_v2 and Treesv2 files? They look
  like duplicates of the Nexus Nodes and Trees with some changes.
  Could we roll those changes into the main Nexus code to avoid
  duplication?
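
To make the first point concrete, here is a rough sketch of what one of
the test_Geography* files could look like. The file name, the test class
and the parse_lat_long function are all hypothetical placeholders I made
up for illustration; a real test would import the function from
Bio.Geography instead of defining a stand-in.

```python
# Tests/test_GeographyParsing.py  (hypothetical file name)
import unittest


def parse_lat_long(text):
    """Stand-in for a real Bio.Geography function (hypothetical).

    Parses a "lat,long" string into a tuple of two floats.
    """
    lat, lon = text.split(",")
    return float(lat), float(lon)


class LatLongParsingTest(unittest.TestCase):
    """Exercises the stand-in parser defined above."""

    def test_parse_lat_long(self):
        self.assertEqual(parse_lat_long("37.87,-122.27"), (37.87, -122.27))

    def test_bad_input_raises(self):
        # A string without a comma cannot be split into two values.
        self.assertRaises(ValueError, parse_lat_long, "not-a-coordinate")


def run_suite():
    """Load and run the tests in this file, returning the result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(LatLongParsingTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)
```

Calling run_suite() runs both tests; small data files the tests need
would then live under Tests/Geography.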

> - Code dealing with GBIF xml output completely refactored into the 
> following classes:
> 
> * ObsRecs (observation records & search results/summary)
> * ObsRec (an individual observation record)
> * XmlString (functions for cleaning xml returned by Gbif)
> * GbifXml (extension of capabilities for ElementTree xml trees, parsed 
> from GBIF xml returns)

I agree with Hilmar -- the user classes would probably benefit from expanded
naming. There is an art to naming: landing somewhere between the hideous 
RidiculouslyLongNamesWithEverythingSpecified names and short truncated names.
Specifically, you've got a lot of filler in the names -- dbfUtils,
geogUtils, shpUtils. The Utils probably doesn't tell the user much
and makes all of the names sort of blend together, just as the Rec/Recs 
pluralization hides a quite large difference in what the classes hold.
Something like Observation and ObservationSearchResult would make it
clear immediately what they do and the information they hold.
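
As a minimal sketch of how that naming might read in code: the
attributes and methods below (taxon, latitude, longitude, taxa) are
assumptions on my part for illustration, not what ObsRec and ObsRecs
actually contain.

```python
class Observation(object):
    """A single observation record (the role ObsRec plays now)."""

    def __init__(self, taxon, latitude, longitude):
        # Hypothetical fields; the real record may hold different data.
        self.taxon = taxon
        self.latitude = latitude
        self.longitude = longitude


class ObservationSearchResult(object):
    """All records returned by one search, plus summary helpers
    (the role ObsRecs plays now)."""

    def __init__(self, observations):
        self.observations = list(observations)

    def __len__(self):
        return len(self.observations)

    def taxa(self):
        """Return the sorted, unique taxon names in this result."""
        return sorted(set(obs.taxon for obs in self.observations))


results = ObservationSearchResult([
    Observation("Utricularia gibba", 29.6, -82.3),
    Observation("Utricularia inflata", 30.4, -84.3),
    Observation("Utricularia gibba", 27.9, -81.5),
])
```

With names like these, the difference between one record and a whole
search result is visible at the point of use, e.g. len(results) and
results.taxa().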

> This week:

What are your thoughts on documentation? As a naive user of these
tools without much experience with the formats, I could offer better
feedback if I had an idea of the public APIs and how they are
expected to be used. Moreover, cookbook and API documentation are things 
we will definitely need to integrate into Biopython. How does this fit 
into your timeline for the remaining weeks?

Thanks again. Hope this helps,
Brad

