[Biopython-dev] Code review request for phyloxml branch

Peter Cock p.j.a.cock at googlemail.com
Fri Jan 8 12:00:12 EST 2010


On Fri, Jan 8, 2010 at 4:26 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
> I am not an expert in this area, but the code looks very well done and well
> organized. Thanks, Eric!
>
> I have one suggestion though:
> In the current layout, there's a Bio.Tree and a Bio.TreeIO module. I'd rather
> have everything under Bio.Tree. This makes it easier to understand what each
> Bio.* module is about, and also agrees with the structure of the other modules
> in Biopython. The only exception is Bio.Seq, for which there is a closely related
> Bio.SeqIO and Bio.SeqRecord. (In my opinion, that is more for historical reasons;
> I'd rather have a single Bio.Seq there too).

There is also Bio.AlignIO, which with hindsight might likewise have been
handled via Bio.Align. One reason for this choice of naming (SeqIO and AlignIO)
was following the lead from BioPerl. I think there are some good points about
making the code for the common object (tree, SeqRecord, Alignment) clearly
separate from the code for parsing or writing it (although having separate
top-level modules is perhaps overkill). However, I agree, this isn't universal
in Biopython (e.g. Bio.Motif handles a range of motif file formats but there is
no Bio.MotifIO).

So I'm somewhat on the fence about the Bio.TreeIO name. However, one thing
I don't like is that "Tree" could mean a class or a module (also a problem with
other Biopython bits like "Seq", "SeqRecord", "Nexus"). Current Python
convention (PEP8) is to use lower case for the module ("tree") and title case
for the class ("Tree"), something most of Biopython does not follow (and
which we can't change without a lot of upheaval). Another option, if we want
to try to keep the existing module name style, might be Bio.Trees containing
a Tree class, or perhaps something different like Bio.Phylo instead?
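
As a rough illustration of the module-versus-class ambiguity, here is a small
sketch using the standard library's datetime, which has the same problem (the
module and the class share a name). This is only a stand-in, not Biopython
code, and the describe() helper is made up for the example:

```python
import inspect

# Assumption: datetime stands in for a name like "Tree" or "Seq" that is
# used for both a module and a class.
import datetime as datetime_module
from datetime import datetime as datetime_class


def describe(obj):
    """Hypothetical helper: report whether a name is bound to a module or a class."""
    if inspect.ismodule(obj):
        return "module"
    if inspect.isclass(obj):
        return "class"
    return "other"


# The bare name "datetime" could refer to either of these, depending on
# which import style the reader's code happened to use.
kinds = (describe(datetime_module), describe(datetime_class))
print(kinds)
```

With the PEP 8 style (lower case module, capitalised class) the two names
differ, so a reader can tell which is meant without checking the imports.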

Peter

