[Biopython-dev] Properties names in command line wrappers

Peter biopython at maubp.freeserve.co.uk
Mon May 4 14:53:37 UTC 2009


On Mon, May 4, 2009 at 3:48 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Apr 30, 2009 at 1:05 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>> I love what you are doing here. The keywords and properties make
>> it much more Pythonic; the old way reeks of Java-style get/sets. My
>> vote is to put them both in.
>
> Cool - I was hoping people would agree it is more pythonic.
>
> I have some follow up thoughts, or points for discussion ...
>

I updated the patch on Bug 2822 to cover all the Bio.Application
command line wrapper subclasses, and included __repr__ support.
However, that has raised a real example of a parameter where the
current "human readable" name cannot be used as a Python identifier
("in", for "-in" in Muscle, is a reserved word).  I think the pragmatic
solution is to add a sensible alternative which we can use for the
property and keyword argument name (e.g. "input" in this case), while
in general keeping these names as close as possible to the actual
parameter name as used at the command line.
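
To make the naming rule concrete, here is a minimal sketch of how a
property name could be chosen. It is not taken from the patch: the
function name property_name and the alternatives mapping are made up
for illustration, and it assumes Python 3 (for str.isidentifier).

```python
import keyword


def property_name(cli_name, alternatives=None):
    """Pick a Python attribute name for a command line option.

    Hypothetical helper: 'alternatives' maps a problem name (e.g. "in")
    to a hand-picked replacement (e.g. "input").
    """
    alternatives = alternatives or {}
    name = cli_name.lstrip("-")
    if name in alternatives:
        return alternatives[name]
    # Reserved words such as "in" pass str.isidentifier(), so check both.
    if keyword.iskeyword(name) or not name.isidentifier():
        raise ValueError("%r needs an alternative property name" % cli_name)
    return name


print(property_name("-in", {"in": "input"}))
print(property_name("-out"))
```

Only names that cannot work in Python get an alternative; everything
else stays identical to the command line switch.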

On the other hand, some might argue for giving all the options
meaningful names.  The (hardly used) existing blastall wrapper in
Bio/Blast/Applications.py gives the "-a" argument a human readable
name of "nprocessors", and "-A" gets "window_size".  With the old
set_parameter call either alias could be used.  However, with a Python
property we need to pick one as a preferred name - and I'm not 100%
sure that being helpful and using "nprocessors" (e.g.
cline.nprocessors = 4) is actually better than using the actual
argument name (e.g. cline.a = 4).
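
The difference can be sketched with a toy class. This is a
hypothetical stand-in, not the code in Bio/Blast/Applications.py: the
class name, the _names table and its layout are assumptions made only
to show why a property forces a single preferred name while a
set_parameter style call can accept several.

```python
class ToyBlastallCommandline(object):
    """Toy stand-in for a command line wrapper (hypothetical, not Biopython code)."""

    # Assumed layout: each switch maps to every name the old-style call accepts.
    _names = {
        "-a": ["-a", "nprocessors"],
        "-A": ["-A", "window_size"],
    }

    def __init__(self):
        self._values = {}

    def set_parameter(self, name, value):
        """Old style: any listed name selects the option."""
        for switch, names in self._names.items():
            if name in names:
                self._values[switch] = value
                return
        raise ValueError("Option name %s was not found." % name)

    @property
    def a(self):
        """New style: exactly one attribute name per option."""
        return self._values.get("-a")

    @a.setter
    def a(self, value):
        self._values["-a"] = value

    def __str__(self):
        parts = ["blastall"]
        for switch in sorted(self._values):
            parts.append("%s %s" % (switch, self._values[switch]))
        return " ".join(parts)


cline = ToyBlastallCommandline()
cline.set_parameter("nprocessors", 4)  # old style, via the alias
cline.a = 2                            # new style, via the switch name
print(cline)
```

Supporting cline.nprocessors as well would mean a second property for
the same switch, which is exactly the duplication in question.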

My instinct is that these are low level wrappers, which don't try to
second-guess the user.  To take full advantage of any command line
tool you will need to read the tool's documentation to know what the
arguments are - and having Biopython make up its own aliases just
makes things more complicated.  Therefore I think the property names
in the command line wrapper objects should be as close as possible to
the actual command line arguments.  In this case, for blastall, use "a"
for number of processors and "A" for window size.

However, I see the existing "helper functions" in
Bio/Blast/NCBIStandalone.py as a higher level wrapper, which tries to
insulate the user from the precise details of the command line string,
and here using an argument name "nprocessors" makes more sense
(although again, it differs from the actual command line, making
cross-referencing to the NCBI documentation more difficult).
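
As a rough sketch of that higher level style (the function below is
hypothetical and is not the helper in Bio/Blast/NCBIStandalone.py; only
the blastall switches -p, -d, -i, -a and -A are real), the descriptive
keyword names hide the switches from the caller:

```python
def blastall_command(program, database, infile,
                     nprocessors=None, window_size=None):
    """Build a blastall argument list from descriptive keyword names.

    Hypothetical helper for illustration only.
    """
    args = ["blastall", "-p", program, "-d", database, "-i", infile]
    if nprocessors is not None:
        args += ["-a", str(nprocessors)]   # "-a" is number of processors
    if window_size is not None:
        args += ["-A", str(window_size)]   # "-A" is window size
    return args


print(" ".join(blastall_command("blastn", "nt", "query.fasta", nprocessors=4)))
```

The caller never sees "-a" here, which is convenient but is also why
cross-referencing with the NCBI documentation gets harder.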

What are your thoughts, Brad?

Peter


