[Biopython-dev] Properties names in command line wrappers

Peter biopython at maubp.freeserve.co.uk
Wed May 13 07:15:35 EDT 2009


On Wed, May 13, 2009 at 11:50 AM, Cymon Cox <cy at cymon.org> wrote:
>> On Tue, May 5, 2009, Peter wrote:
>> > ...
>> > I favour using only a single property for each parameter, with the
>> > name as similar as possible to the actual command line switch (i.e.
>> > property name "a" for "-a", not "nprocessors").  Note each property
>> > would have a docstring which will say what it is for ("Number of
>> > processors to use.").
>>
>> I still favour only using a single python property for each parameter,
>
> A confusing issue arises where we have alternative names for options.
> Take the following example from _Probcons.py:
>
>             _Option(["-c", "c", "--consistency", "consistency" ], ["input"],
>                     lambda x: x in range(0,6),
>                     0,
>                     "Use 0 <= REPS <= 5 (default: 2) passes of consistency transformation",
>                     0),
>
> >>> cmd = cmdline = ProbconsCommandline("probcons", input="blah")
> >>> cmd.c = 1
> >>> str(cmd)
> 'probcons blah '
> >>> cmd.set_parameter("c", 1)
> >>> str(cmd)
> 'probcons -c 1 blah '
> >>> cmd.consistency = 2
> >>> str(cmd)
> 'probcons -c 2 blah '
> >>> cmd.c = 5
> >>> str(cmd)
> 'probcons -c 2 blah '
>
> That is, the user needs to look at the code to figure out what the correct
> name is to use when assigning to the property. Is it possible to restrict
> the binding of attributes to the cmdline to only valid property names? An
> alternative would be to restrict all parameters to only one name and
> document the alternatives it covers (don't like this idea - see below).

Yes, you can use any of the defined aliases with set_parameter, and
they are all equally valid, and all do exactly the same thing.  e.g.

cmd = ProbconsCommandline("probcons", input="blah")
cmd.set_parameter("c", 1)
cmd.set_parameter("-c", 1)
cmd.set_parameter("--consistency", 1)
cmd.set_parameter("consistency", 1)

I would however regard set_parameter as a legacy method and
push the (single) keyword argument or property alternative, for
which there is only one name (here "consistency" ):

cmd = ProbconsCommandline("probcons", input="blah")
cmd.consistency = 1

or,

cmd = ProbconsCommandline("probcons", input="blah", consistency=1)

[And yes, we should have some error checking code in the base
class __init__ method to make sure the string used is a valid python
identifier.]
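
A minimal sketch of that kind of check, written for a current Python
rather than as the actual base class code. The helper name
check_property_name is hypothetical, and taking the last alias as the
property name is an assumption based on the Probcons example above:

```python
import keyword


def check_property_name(name):
    """Return name if it can be used as a Python attribute, else raise."""
    # "-c" and "--consistency" fail isidentifier(); names such as "class"
    # or "in" pass it but are reserved words, so reject those as well.
    if not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError("%r is not a valid Python identifier" % name)
    return name


# Alias list copied from the quoted _Option; assuming (not confirmed here)
# that the last entry is the one exposed as the property.
names = ["-c", "c", "--consistency", "consistency"]
property_name = check_property_name(names[-1])
print(property_name)
```

Running a check like this once in __init__ would surface a badly named
option when the wrapper is created rather than when a user first tries
to set it.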

The user does NOT have to look at the source code to find this out -
just the docstrings or properties - try help(cmd) or dir(cmd) in python.
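
To illustrate, here is a toy stand-in (ToyCommandline is hypothetical,
not the Biopython class) with a single documented property. It also
shows one possible way to reject unknown names such as "c" on
assignment, as asked above; this is a sketch under those assumptions,
not a description of the current wrappers:

```python
class ToyCommandline(object):
    """Hypothetical wrapper with one documented property."""

    def __init__(self, cmd, **kwargs):
        self._cmd = cmd
        self._values = {}
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Accept private attributes and names declared as properties on
        # the class; anything else is treated as a mistyped option name.
        declared = isinstance(getattr(type(self), name, None), property)
        if name.startswith("_") or declared:
            object.__setattr__(self, name, value)
        else:
            raise AttributeError("Option name %s was not found." % name)

    @property
    def consistency(self):
        """Number of passes of consistency transformation (0 to 5)."""
        return self._values.get("consistency")

    @consistency.setter
    def consistency(self, value):
        self._values["consistency"] = value

    def __str__(self):
        parts = [self._cmd]
        if self.consistency is not None:
            parts.append("-c %i" % self.consistency)
        return " ".join(parts)


cmd = ToyCommandline("probcons", consistency=1)
cmd.consistency = 2
print(str(cmd))
print("consistency" in dir(cmd))
print(ToyCommandline.consistency.__doc__)
try:
    cmd.c = 5
except AttributeError as err:
    print(err)
```

With this approach the assignment cmd.c = 5 fails loudly instead of
silently creating an unrelated attribute, and help(cmd) lists the
property together with its docstring.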

>> but after some work on the blastall wrapper last night, I am
>> beginning to come round to your point of view.
>>
>> If a command line tool provides a long parameter name (some tools
>> provide both short and long names for important parameters) we
>> should use that rather than inventing our own [so no change here].
>>
>> However, for tools like BLAST which *only* have cryptic single letter
>> command line options (case sensitive), maybe we should be using
>> a sensible human readable name for the associated property in the
>> Biopython wrapper (i.e. "nprocessors" for "-a", and "window_size"
>> for "-A").  Having actually now tried using properties "a" and "A",
>> the resulting python code is very cryptic - and only makes sense
>> if you are familiar with the blastall arguments (and given there are
>> so many of them, this is difficult!).
>
> I don't agree. If you want to make your python code legible to people
> who are not familiar with the command line options, you can just
> comment it. I think the interfaces should stick as close as possible
> to the application documentation. I see these interfaces being used
> mostly by people who are familiar with the applications, in which case
> the command line construction should be fairly intuitive.

Well, I am on the fence here.  The trouble is that sometimes (e.g. BLAST)
the command line parameters themselves are just so cryptic.  Yes, we
could just use "a" and "A", and leave it up to the user to document their
code.  If we use "nprocessors" and "window_size" the code becomes
self documenting (although you have to know Biopython's mapping).
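
As a rough sketch of what that mapping amounts to, using only the two
pairings mentioned above (the dictionary, the build_args helper and the
values 4 and 40 are all made up for illustration):

```python
# Hypothetical mapping from readable property names to blastall switches.
READABLE_TO_SWITCH = {"nprocessors": "-a", "window_size": "-A"}


def build_args(**options):
    """Return command line arguments for the given readable names."""
    args = []
    for name in sorted(options):
        # An unknown name raises KeyError, which is the "you have to know
        # the mapping" cost mentioned above.
        args.extend([READABLE_TO_SWITCH[name], str(options[name])])
    return args


print(" ".join(["blastall"] + build_args(nprocessors=4, window_size=40)))
```

The calling code reads clearly, but the switches that end up on the
command line are still "-a" and "-A", so someone checking against the
BLAST documentation has to translate back.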

Brad's suggestion to support both names in the property and keyword
arguments brings us back to having multiple ways to set a
parameter (as with set_parameter and its aliases), which is confusing
and unpythonic.

Peter


