[Biopython-dev] ApplicationResult and generic_run obsolete?

Peter biopython at maubp.freeserve.co.uk
Mon Jul 6 19:02:56 UTC 2009


Hi all,

There were many things I discussed with Biopython folks at BOSC 2009,
and one of these was a conversation with Brad about parts of
Bio.Application - specifically the idea behind the ApplicationResult
object. We basically agreed this was superfluous and could be
deprecated. The only thing I've found useful in this object is the
return code (an integer) when using Bio.Application.generic_run (which
in itself seems a bit superfluous).

Now, declaring ApplicationResult obsolete for Biopython 1.51 (with a
deprecation in the following release) is fine except for the fact that
this object gets used in the function generic_run. So we'd have to
obsolete that too. [If anyone can see any other side effects of
deprecating Bio.Application.ApplicationResult please speak up]

Right now, generic_run waits for the sub-process to finish, and
returns a tuple of:
* An ApplicationResult object holding the return code (and a few other
things which can also be found from the command line string object,
like the expected output filenames).
* Standard output as a StringIO handle (could be memory hungry!)
* Standard error as a StringIO handle (could be memory hungry!)

Personally, when running a sub-process I have either wanted the stdout
(and stderr) handles, OR the return code (and I don't care about
stdout and stderr). I can't think of a situation offhand where I
needed both. So for me, the Bio.Application.generic_run function isn't
very helpful.

In Python, there are several ways to run a tool, starting with
something very simple like os.system(...) which will run and block
until the task finishes, returning the return code (with some provisos
on Windows). Next, there were a whole set of popen*() functions which
generally returned handles. These are now all obsolete with Python
2.6, and subprocess should be used instead.

If we want to deprecate Bio.Application.generic_run (in order to
deprecate Bio.Application.ApplicationResult), then do we need a
replacement? Or replacements?

Possible helper functions that come to mind are:
(a) Returns the return code (integer) only. This would basically be a
cross-platform version of os.system using the subprocess module
internally.
(b) Returns the return code (integer) plus the stdout and stderr
(which would have to be StringIO handles, with the data in memory).
This would be a direct replacement for the current
Bio.Application.generic_run function.
(c) Returns the stdout (and stderr) handles. This basically is
recreating a deprecated Python popen*() function, which seems silly.
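
As a rough sketch of what options (a) and (b) might look like on top of
subprocess: the function names run_return_code and run_and_capture are
made up for illustration and are not existing Biopython API, and the
demo command is just a stand-in child Python process. Option (c) is
left out, since subprocess.Popen already hands back the handles.

```python
import subprocess
import sys
from io import StringIO


def run_return_code(args):
    """Option (a): block until the child exits, return only its exit status."""
    # Hypothetical helper name; subprocess.call does the real work.
    return subprocess.call(args)


def run_and_capture(args):
    """Option (b): return (return code, stdout handle, stderr handle).

    Both streams are read fully into memory, so large outputs are costly.
    """
    child = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode the pipes as text
    )
    # communicate() drains both pipes and waits for the child to exit.
    out, err = child.communicate()
    return child.returncode, StringIO(out), StringIO(err)


if __name__ == "__main__":
    # Stand-in command: prints to both streams and exits with status 2.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('out'); sys.stderr.write('err'); sys.exit(2)",
    ]
    code, out_handle, err_handle = run_and_capture(demo)
    print(code, out_handle.read().strip(), err_handle.read().strip())
```

Passing the command as a list of arguments avoids going through a shell,
which is one of the platform differences os.system is prone to.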

However, I'm tempted to say Biopython shouldn't be duplicating basic
Python functionality, like wrapping the subprocess module in helper
functions for typical situations. Instead we should just document
using the currently recommended Python best practice (which I believe
to be to use the subprocess module). The downside is that using
subprocess is a bit tricky for novices.
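
For illustration, the kind of snippet such documentation might show is
sketched below. It reads stdout from the handle one line at a time
(so nothing large sits in memory) and then collects the return code.
The function name process_output and the demo command are assumptions
made up for this example, not anything in Biopython.

```python
import subprocess
import sys


def process_output(args, handle_line):
    """Run args, pass each stdout line to handle_line, return the exit status."""
    child = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the pipe as text
    )
    # Iterating over the pipe reads line by line, so the full output
    # is never held in memory at once.
    for line in child.stdout:
        handle_line(line.rstrip("\n"))
    child.stdout.close()
    # stderr is not piped here, so waiting after stdout reaches EOF
    # cannot stall on a full stderr pipe.
    return child.wait()


if __name__ == "__main__":
    # Stand-in command: a child Python process printing three lines.
    demo = [sys.executable, "-c", "for i in range(3): print(i)"]
    lines = []
    code = process_output(demo, lines.append)
    print(code, lines)
```

If both stdout and stderr are piped, Popen.communicate() is the safer
route, at the cost of holding both outputs in memory.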

Any thoughts?

Peter


