[Biopython-dev] subprocess and calling application wrappers

Peter biopython at maubp.freeserve.co.uk
Wed Jun 2 11:59:46 UTC 2010


On Wed, Jun 2, 2010 at 12:36 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
> On Tue, Jun 1, 2010 at 4:15 PM, Peter <biopython at maubp.freeserve.co.uk> wrote:
>> On Tue, Jun 1, 2010 at 2:23 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
>>> I'd suggest having an option to not capture stdout and stderr, which
>>> would help users avoid those cases where a program spews a lot to
>>> stdout and it's unwieldy to capture and stick it into a string.
>>
>> We need to avoid any risk of deadlocks, so I guess the safe
>> implementation here would be to call subprocess with stdout and
>> stderr sent to the null device.
>
> How does this look? Tested on Mac and Windows:
> http://github.com/peterjc/biopython/tree/app-exec2
>
> Example usage without capturing the output:
>
>    from Bio.Emboss.Applications import WaterCommandline
>    water_cmd = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True,
>                                 asequence="a.fasta", bsequence="b.fasta")
>    print "About to run:\n%s" % water_cmd
>    return_code = water_cmd()
>    print "Return code: %i" % return_code
>
> Example usage with stdout and stderr capture:
>
>    from Bio.Emboss.Applications import WaterCommandline
>    water_cmd = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True,
>                                 asequence="a.fasta", bsequence="b.fasta")
>    print "About to run:\n%s" % water_cmd
>    stdout, stderr, return_code = water_cmd(capture=True)
>    print "Return code: %i" % return_code
>    print "Tool output:\n%s" % stdout
>
> Note that this implementation returns either an integer return code
> (the default) or a tuple of stdout, stderr and the return code.
> If we opt for adding methods rather than using __call__
> these could be different methods instead.
>
> Another potentially useful option would be to copy the
> subprocess.check_call() function in Python 2.5+ which verifies
> the return code (error level) is zero and raises an exception if not
> (probably only sensible if not capturing the output?). Maybe this
> could even be the default behaviour?
>
> [I would prefer to keep the interface as simple as possible though,
> fewer options are better! KISS principle.]

With that in mind, as I mentioned yesterday, maybe we should just
update the documentation to suggest using os.system() when you
just need the return code and there is no stdin to worry about:

    import os
    from Bio.Emboss.Applications import WaterCommandline
    water_cmd = WaterCommandline(gapopen=10, gapextend=0.5, stdout=True,
                                 asequence="a.fasta", bsequence="b.fasta")
    print "About to run:\n%s" % water_cmd
    return_code = os.system(str(water_cmd))
    print "Return code: %i" % return_code
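
One caveat about that return value, as far as I understand the os
module documentation: on Unix os.system() returns a wait() style
status with the exit code in the high byte, not the exit code
itself, while on Windows it is whatever the shell returns. Here is
a rough sketch of decoding it. The helper exit_code_from_system is
a made-up name for illustration, the Python interpreter stands in
for water, and the quoting of the command assumes a POSIX shell:

```python
import os
import sys


def exit_code_from_system(status):
    """Turn the value returned by os.system() into a plain exit code.

    Hypothetical helper, not part of Biopython.
    """
    if os.name == "posix":
        if os.WIFEXITED(status):
            # Normal exit: the exit code sits in the high byte.
            return os.WEXITSTATUS(status)
        # Otherwise assume the child was killed by a signal and
        # report it as a negative number, like subprocess does.
        return -os.WTERMSIG(status)
    # On other platforms pass the value through unchanged.
    return status


# Stand-in command: a child Python process that exits with code 3.
cmd = '"%s" -c "import sys; sys.exit(3)"' % sys.executable
status = os.system(cmd)
code = exit_code_from_system(status)
print("Raw status: %i, exit code: %i" % (status, code))
```

If all you care about is zero versus non-zero, the raw value from
os.system() is already enough.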

Even though the Python documentation seems to discourage it,
using os.system() seems simple, robust, and cross-platform. We
could even update the tutorial now and post it online - it should
make some people's lives a little easier.
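
For comparison, the subprocess route from the quoted discussion
(output sent to the null device so that nothing can fill a pipe
and deadlock, plus an optional check_call() style test of the
return code) might look roughly like this. It is only a sketch
and not what is on the app-exec2 branch: run_quietly is a
hypothetical helper, and the Python interpreter stands in for the
EMBOSS tool:

```python
import os
import subprocess
import sys


def run_quietly(args, check=False):
    """Run args with stdout and stderr discarded; return the exit code.

    Hypothetical helper. If check is true, a non-zero exit code raises
    subprocess.CalledProcessError, mimicking subprocess.check_call().
    """
    devnull = open(os.devnull, "w")
    try:
        # No pipes are involved, so the child cannot block on a full buffer.
        return_code = subprocess.call(args, stdout=devnull, stderr=devnull)
    finally:
        devnull.close()
    if check and return_code != 0:
        raise subprocess.CalledProcessError(return_code, args)
    return return_code


# Stand-in for a chatty tool: lots of output, then exit code 2.
noisy = [sys.executable, "-c", "print('x' * 100000); raise SystemExit(2)"]
return_code = run_quietly(noisy)
print("Return code: %i" % return_code)
```

With a real wrapper object one would presumably pass
str(water_cmd) together with shell=True instead of an argument
list.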

[Note this is actually a silly example, I should be telling water to
output to a file, not stdout which is then ignored.]

Peter



