[Biopython-dev] ApplicationResult and generic_run obsolete?

Peter biopython at maubp.freeserve.co.uk
Thu Aug 6 14:39:33 UTC 2009


On Tue, Aug 4, 2009 at 8:29 PM, Peter<biopython at maubp.freeserve.co.uk> wrote:
> On Thu, Jul 9, 2009 at 10:18 AM, Peter<biopython at maubp.freeserve.co.uk> wrote:
>> On Wed, Jul 8, 2009 at 2:06 PM, Brad Chapman<chapmanb at 50mail.com> wrote:
>>> How about adding a function like "run_arguments" to the
>>> commandlines that returns the commandline as a list.
>>
>> That would be a simple alternative to my vague idea "Maybe we
>> can make the command line wrapper object more list like to make
>> subprocess happy without needing to create a string?", which may
>> not be possible. Either way, this will require a bit of work on the
>> Bio.Application parameter objects...
>
> By defining an __iter__ method, we can make the Biopython
> application wrapper object sufficiently list-like that it can be
> passed directly to subprocess. I think I have something working
> (only tested on Linux so far), at least for the case where none
> of the arguments have spaces or quotes in them.

The current Bio.Application code is built around generating command
line strings, and works fine cross platform. Making the Bio.Application
objects "list like" and getting this to work cross platform isn't
looking easy. Spaces on Windows are causing me big headaches.
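
To make the "list like" idea concrete, here is a minimal sketch. The
class name, constructor and "-name=value" option style are all
hypothetical and are not the real Bio.Application API; the only point
is that defining __iter__ lets list() (and anything else that iterates
over its arguments) see the wrapper as a sequence of tokens.

```python
class ListLikeCommandline(object):
    """Hypothetical stand-in for a command line wrapper object.

    This is NOT the Bio.Application API, just an illustration.
    """

    def __init__(self, program, **options):
        self.program = program
        self.options = options

    def __iter__(self):
        # Yield the executable first, then one "-name=value" token per
        # option (sorted so that the order is reproducible).
        yield self.program
        for name in sorted(self.options):
            yield "-%s=%s" % (name, self.options[name])


# Made-up program and file names, with no spaces or quotes involved.
cline = ListLikeCommandline("clustalw2", infile="example.fasta")
tokens = list(cline)
print(tokens)
```

Each token stays a separate item, so no quoting decisions are made at
this stage. That is the easy case; arguments containing spaces or
quotes are where the trouble starts.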

Switching to lists of arguments appears to work fine on Unix
(specifically tested on Linux and Mac OS X), but things are more
complicated on Windows. Basically using an array/list of arguments is
normal on Unix, but on Windows things get passed as strings. The
upshot is different Windows tools (or libraries used to compile them)
have to parse their command line string themselves, so different tools
do it differently. The result is you *may* need to adopt different
spaces/quotes escaping for different command line tools on Windows.

Now, if you give subprocess a list, on Windows it must first be turned
into a string, before subprocess can use the Windows API to run it.
The subprocess function list2cmdline does this, but the conventions it
follows are not universal.
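
As a small illustration of the convention list2cmdline follows, the
snippet below passes it an argument containing a space. The program
and file names are made up, and the hand-written string is only one
plausible alternative quoting, not necessarily what any particular
tool expects.

```python
import subprocess

# Hypothetical ClustalW-style arguments; the file name contains a space.
args = ["clustalw2", "-infile=my file.fasta"]

# list2cmdline wraps the whole argument in double quotes.
as_string = subprocess.list2cmdline(args)
print(as_string)  # clustalw2 "-infile=my file.fasta"

# A hand-written command line might quote only the value instead.
by_hand = 'clustalw2 -infile="my file.fasta"'
print(as_string == by_hand)  # False

# Putting the quotes into the list item does not help, because
# list2cmdline then escapes them with backslashes.
escaped = subprocess.list2cmdline(["clustalw2", '-infile="my file.fasta"'])
print(escaped)  # clustalw2 "-infile=\"my file.fasta\""
```

So a string that quotes only the value cannot be produced from a list
this way, whichever way the list items are written.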

I have examples of working command line strings for ClustalW and PRANK
where both the executable and some of the arguments have spaces in
them. It seems the quoting I was using to make ClustalW (or PRANK)
happy cannot be achieved via subprocess.list2cmdline (and I suspect
this applies to other tools too).

I will try and look into this further. However, even if it is
possible, I don't think we can implement the list approach in time for
Biopython 1.51, as there are just too many potential pitfalls.

I have in the meantime extended the command line tool unit tests
somewhat to include more examples with spaces in the filenames.

[I'm beginning to think replacing Bio.Application.generic_run with a
simpler helper function would be easier in the short term, continuing
to just use a string with subprocess, but haven't given up yet.]
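
A rough sketch of what such a string-based helper might look like is
below. The function name and return value are assumptions for
illustration only, not a proposed Biopython API. It assumes the caller
has already quoted the string appropriately for the tool and platform.

```python
import subprocess
import sys


def run_command_string(cmd):
    """Hypothetical helper: run a command line string with subprocess.

    Returns a tuple of (return code, stdout text, stderr text).
    """
    child = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        # On Windows the string is handed to the OS as is; elsewhere a
        # shell is needed to split the string into arguments.
        shell=(sys.platform != "win32"),
    )
    out, err = child.communicate()
    return child.returncode, out, err


# Example: run the current Python interpreter, quoting its path in
# case it contains spaces.
code, out, err = run_command_string('"%s" -c "print(42)"' % sys.executable)
```

This keeps all quoting decisions in the string, which is what the
existing wrappers already generate.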

Peter


