Proposal: Checked subprocess calls

Chris Angelico

14 Oct 2016

12:18 p.m.

Branch: rosuav/process-check-run

Two features added to the Process module. Firstly, a simple wrapper Process.check_run that calls Process.run and throws an error if the exit code isn't 0; and secondly, a means for Process.run() to leave stderr attached to the console. The intention is for this to be used for subprocess invocations that should normally succeed, but which might fail under exceptional circumstances:

https://github.com/Rosuav/shed/blob/master/double_compile.pike

string build = Process.run(({"pike/bin/pike", argv[0], "-x", argv[1]}))->stdout;

The intention is for this to run a process and return its stdout. I could, of course, manually check:

mapping rc = Process.run(({"pike/bin/pike", argv[0], "-x", argv[1]}));
if (rc->stderr != "") werror(rc->stderr);
if (rc->exitcode) exit(1, "Oops\n");

But with this proposal, the code would look like this:

string build = Process.check_run(({"pike/bin/pike", argv[0], "-x", argv[1]}), (["stderr": "-"]))->stdout;

It's a fairly simple idiom (and it might be worth making check_run set stderr to "-" by default), and 99% of the time, it'll do the same as the equally-simple idiom that I currently use for any quick-and-dirty scripts - but if something goes wrong, check_run will make sure that you don't miss noticing it.

There are, broadly speaking, two ways for a subprocess to signal that something went wrong: a message on stderr, or an exit code. They're handled independently (check_run looks for the exit code, and setting stderr to "-" lets the user notice an error message), so you can pick and choose.
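
For anyone who wants to see the shape of the exit-code half without checking out the branch, here is a minimal application-level sketch. It is not the code on the branch: the name checked_run and the error message are made up for illustration, and the only things relied on are Process.run and the exitcode and stdout fields of its result. The stderr "-" modifier is part of the proposal, so it is deliberately not used here.

```pike
// Sketch only: a check_run-style wrapper written at application level.
// The function on the branch may differ in detail.
mapping checked_run(array(string) cmd, void|mapping modifiers)
{
  mapping result = Process.run(cmd, modifiers || ([]));
  if (result->exitcode)
    error("Command %O exited with code %d\n", cmd * " ", result->exitcode);
  return result;
}

int main()
{
  // Normal case: behaves just like Process.run.
  write("%s", checked_run(({"echo", "hello"}))->stdout);

  // Failure case: "false" exits non-zero, so an error is thrown.
  mixed err = catch { checked_run(({"false"})); };
  if (err)
    werror("Caught: %s", describe_error(err));
  return 0;
}
```

The point of throwing rather than returning the exit code is that a caller who only wants ->stdout cannot silently ignore a failed run.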

Is this something wanted in trunk?

ChrisA

Stephen R. van den Berg

15 Oct

9:12 a.m.

Chris Angelico wrote:

...

Two features added to the Process module. Firstly, a simple wrapper Process.check_run that calls Process.run and throws an error if the exit code isn't 0;

This is a bit overkill, I'd say. It's not generic enough to put in the lib. What if you want to check for a certain range of exitcodes instead? If you want this, I'd say you inherit the class and add your own convenience functions to it, but at application level.

...

and secondly, a means for Process.run() to leave stderr attached to the console. The intention is for this to be used

That seems useful, so I'd welcome that. But then support it for stdout too. Then again, maybe this is better off generalised, into requiring/allowing you to specify things like:

({"stdout":Stdio.stdout, "stderr":Stdio.stderr})

instead of the magic "-".

This would support adding different objects too, in order to redirect directly into a file or some other pipe.

-- Stephen.

Chris Angelico

10 a.m.

On Sat, Oct 15, 2016 at 8:12 PM, Stephen R. van den Berg srb@cuci.nl wrote:

...

Chris Angelico wrote:

...
Two features added to the Process module. Firstly, a simple wrapper Process.check_run that calls Process.run and throws an error if the exit code isn't 0;

This is a bit overkill, I'd say. It's not generic enough to put in the lib. What if you want to check for a certain range of exitcodes instead? If you want this, I'd say you inherit the class and add your own convenience functions to it, but at application level.

There's a strong convention that zero == success, nonzero == failure, so this would apply as-is to a lot of programs. Obviously this shouldn't be the one and only way to run a subprocess (this is NOT a proposed change to Process.run, it's a separate function), so if you want to invoke grep(1) and accept 0 (lines found) and 1 (no lines found) but not 2 (error), then you'd make an app-specific function; but 0 vs other is common enough that I think this fits the standard library.
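
Such an app-specific function could be as small as the sketch below. The name grep_lines is hypothetical; the code only assumes grep's usual exit statuses (0 for a match, 1 for no match, anything else for an error) and the existing Process.run interface.

```pike
// Hypothetical application-level helper: accept grep's "no lines found"
// status (1) as a normal result and treat every other non-zero status
// as an error.
array(string) grep_lines(string pattern, string input)
{
  mapping r = Process.run(({"grep", "-e", pattern}), (["stdin": input]));
  if (r->exitcode != 0 && r->exitcode != 1)
    error("grep exited with code %d: %s\n", r->exitcode, r->stderr);
  // Split into lines and drop empty strings, including the one left
  // after the final newline.
  return (r->stdout / "\n") - ({""});
}

int main()
{
  write("%O\n", grep_lines("test", "a test\nnothing\nanother test\n"));
  write("%O\n", grep_lines("absent", "a test\n"));
  return 0;
}
```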

Example of prior art: Python's subprocess.check_call and check_output functions raise an exception on non-zero return value:

https://docs.python.org/3/library/subprocess.html#subprocess.CalledProcessEr...

...

...
and secondly, a means for Process.run() to leave stderr attached to the console. The intention is for this to be used

That seems useful, so I'd welcome that. But then support it for stdout too. Then again, maybe this is better off generalised, into requiring/allowing you to specify things like:

({"stdout":Stdio.stdout, "stderr":Stdio.stderr})

instead of the magic "-".

This would support adding different objects too, in order to redirect directly into a file or some other pipe.

If you don't want to redirect either, or if you want to redirect them both to files, use Process.Process or Process.create_process directly. The advantage of Process.run is that it captures the output. I suppose you might conceivably want to capture stderr but leave stdout attached to the console, but it's a lot less common than "run program, give it input, retrieve output, but if it displays an error, let that be seen". I can't think of any use-cases for the converse. Process.run() is great, but there are a lot of times when I have to either replicate half of its code, or use run() and risk squashing an unexpected error. With check_run, I'd be able to have virtually the same API, but with the declaration that a program error is an exception.

Consider a simple way to get audio file information:

string info = Process.run(({"soxi", "audio_01.wav"}))->stdout;
// proceed to parse the given info, eg:
sscanf(info, "%*[\n]%{%s: %s\n%}", array lines);
mapping fields = (mapping)lines;

If soxi is not available, this will raise an immediate exception, rather than charging on blindly; but if audio_01.wav isn't found, there's no indication of the actual problem - you just don't get any useful output. Using check_run causes an instant failure, saying that the process exited 1; and keeping stderr attached to the console would let the user see the message from soxi:

string info = Process.check_run(({"soxi", "audio_01.wav"}), (["stderr": "-"]))->stdout;

In fact, this usage could itself be wrapped up another level, if desired. I've tossed another commit onto the branch to add a check_output function, but this one is less significant (it's just a one-liner). With check_output, omitting the modifiers mapping gives the natural and obvious behaviour of the above line of code.

ChrisA

Stephen R. van den Berg

16 Oct

9:12 a.m.

Chris Angelico wrote:

...

There's a strong convention that zero == success, nonzero == failure, so this would apply as-is to a lot of programs. Obviously this shouldn't be the one and only way to run a subprocess (this is NOT a proposed change to Process.run, it's a separate function), so if you want to invoke grep(1) and accept 0 (lines found) and 1 (no lines found) but not 2 (error), then you'd make an app-specific function; but 0 vs other is common enough that I think this fits the standard library.

...

Example of prior art: Python's subprocess.check_call and check_output functions raise an exception on non-zero return value:

Well, ok, fair enough. But then, try to improve on the interface and preferably make it work like this: (unless there already is an easy Pike-API for this, I'm not intimately familiar with the Process-group)

// This leaves stdin and stdout and stderr unaltered
Process.pipe.run("fgrep -e test").run("sort").run("wc");

string s = "Your.input.text";

// This leaves stdout and stderr unaltered
Process.pipe.stdin(s).run("fgrep -e test").run("sort").run("wc");

Stdio.File in = Stdio.File("foo.bar.file");
Stdio.File out = Stdio.FakeFile();

// This leaves stderr unaltered
Process.pipe.stdin(in).run("fgrep -e test").run("sort")
  .run("wc").stdout(out);
string output = out->read();

Stdio.File in = Stdio.File("foo.bar.file");
Stdio.File out = Stdio.FakeFile();
Stdio.File devnull = Stdio.File("/dev/null","w");

// This leaves stderr unaltered, except for the "sort" run, we silence it there
Process.pipe.stdin(in).run("fgrep -e test").stderr(devnull).run("sort")
  .run("wc").stdout(out);
string output = out->read();

Stdio.File in = Stdio.File("foo.bar.file");
Stdio.File out = Stdio.FakeFile();
Stdio.File devnull = Stdio.File("/dev/null","w");
Stdio.File sortout = Stdio.FakeFile();

// This leaves stderr unaltered, except for the "sort" run, we silence it there
// The stdout of sort is copied to both the sortout file, and to the wc
// process (compare "man 1 tee")
Process.pipe.stdin(in).run("fgrep -e test").stderr(devnull).run("sort")
  .tee(sortout).run("wc").stdout(out);
string sortoutput = sortout->read();
string output = out->read();

In all this, yes, throw exceptions if any of the processes returns non-zero exitcodes. In essence the above interface would allow you to run arbitrarily complex (shell-like) pipes, basically supporting everything bash does too. Maybe the only thing missing here would be the ability to ignore certain signals per remainder of the process train, e.g.:

Process.pipe.stdin(in).blocksignal(SIGHUP).run("fgrep -e test")
  .stderr(devnull).run("sort").tee(sortout)
  .unblocksignal(SIGHUP).run("wc").stdout(out);

Which would tie SIGHUP to SIG_IGN for fgrep and sort, and allow it through again for wc.

As an extra convenience function, I could imagine this:

string output = Process.pipe.stdin(in).run("fgrep -e test").run("sort")
  .run("wc").stdoutstring;

Which would generate that output string you are after in one-go.

-- Stephen.

Chris Angelico

10:08 p.m.

On Sun, Oct 16, 2016 at 8:12 PM, Stephen R. van den Berg srb@cuci.nl wrote:

...

Well, ok, fair enough. But then, try to improve on the interface and preferably make it work like this: (unless there already is an easy Pike-API for this, I'm not intimately familiar with the Process-group)

// This leaves stdin and stdout and stderr unaltered
Process.pipe.run("fgrep -e test").run("sort").run("wc");

If Pike were a shell language, this would make sense. But I would much prefer this:

sizeof(Process.check_output("fgrep -e test") / "\n");

Pike isn't primarily about invoking subprocesses; it has a rich set of text processing primitives built-in, so trying to make subprocess chaining smoother is usually a waste of effort.

ChrisA

Stephen R. van den Berg

17 Oct

8:07 p.m.

Chris Angelico wrote:

...

...
// This leaves stdin and stdout and stderr unaltered
Process.pipe.run("fgrep -e test").run("sort").run("wc");

...

If Pike were a shell language, this would make sense. But I would much

It would make sense, for any programming language, not only for shell languages.

...

prefer this:

...

sizeof(Process.check_output("fgrep -e test") / "\n");

...

Pike isn't primarily about invoking subprocesses; it has a rich set of text processing primitives built-in, so trying to make subprocess chaining smoother is usually a waste of effort.

If it were a *lot* of effort, I'd agree. But I'm guessing that it actually is easier to implement than you might expect; and then it makes for a very clutter-free and straightforward way to start one or more (piped) processes.

-- Stephen.

Peter Bortas

6 Nov

8:15 p.m.

We discussed this a bit during the Pike Conference. These are my thoughts on it:

On Mon, Oct 17, 2016 at 10:07 PM, Stephen R. van den Berg srb@cuci.nl wrote:

...

Chris Angelico wrote:

...
...
// This leaves stdin and stdout and stderr unaltered
Process.pipe.run("fgrep -e test").run("sort").run("wc");

...
If Pike were a shell language, this would make sense. But I would much

It would make sense, for any programming language, not only for shell languages.

The pipe syntax seems interesting, though it probably has some corner cases where you could create objects in which several parts of the pipe try to run at the same time.

...

...
prefer this:

...
sizeof(Process.check_output("fgrep -e test") / "\n");

That's one of those places where we would benefit from an easier-to-use interface for Process.Process. Process.run sucks up memory for problems that don't really need it. It should preferably be streamed and/or auto-iterated, more in the manner of:

count( "\n", Process.check_output("fgrep -e test") );

where this in effect becomes a line or character iterator and keeps at most one line in memory at a time.
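
To make the streaming idea concrete, here is a rough sketch of what it costs to do by hand today with Process.create_process and a pipe. The helper name count_newlines and the 8192-byte chunk size are assumptions made for illustration; nothing like this exists in the Process module. It reads the child's stdout in chunks and counts newlines, so only one chunk is held in memory at a time.

```pike
// Hypothetical helper, not part of the Process module: count the newlines
// a command writes to stdout without buffering the whole output.
int count_newlines(array(string) cmd)
{
  Stdio.File out = Stdio.File();
  Stdio.File child_end = out->pipe(Stdio.PROP_IPC);
  if (!child_end)
    error("Could not create pipe\n");

  Process.create_process proc =
    Process.create_process(cmd, (["stdout": child_end]));
  // The parent must close its copy of the write end, or it never sees EOF.
  child_end->close();

  int newlines = 0;
  string chunk;
  // read(n, 1) returns as soon as some data is available, "" at end of file.
  while ((chunk = out->read(8192, 1)) && sizeof(chunk))
    newlines += String.count(chunk, "\n");
  out->close();

  int rc = proc->wait();
  if (rc)
    error("%s exited with code %d\n", cmd * " ", rc);
  return newlines;
}

int main()
{
  write("%d\n", count_newlines(({"ls", "/"})));
  return 0;
}
```

Getting the pipe handling right (closing the write end, draining before waiting) is exactly the boilerplate an easier interface would hide.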

...

...
Pike isn't primarily about invoking subprocesses; it has a rich set of text processing primitives built-in, so trying to make subprocess chaining smoother is usually a waste of effort.

If it were a *lot* of effort, I'd agree. But I'm guessing that it actually is easier to implement than you might expect; and then it makes for a very clutter-free and straightforward way to start one or more (piped) processes.

The pipe syntax and Chris's easy-to-understand convenience function seem orthogonal, with Chris's API being the easier one to grasp. So I'd be glad to see versions of both eventually go in. I'm not overly enthused about continuing the tradition I introduced with Process.run of just buffering everything in memory if it can be avoided, though.

Chris: What you have seems generally useful, but it lies in a namespace that will get a bit busy if we implement all the special cases as we think of them. I have a similar function, not checked in, that would confuse users if we both committed. Not least because I find exceptions useless for most smaller scripts unless I just plan on not catching anything. That means my scripts are full of very similar code, but where the functions return 0 on failure instead of throwing, and dump the failure pretty-printed to the console. That is something I've also been planning to add to Process for a while, but I have not come up with a set of functions that doesn't make it confusing for users to choose among all the stuff.

As I see it there are a few things that should happen in regards to external process spawning:

1. Can we come up with an almost as easy to use API as Process.run that plays better with memory and latency?

The stupidest version of that would be a call where you specify the number of bytes or lines to read, and which then hands back the result together with a function to call for the next portion. That's real stupid, but still preferable to playing with Process() directly and risking a lock-up because you didn't handle your pipes perfectly.

2. Make more convenience-APIs

I count both Stephen's pipes and Chris's easy-call APIs among these.

(1) has to be thought about before we do (2), because depending on the solution for (1) we could otherwise end up with duplicate or triplicate APIs for the same thing. I promise to pour some real energy into thinking about this in a few weeks when I have time again, but meanwhile please jump in with your ideas about what problems you are trying to solve and how you imagine them solved.

Regards,

-- Peter Bortas, NSC

Chris Angelico

10:33 p.m.

On Mon, Nov 7, 2016 at 7:15 AM, Peter Bortas bortas@gmail.com wrote:

...

Chris: What you have seems generally useful, but it lies in a namespace that will get a bit busy if we implement all the special cases as we think of them. I have a similar function, not checked in, that would confuse users if we both committed. Not least because I find exceptions useless for most smaller scripts unless I just plan on not catching anything. That means my scripts are full of very similar code, but where the functions return 0 on failure instead of throwing, and dump the failure pretty-printed to the console. That is something I've also been planning to add to Process for a while, but I have not come up with a set of functions that doesn't make it confusing for users to choose among all the stuff.

Since this is of interest, I've rebased the branch onto current 8.1, so we have a clean starting point for discussion.

...

As I see it there are a few things that should happen in regards to external process spawning:

Can we come up with an almost as easy to use API as Process.run that plays better with memory and latency?

My first thought along those lines is to have something that returns a pipe or buffer for stdout. Maybe a specially-enhanced one that also deals with the return value?

Ideally, it should have a single return value that can be used directly for the most obvious usage, which is receiving stdout. Receiving stderr and/or the return value would ideally be possible, but if that isn't possible, I don't mind a convenience function that you have to set aside when you want more control.

More of a brainstorm or stream-of-consciousness than an actual theory, but that's my thinking.

ChrisA


pike-devel@lists.lysator.liu.se
