Hi all,
I’m sometimes annoyed by the trampoline garbage that results when referencing variables in the surrounding scope from a lambda function (“closure”). Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
  function foo()
  {
    string var1;
    mapping var2;
    string bar = "whatever";
    function f = lambda(string arg) { write("%O, %O\n", bar, arg); };
    // ... other code that prevents tail-call optimization ...
    return f;
  }

  int main(int argc, array argv)
  {
    function f = foo();
    f("gazonk");
    f = 0;
    // The lambda and the variables referenced in the frame used to
    // execute "foo" are now garbage.
    return 0;
  }
A possible workaround is to cut the reference to the “foo” frame:
  function f = lambda(string var1, string var2) {
    return lambda(string arg) { write("%O, %O, %O\n", var1, var2, arg); };
  }(var1, var2);
However, this is pretty verbose. Maybe it would make sense to introduce some kind of syntactic sugar, for example:
lambda[var1, var2](string arg) {
};
This could perhaps translate to the construct above, and I think this syntax would map pretty well to how explicitly captured variables work in other languages. Some languages allow renaming of captured variables – not sure if it’s necessary or not. "lambda[myvar1 = var1, myvar2 = var2]() {}" or something.
Ideally, variables from the surrounding scope not present in the capture list should not be usable in the lambda, to avoid unintended trampolines. Obviously, this would follow the same pass-by-value/pass-by-reference semantics of the various Pike types as in function arguments, which I personally think is a good thing. (A special case would be "lambda[](string arg) { }", which would prevent referencing any variables from surrounding scopes at all.)
What do you all think?
/Marty
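The remark about pass-by-value/pass-by-reference semantics can be illustrated with the existing workaround. This is only a sketch: the names s_copy and m_copy are made up for illustration, and it assumes a capture list would behave exactly like the argument passing shown here. A string snapshot is unaffected by later assignment to the local, while a mapping is shared, so in-place changes remain visible.

```pike
int main()
{
  string s = "old";
  mapping(string:string) m = ([ "key": "old" ]);

  // Snapshot s and m by passing them as arguments (the workaround above).
  // s_copy and m_copy are hypothetical names used only in this sketch.
  function f =
    lambda(string s_copy, mapping(string:string) m_copy) {
      return lambda() { return sprintf("%s, %s", s_copy, m_copy->key); };
    }(s, m);

  s = "new";       // Rebinds the local only; s_copy still holds "old".
  m->key = "new";  // Mutates the shared mapping; visible through m_copy.

  write("%s\n", f());  // prints: old, new
  return 0;
}
```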
On Thu, Oct 27, 2016 at 6:08 AM, Martin Karlgren marty@roxen.com wrote:
> A possible workaround is to cut the reference to the “foo” frame:
> function f = lambda(string var1, string var2) { return lambda(string arg) { write("%O, %O, %O\n", var1, var2, arg); }; }(var1, var2);
> However, this is pretty verbose.
More significantly, this is *early binding* semantics. It captures the current values of var1 and var2, and won't notice any other changes.
> Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
This, however, could be changed - it's a simple question of optimization, so it comes down to "is it worth it". There's no semantic change there, AFAIK.
ChrisA
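The difference between the two binding behaviours discussed above can be sketched as follows. This is a minimal illustration, not code from the thread: the parameter name snapshot is hypothetical, and the example assumes the current lambda semantics, where an ordinary lambda refers to the variable in the enclosing frame.

```pike
int main()
{
  string var1 = "before";

  // Ordinary lambda: refers to var1 in the enclosing frame (late binding).
  function late = lambda() { return var1; };

  // Workaround: the current value is passed as an argument to an outer
  // lambda, so the inner lambda only sees that value (early binding).
  // "snapshot" is a hypothetical name used only in this sketch.
  function early =
    lambda(string snapshot) {
      return lambda() { return snapshot; };
    }(var1);

  var1 = "after";

  write("late:  %O\n", late());   // "after"
  write("early: %O\n", early());  // "before"
  return 0;
}
```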
On 26 Oct 2016, at 21:23, Chris Angelico rosuav@gmail.com wrote:
> On Thu, Oct 27, 2016 at 6:08 AM, Martin Karlgren marty@roxen.com wrote:
>> A possible workaround is to cut the reference to the “foo” frame:
>> function f = lambda(string var1, string var2) { return lambda(string arg) { write("%O, %O, %O\n", var1, var2, arg); }; }(var1, var2);
>> However, this is pretty verbose.
> More significantly, this is *early binding* semantics. It captures the current values of var1 and var2, and won't notice any other changes.
Yep - don't know about other people but I don't think I've ever really wanted "late binding", so I think that's a good thing.
>> Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
> This, however, could be changed - it's a simple question of optimization, so it comes down to "is it worth it". There's no semantic change there, AFAIK.
Correct, but I'm not sure if that theoretically simple optimization is feasible in practice.
/Marty
On Thu, Oct 27, 2016 at 7:12 AM, Martin Karlgren marty@roxen.com wrote:
> On 26 Oct 2016, at 21:23, Chris Angelico rosuav@gmail.com wrote:
>> On Thu, Oct 27, 2016 at 6:08 AM, Martin Karlgren marty@roxen.com wrote:
>>> A possible workaround is to cut the reference to the “foo” frame:
>>> function f = lambda(string var1, string var2) { return lambda(string arg) { write("%O, %O, %O\n", var1, var2, arg); }; }(var1, var2);
>>> However, this is pretty verbose.
>> More significantly, this is *early binding* semantics. It captures the current values of var1 and var2, and won't notice any other changes.
> Yep - don't know about other people but I don't think I've ever really wanted "late binding", so I think that's a good thing.
I have, often. It's also the same semantics as most other languages have for their closures. The most normal way to work with closures should be late-binding and writeable.
Whether it's worth having a "snapshot" syntax that is strictly syntactic sugar for the above is a separate question. In the times where you *want* early binding, the construct above is the one obvious way to do it, and as you say, it's pretty verbose.
ChrisA
On 26 Oct 2016, at 22:19, Chris Angelico rosuav@gmail.com wrote:
> On Thu, Oct 27, 2016 at 7:12 AM, Martin Karlgren marty@roxen.com wrote:
>> On 26 Oct 2016, at 21:23, Chris Angelico rosuav@gmail.com wrote:
>>> On Thu, Oct 27, 2016 at 6:08 AM, Martin Karlgren marty@roxen.com wrote:
>>>> A possible workaround is to cut the reference to the “foo” frame:
>>>> function f = lambda(string var1, string var2) { return lambda(string arg) { write("%O, %O, %O\n", var1, var2, arg); }; }(var1, var2);
>>>> However, this is pretty verbose.
>>> More significantly, this is *early binding* semantics. It captures the current values of var1 and var2, and won't notice any other changes.
>> Yep - don't know about other people but I don't think I've ever really wanted "late binding", so I think that's a good thing.
> I have, often. It's also the same semantics as most other languages have for their closures. The most normal way to work with closures should be late-binding and writeable.
Alright. After looking at the syntax of a couple of other languages, though, it seems that [var1] is most often used for "by value" bindings and [&var1] for "by reference".
Possibly, the [&var1] syntax could be allowed in Pike too, to get the binding semantics of today's lambda but with explicit enumeration of captured variables (with compiler errors if other variables are referenced, otherwise it'd be useless). Cutting the reference to the frame (holding refs to all the other local variables) is probably a different story though.
/Marty
Chris Angelico wrote:
>> Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
> This, however, could be changed - it's a simple question of optimization, so it comes down to "is it worth it". There's no semantic change there, AFAIK.
I'd expect a good optimiser to spot this and to optimise accordingly.
Hi,
On 27 Oct 2016, at 20:32, Stephen R. van den Berg srb@cuci.nl wrote:
> Chris Angelico wrote:
>>> Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
>> This, however, could be changed - it's a simple question of optimization, so it comes down to "is it worth it". There's no semantic change there, AFAIK.
> I'd expect a good optimiser to spot this and to optimise accordingly.
I got bitten by trampoline garbage again and decided to give the optimization a shot. I pushed it to the branch marty/lambdaopt, and it seems to fix my test case (no more trampoline garbage). It would be great if someone with compiler insight could do a code review (and hopefully it can be merged to 8.1 eventually).
(Btw: is 8.1 officially on C99 now? I've rarely seen line comments or inline variable declarations in commits, but maybe it's just old habit. :-) )
/Marty
On Mon, Oct 31, 2016 at 8:53 AM, Martin Karlgren marty@roxen.com wrote:
> (Btw: is 8.1 officially on C99 now? I've rarely seen line comments or inline variable declarations in commits, but maybe it's just old habit. :-) )
I'd like to know this too, actually. The code I'm currently working on for GTK2 will benefit significantly from C99's variable-length arrays. I'm coding it to use them, but if it turns out I can't use C99, I'll have to use heap memory instead (or just dump a big array onto the stack and then check for a max length).
ChrisA
> (Btw: is 8.1 officially on C99 now? I've rarely seen line comments or inline variable declarations in commits, but maybe it's just old habit. :-) )
On the libc side, yes. I would like the code to be there as well, but there is some confusion over what needs to be done to get MSVC to compile it.
pike-devel@lists.lysator.liu.se