Lambdas, captured variables - Pike-devel

26 Oct 2016


Hi all,
I’m sometimes annoyed by the trampoline garbage that results when referencing variables in the surrounding scope from a lambda function (“closure”). Also, all variables in the frame referenced by the lambda will be kept around, even if just a single one is actually used in the lambda.
function foo()
{
  string var1;
  mapping var2;
  string bar = "whatever";
  function f = lambda(string arg)
               {
                 write("%O, %O\n", bar, arg);
               };
  // … other code that prevents tail-call optimization ...
  return f;
}
int main(int argc, array argv)
{
  function f = foo();
  f("gazonk");
  f = 0;
  // The lambda and the variables referenced in the frame used to execute "foo" are now garbage.
}
A possible workaround is to cut the reference to the “foo” frame:
function f = lambda(string var1, mapping var2)
             {
               return lambda(string arg)
                      {
                        write("%O, %O, %O\n", var1, var2, arg);
                      };
             }(var1, var2);
However, this is pretty verbose. Maybe it would make sense to introduce some kind of syntactic sugar, for example:
lambda[var1, var2](string arg)
{
};
This could perhaps translate to the construct above, and I think this syntax would map pretty well to how explicitly captured variables work in other languages. Some languages allow renaming of captured variables – not sure if it’s necessary or not. "lambda[myvar1 = var1, myvar2 = var2]() {}" or something.
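To make the renaming idea concrete, here is a rough sketch of what the hand-written equivalent might look like today. The bracket form in the comment is only the hypothetical syntax proposed above, and make_printer is a made-up name for illustration; nothing here reflects an existing Pike feature beyond plain nested lambdas.

```pike
// Hypothetical proposed form (not valid Pike today):
//   function f = lambda[myvar1 = var1, myvar2 = var2](string arg) { ... };

// make_printer is an illustrative name, not part of any Pike API.
function make_printer(string var1, mapping var2)
{
  // Hand-written equivalent: the outer lambda's parameters play the
  // role of the renamed captures, and it is called immediately.
  return lambda(string myvar1, mapping myvar2)
         {
           return lambda(string arg)
                  {
                    write("%O, %O, %O\n", myvar1, myvar2, arg);
                  };
         }(var1, var2);
}

int main()
{
  function f = make_printer("a", ([ "b": 1 ]));
  f("gazonk");
  return 0;
}
```

The inner lambda only refers to the small frame of the outer lambda, so the frame of make_printer is not kept alive by it.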
Ideally, variables from the surrounding scope not present in the capture list should not be usable in the lambda, to avoid unintended trampolines. Obviously, captures would follow the same pass-by-value/pass-by-reference semantics as function arguments do for the various Pike types, which I personally think is a good thing. (A special case would be "lambda[](string arg) { }", which would prevent referencing any variables from surrounding scopes at all.)
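The argument-passing semantics can already be observed with the workaround. The sketch below uses made-up names (name, opts, by_frame, by_copy) and assumes the usual Pike behaviour that strings are passed as values while mappings are shared by reference.

```pike
int main()
{
  string name = "before";
  mapping opts = ([ "n": 1 ]);

  // Ordinary closure: refers to the variables in main's frame.
  function by_frame = lambda()
                      {
                        return sprintf("%s %d", name, opts->n);
                      };

  // Workaround: the current values are bound as arguments of an outer
  // lambda at creation time (parameter names are illustrative).
  function by_copy = lambda(string captured_name, mapping captured_opts)
                     {
                       return lambda()
                              {
                                return sprintf("%s %d", captured_name,
                                               captured_opts->n);
                              };
                     }(name, opts);

  name = "after";   // rebinding the local: only by_frame sees this
  opts->n = 2;      // mutating the shared mapping: both see this

  write("%s\n", by_frame());  // expected: after 2
  write("%s\n", by_copy());   // expected: before 2
  return 0;
}
```

So a capture list built on argument passing would snapshot which value a variable refers to, while changes made inside a mapping, array or object would still be visible through the capture.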
What do you all think?
/Marty