Re: Optimised map

7 Apr 2016


      I have looked into this exact problem a while ago. One thing I noticed,
which might be relevant for your code (which I have not looked at too
closely, yet), is that in case of tail recursion frames can be replaced.
This might be something that your code needs to handle (for instance
when calling a recursive function with map).
When I attempted to improve the performance of function calls (and in
particular map/automap/filter/...) I started by refactoring the API for
handling frames and doing function calls. I came up with something like
frame_prepare(), frame_execute(), frame_return(), frame_pop(), etc.
I kept all the apply variants as fallbacks, which simply call the right
combination of these new functions. I then implemented the map
optimization similar to the one that you did. At some point I mentioned
this work on the list (or maybe also just in person) but I never pushed
it anywhere. I also never got around to finish this, but it is something
I would still like to find some spare time for. I will push my branch
now, maybe it is of interest.
Arne
On 04/06/16 00:12, Martin Karlgren wrote:
...
Hi,
Having realised the benefits of functional programming, I’ve been quite annoyed by the rumour of how expensive function calls are in Pike. I decided to look into f_map and could see how much seemingly unnecessary work it does when it calls apply_svalue once for each entry in the array – it should be possible to reuse the pike_frame if it’s about to be thrown away after the function call (if it has other refs on the other hand, it can’t be reused – it’s probably used as a scope in some other frame).
I’ve pushed my optimised variant in marty/optimised_map – it seems to work quite well and provides a major speedup. In fact, it’s a bit faster than the corresponding foreach variant. I haven’t verified correctness in various corner cases, and some input on whether it’s correct to do the things init_frame_reuse_context does only once before multiple function calls would be nice too. The *_reuse_context stuff in interpret.c should be applicable wherever the same svalue is applied repeatedly with the same number of arguments (I haven’t looked for it outside of f_map really).
What do you all think? Good idea or did I overlook something?
Without optimisation:
map: 1.660797
array index: 1.335115
array append: 1.17917
With optimisation:
map: 0.877659
array index: 1.351158
array append: 1.189812
Test program:
int main()
{
  array base = allocate(10000000, 1);
float gmap = gauge {
      array res = map (base, lambda(int i) { return i + 2; });
    };
float garrayindex = gauge {
      array res = allocate(sizeof(base));
      foreach (base; int idx; int i) {
        res[i] = i + 2;
      }
    };
float garrayappend = gauge {
      array res = ({});
      foreach (base, int i) {
        res += ({ i + 2 });
      }
    };
werror ("map: %O\n", gmap);
  werror ("array index: %O\n", garrayindex);
  werror ("array append: %O\n", garrayappend);
}
/Marty

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Re: Optimised map