Ah, very interesting. Now that you mention it, I recall seeing something about this on the list but apparently I failed to recall it before implementing something on my own… ;) Your refactoring looks much more appealing though, so I’ll definitely vote for your branch. What’s left to be sorted out?
Another idea I had, which I don’t know whether it’s feasible or not, was to populate a frame cache in the parent frame when a function is likely to be called repeatedly (e.g. inside foreach) – but let’s sort out the basic stuff first...
(I have very little spare time because of kids, but every once in a while I’ll make a “frustration hack” ;) )
/Marty
On 07 Apr 2016, at 10:49 , Arne Goedeke el@laramies.com wrote:
I have looked into this exact problem a while ago. One thing I noticed, which might be relevant for your code (which I have not looked at too closely, yet), is that in case of tail recursion frames can be replaced. This might be something that your code needs to handle (for instance when calling a recursive function with map).
When I attempted to improve the performance of function calls (and in particular map/automap/filter/...) I started by refactoring the API for handling frames and doing function calls. I came up with something like frame_prepare(), frame_execute(), frame_return(), frame_pop(), etc.
I kept all the apply variants as fallbacks, which simply call the right combination of these new functions. I then implemented the map optimization similar to the one that you did. At some point I mentioned this work on the list (or maybe also just in person) but I never pushed it anywhere. I also never got around to finish this, but it is something I would still like to find some spare time for. I will push my branch now, maybe it is of interest.
Arne
On 04/06/16 00:12, Martin Karlgren wrote:
Hi,
Having realised the benefits of functional programming, I’ve been quite annoyed by the rumour of how expensive function calls are in Pike. I decided to look into f_map and could see how much seemingly unnecessary work it does when it calls apply_svalue once for each entry in the array – it should be possible to reuse the pike_frame if it’s about to be thrown away after the function call (if it has other refs on the other hand, it can’t be reused – it’s probably used as a scope in some other frame).
I’ve pushed my optimised variant in marty/optimised_map – it seems to work quite well and provides a major speedup. In fact, it’s a bit faster than the corresponding foreach variant. I haven’t verified correctness in various corner cases, and some input on whether it’s correct to do the things init_frame_reuse_context does only once before multiple function calls would be nice too. The *_reuse_context stuff in interpret.c should be applicable wherever the same svalue is applied repeatedly with the same number of arguments (I haven’t looked for it outside of f_map really).
What do you all think? Good idea or did I overlook something?
Without optimisation: map: 1.660797 array index: 1.335115 array append: 1.17917
With optimisation: map: 0.877659 array index: 1.351158 array append: 1.189812
Test program:
int main() { array base = allocate(10000000, 1);
float gmap = gauge { array res = map (base, lambda(int i) { return i + 2; }); };
float garrayindex = gauge { array res = allocate(sizeof(base)); foreach (base; int idx; int i) { res[i] = i + 2; } };
float garrayappend = gauge { array res = ({}); foreach (base, int i) { res += ({ i + 2 }); } };
werror ("map: %O\n", gmap); werror ("array index: %O\n", garrayindex); werror ("array append: %O\n", garrayappend); }
/Marty