OK, I am not sure yet whether this is going anywhere this time, but I have made a second attempt at refactoring the current function call code into something more readable, more flexible and hopefully (at some point) faster. The branch is called 'arne/faster_calls_again'.
The API is similar to that of the previous version. The main difference is that it allocates Pike frames only when needed (not for efuns, casts, apply_array, etc.). The changes in the branch start by removing code which had been duplicated at some point to speed up certain types of calls. I then added a series of new functions to initialize, execute and deinitialize a callsite. After that, I started to use them in some of the existing function call APIs and also implemented that optimized version of f_map.
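To make the init/execute/deinit idea a bit more concrete, here is a rough sketch in C. It is not the code from the branch: every name in it (demo_callsite, demo_callsite_init, demo_frame and so on) is made up for illustration, and the "frame" is only a stand-in struct. It only shows the shape of the idea: a frame is allocated when the callee needs one, and a map-style loop sets the callsite up once and then executes it per element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the Pike API. */
enum demo_kind { DEMO_NATIVE, DEMO_SCRIPT };

/* Stand-in for an interpreter frame. */
struct demo_frame { int locals[4]; };

typedef int (*demo_fn)(struct demo_frame *fp, int arg);

struct demo_callsite {
    enum demo_kind kind;
    demo_fn fn;
    struct demo_frame *frame;   /* stays NULL for native callees */
};

/* Counts frame allocations so the effect is observable. */
static int demo_frames_allocated = 0;

/* Set up a call site; only script-level callees get a frame. */
static void demo_callsite_init(struct demo_callsite *cs,
                               enum demo_kind kind, demo_fn fn)
{
    cs->kind = kind;
    cs->fn = fn;
    cs->frame = NULL;
    if (kind == DEMO_SCRIPT) {
        cs->frame = calloc(1, sizeof *cs->frame);
        assert(cs->frame != NULL);
        demo_frames_allocated++;
    }
}

/* Run the callee; may be called repeatedly on the same call site. */
static int demo_callsite_execute(struct demo_callsite *cs, int arg)
{
    assert(cs->fn != NULL);
    return cs->fn(cs->frame, arg);
}

/* Release whatever init allocated (free(NULL) is a no-op). */
static void demo_callsite_free(struct demo_callsite *cs)
{
    free(cs->frame);
    cs->frame = NULL;
    cs->fn = NULL;
}

/* A native callee never touches the frame. */
static int demo_native_double(struct demo_frame *fp, int arg)
{
    (void)fp;
    return arg * 2;
}

/* A script-level callee keeps its argument in the frame. */
static int demo_script_add_one(struct demo_frame *fp, int arg)
{
    fp->locals[0] = arg;
    return fp->locals[0] + 1;
}

/* Map-style loop: one init, one execute per element, one free. */
static void demo_map(const int *in, int *out, size_t n,
                     enum demo_kind kind, demo_fn fn)
{
    struct demo_callsite cs;
    size_t i;

    demo_callsite_init(&cs, kind, fn);
    for (i = 0; i < n; i++)
        out[i] = demo_callsite_execute(&cs, in[i]);
    demo_callsite_free(&cs);
}
```

In this sketch, mapping a script-level callee over an array allocates one frame in total rather than one per element, and a native callee allocates none.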
The code is probably broken in interesting ways; I have not tested it very well yet. It does compile and install.
Otherwise on my list right now:
* cleanup
* reduce API granularity somewhat
* re-add tracing and dtrace code (could all be in one place now)
* optimize automap
* try to optimize apply_array (e.g. when calling the same function in different instances of the same class)
* make tail recursion not allocate a new frame, instead just re-use the current one
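For the tail recursion item, the intent can be sketched roughly as follows. Again, this is not code from the branch: tail_frame and the tail_sum_* functions are hypothetical, and a real interpreter frame holds far more state. The sketch computes sum(n, acc) = n == 0 ? acc : sum(n - 1, acc + n) twice, once with a fresh frame per call and once by overwriting the arguments in the current frame and looping.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an interpreter frame holding the arguments. */
struct tail_frame { long n; long acc; };

/* Counts frame allocations so both variants can be compared. */
static int tail_frames_used = 0;

static struct tail_frame *tail_frame_new(long n, long acc)
{
    struct tail_frame *f = malloc(sizeof *f);
    assert(f != NULL);
    f->n = n;
    f->acc = acc;
    tail_frames_used++;
    return f;
}

/* Naive variant: every (tail) call allocates its own frame. */
static long tail_sum_naive(long n, long acc)
{
    struct tail_frame *f = tail_frame_new(n, acc);
    long result = (f->n == 0)
        ? f->acc
        : tail_sum_naive(f->n - 1, f->acc + f->n);
    free(f);
    return result;
}

/* Reusing variant: a tail call rewrites the arguments in the
 * current frame and jumps back to the start of the function. */
static long tail_sum_reuse(long n, long acc)
{
    struct tail_frame *f = tail_frame_new(n, acc);
    long result;

    while (f->n != 0) {
        /* Evaluate the new arguments before overwriting the old ones. */
        long next_n = f->n - 1;
        long next_acc = f->acc + f->n;
        f->n = next_n;
        f->acc = next_acc;
    }
    result = f->acc;
    free(f);
    return result;
}
```

Both variants return the same value; the difference is that the reusing variant allocates a single frame no matter how deep the recursion would have been.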
Feedback and help welcome,
Arne