The attached sample code tries to profile the difference between accessing a native mapping and accessing it through a wrapper class that is declared inline as much as possible. When I run it, I find that the native mapping is about twice as fast as the inlined amapping class. What would it take to make an amapping class less than a factor of 1.2 slower than the native mapping?
Would being able to inherit from some kind of "_Builtin._mapping" class be of any use?
I ran the tests on Pike 8.1.
---------------------------------cut here--------------------
#!/usr/local/bin/pike

#define REPEAT 10000

class timeit {
  protected void create(string name, function(:void) fn) {
    int i, t1, t2;
    mixed e;
    t1 = gethrtime();
    e = catch {
      for (i = REPEAT; i > 0; i--)
        fn();
    };
    if (e)
      write("Skipped %s\n", name);
    else {
      t2 = gethrtime();
      write("%f %s\n", (t2 - t1) / (float)REPEAT, name);
    }
  }
}

class amapping {
  private mapping values;

  inline protected void create(mapping v) { values = v; }
  inline private mixed `[](string key) { return values[key]; }
  inline private mixed `->(string key) { return values[key]; }
  inline private mixed `[]=(string key, mixed v) { return values[key] = v; }
  inline private mixed `->=(string key, mixed v) { return values[key] = v; }
  inline private Iterator _get_iterator() { return get_iterator(values); }
  inline private array(string) _indices() { return indices(values); }

  private string _sprintf(int type, void|mapping flags) {
    string res = UNDEFINED;
    switch (type) {
      case 'O':
        res = sprintf("%O", values);
        int indent;
        if (flags && (indent = flags->indent))
          res = replace(res, "\n", "\n" + " " * indent);
        break;
    }
    return res;
  }
}

int main(int argc, array(string) argv) {
  mapping m = ([
    "abc":8, "acc":1, "adc":2, "aec":3, "afc":4, "agc":5, "ahc":6, "adc":2,
    "1dc":3, "3dc":2, "4dc":2, "5dc":8, "6dc":7, "7dc":2, "8dc":3, "rdc":2,
  ]);
  amapping am = amapping(m);

  timeit("mapping", lambda() {
    int total = 0;
    foreach (indices(m);; string id)
      total += m[id];
    return total;
  });

  timeit("amapping", lambda() {
    int total = 0;
    foreach (indices(m);; string id)
      total += am[id];
    return total;
  });

  return 0;
}
---------------------------------cut here--------------------
The main challenge with making the class version fast is that you pay the additional function call overhead for the call to `[](). I think inlining does not work unless you are inlining something from within the same class (or parent maybe). It would be nice if the compiler generated a fast path which assumes that your loop always calls `[]() in your class 'amapping'. In that case it would be possible to optimize the function call in a way similar to what the 'fast_call' branch does with map(), etc.
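To get a feel for how much of the gap is plain call overhead, a rough sketch along the following lines might help. It times a direct mapping index against the same lookup routed through an ordinary function. The `lookup` helper, the two-entry mapping and the loop count are made up for illustration and are not part of the original test. The absolute numbers will depend on the machine and the Pike version.

```pike
#!/usr/local/bin/pike

#define REPEAT 100000

// Small illustrative mapping; the contents are arbitrary.
mapping m = ([ "abc": 8, "acc": 1 ]);

// Hypothetical helper standing in for the extra call that `[]() adds.
int lookup(string key) { return m[key]; }

int main()
{
  int t0, t1, t2, total;

  t0 = gethrtime();
  for (int i = 0; i < REPEAT; i++)
    total += m["abc"];          // direct index, no function call
  t1 = gethrtime();
  for (int i = 0; i < REPEAT; i++)
    total += lookup("abc");     // same lookup through one function call
  t2 = gethrtime();

  // Average time per iteration, in the units gethrtime() returns.
  write("%f direct index\n", (t1 - t0) / (float)REPEAT);
  write("%f via function call\n", (t2 - t1) / (float)REPEAT);
  return 0;
}
```

If the second figure is roughly double the first, most of the slowdown seen with amapping is probably the call itself rather than anything specific to `[]().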
el @ Pike developers forum wrote:
> the additional function call overhead for the call to `[](). I think inlining does not work unless you are inlining something from within the same class (or parent maybe). It would be nice if the compiler
Can anyone confirm this? When a function is declared inline, under what conditions does it actually get inlined?
pike-devel@lists.lysator.liu.se