We have pushed an experimental branch that uses a new block allocator we have been developing. It uses a different algorithm than the one we briefly mentioned at the conference: a hashtable for page lookup instead of radix trees.
The branch can be found under arne/block_alloc; a GJAlloc bundle is available at http://fortranland.com/GJAlloc-0.08.tar.gz or via git from https://github.com/arneg/GJAlloc. The branch currently contains a lot of obsolete history, which we would squash into a single commit in case of inclusion.
Standalone performance comparisons are available at https://github.com/arneg/GJAlloc/wiki/Performance-comparison. The speedup is quite noticeable when compiling with --cleanup-on-exit; otherwise, for most use cases the difference should be unnoticeable. However, when many allocations stay around and are not freed in allocation order, the speedup can be dramatic. A rather extreme example:
time pike -e 'array a = map(allocate(1000000, 5), random_string); for (int i = 0; i < 10000000; i++) { int k = random(sizeof(a)); a[k] = random_string(5);}'
It would be interesting to see real-world benchmarks, in case anyone feels like running some ;)
We would naturally like to incorporate this into Pike (7.9) and hope for a discussion about the allocator itself and how best to include it. Currently, GJAlloc is requested as a bundle by the CritBit module, which uses its API directly. All other modules use macro definitions compatible with the ones from block_allocator.h.
best
arne & tobi