We pushed an experimental branch which uses a new block allocator we have been developing. This one uses a different algorithm than the one we briefly mentioned at the conference. The new one uses a hashtable for page lookup instead of radix trees.
The branch can be found under arne/block_alloc; a GJAlloc bundle is available at http://fortranland.com/GJAlloc-0.08.tar.gz or from https://github.com/arneg/GJAlloc via git. The branch currently contains a lot of obsolete history, which we would squash into one commit in case of inclusion.
Standalone performance comparisons are available at https://github.com/arneg/GJAlloc/wiki/Performance-comparison. The speedup is quite noticeable when compiling with --cleanup-on-exit; otherwise, for most use cases, the differences should be unnoticeable. However, when many things are kept around that do not get freed linearly, the speedup can be dramatic. A rather extreme example is
time pike -e 'array a = map(allocate(1000000, 5), random_string); for (int i = 0; i < 10000000; i++) { int k = random(sizeof(a)); a[k] = random_string(5);}'
It would be interesting to see real world benchmarks, in case anyone feels like it ,)
We would naturally like to incorporate this into Pike (7.9) and hope for a discussion about the allocator itself and how best to include it. Currently, GJAlloc is requested as a bundle by the CritBit module, which uses its API directly. All other modules use macro definitions which are compatible with the ones from block_allocator.h.
best
arne & tobi
Arne Goedeke wrote:
> have been developing. This one uses a different algorithm than the one we briefly mentioned at the conference. The new one uses a hashtable for page lookup instead of radix trees.

The benchmarks are impressive. However, using a hashtable always raises the question: what is the worst case for this implementation, and when does it happen?
On Tue, 3 Apr 2012, Stephen R. van den Berg wrote:
> Arne Goedeke wrote:
>> have been developing. This one uses a different algorithm than the one we briefly mentioned at the conference. The new one uses a hashtable for page lookup instead of radix trees.

> The benchmarks are impressive. However, using a hashtable always raises the question: what is the worst case for this implementation, and when does it happen?
The worst case will be roughly the same as the lookup in the current implementation. Pointers to the pages are stored in the hashtable, so if you are very unlucky and all page pointers end up in the same bucket, you will be doing the same linear search of a linked list that is currently the lookup algorithm in Pike's block_alloc.h. I never got around to trying to somehow include the radix tree lookup again; there, of course, the worst case would be much nicer.
-- Stephen.
"Sleep: A completely inadequate substitute for caffeine."
> We pushed an experimental branch which uses a new block allocator we have been developing. /.../
Very interesting. I've barely looked at it, but questions are jumping out:
I guess you already told us at the conference, but what was your motivation for creating it?
Is std::allocator the same thing as the (g)libc malloc in your graphs?
I take it you already got it working as a replacement for the old BLOCK_ALLOC, including that quirky heap walker that the old one sports? Do you have any test where they are compared?
> /.../ Currently, GJAlloc is requested as a bundle by the CritBit module, which uses its API directly.
Does it add any features?
> All other modules use macro definitions which are compatible with the ones from block_allocator.h.
You mean the old block_alloc.h?
On Mon, 2 Apr 2012, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
> I take it you already got it working as a replacement for the old BLOCK_ALLOC, including that quirky heap walker that the old one sports? Do you have any test where they are compared?
The heap walkers basically work the same way; the order in which the pages are walked is different. Otherwise, they both use magic values to distinguish free from used blocks. For that, there is a new macro called WALK_NONFREEB_BLOCKS for both the old and the new allocator.
>> /.../ Currently, GJAlloc is requested as a bundle by the CritBit module, which uses its API directly.

> Does it add any features?
Not really. The reason we started this in the first place was that the current block allocator was actually slower than malloc for critbit trees in our random tests.
>> All other modules use macro definitions which are compatible with the ones from block_allocator.h.

> You mean the old block_alloc.h?
Yes, right.
On Mon, 2 Apr 2012, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
> Does it add any features?
Well, yes. Actually, there is one thing. Since it's not all macros, it would be possible to have several allocators around just for one task and free everything in one go at the end. Something similar to what the gc is doing right now, although there it's only one gc at a time, anyway ,) And then, it might be even handier to have one which never frees, except right at the end when all pages are discarded.
pike-devel@lists.lysator.liu.se