On Mon, 16 Apr 2012, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
There are updated graphs at https://github.com/arneg/GJAlloc/wiki/Performance-comparison. It turns out I had misattributed benefits to std::allocator that it does not have.
Nice, thank you. Afaik there are no near-term 7.9 release plans, so I'm for letting this allocator go in, provided there's a configure option to switch to the old one.
I'd rather have it directly in the source than as a bundle, but I guess it's up to you really. (I tried to symlink the bundle dir to your upstream git, but I couldn't get that to work very well - clearly the bundle system isn't made with that in mind. But that's fixable, of course.)
When we first wrote it, we did this within the Pike tree. However, after our benchmarks did not quite match the results we got when directly comparing the old allocator to the new one in micro-benchmarks (e.g. test.cpp in the GJAlloc source), we decided to move it into a separate library, since we were not sure whether it would ever be included. The completely inlinable macros the old allocator uses seem quite hard to beat. I see no problem, however, with merging it back into the tree. It might still be useful as a separate package, but that's another matter.
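To illustrate why those inlinable macros are hard to beat, here is a loose sketch (not Pike's actual macros; the names POOL_DECL, POOL_ALLOC, and POOL_FREE are made up for this example) of a macro-based block allocator whose fast path compiles down to a handful of inlined instructions, with no function call at all:

```c
#include <assert.h>
#include <stddef.h>

/* Each allocated type gets a struct with a `next' field for the free list. */
typedef struct thing { struct thing *next; int v; } thing;

/* Declare a fixed pool, a free list, and a bump cursor for `type'.
 * (A real allocator would grow the pool; this sketch keeps it fixed.) */
#define POOL_DECL(type, n)              \
  static type pool_##type[n];           \
  static type *freelist_##type = NULL;  \
  static size_t used_##type = 0;

/* Allocation: pop the free list, or bump the cursor -- all inlined. */
#define POOL_ALLOC(type, out)                   \
  do {                                          \
    if (freelist_##type) {                      \
      (out) = freelist_##type;                  \
      freelist_##type = freelist_##type->next;  \
    } else {                                    \
      (out) = &pool_##type[used_##type++];      \
    }                                           \
  } while (0)

/* Freeing: push the node back onto the free list. */
#define POOL_FREE(type, p) \
  do { (p)->next = freelist_##type; freelist_##type = (p); } while (0)

POOL_DECL(thing, 64)
```

Compared with a call into a library allocator, the compiler can fold the free-list test and the pointer arithmetic straight into the caller, which is exactly the overhead a function-call-based allocator has to win back elsewhere.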
And my bad again. Currently the code does not do this; I think that was for comparability between the two block allocators as used by CritBits.
Do you have any plans for switching to object-local storage? I'm not sure a single shared pool would do that well if there are many objects: I suspect it'd be bad for locality, and if we ever go properly multi-CPU it'd produce a lot of false sharing in CPU caches.
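The object-local idea can be sketched roughly like this (hypothetical names, not the GJAlloc API): each container owns its own slab and free list, so one object's nodes stay contiguous (locality), and threads working on different objects never write to the same cache lines (no false sharing).

```c
#include <assert.h>
#include <stdlib.h>

struct node { struct node *next; int value; };

/* One pool per container object, instead of a single shared pool. */
struct pool {
  struct node *free_list;  /* nodes returned to this pool */
  struct node *slab;       /* one contiguous slab per pool */
  size_t used, capacity;
};

static void pool_init(struct pool *p, size_t capacity) {
  p->slab = malloc(capacity * sizeof(struct node));
  p->free_list = NULL;
  p->used = 0;
  p->capacity = capacity;
}

static struct node *pool_alloc(struct pool *p) {
  if (p->free_list) {             /* reuse a freed node first */
    struct node *n = p->free_list;
    p->free_list = n->next;
    return n;
  }
  if (p->used < p->capacity)      /* otherwise carve from the slab */
    return &p->slab[p->used++];
  return NULL;                    /* a real allocator would grow here */
}

static void pool_free(struct pool *p, struct node *n) {
  n->next = p->free_list;
  p->free_list = n;
}
```

The trade-off mentioned below applies directly: every pool pays its own slab, so with many small objects the per-object memory overhead can dominate.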
It would be worthwhile to test the overhead when only a few blocks are used. The allocator tries to make initialization as cheap as possible, so it might not even hurt performance much. On the other hand, the memory overhead could be quite high.
We had some plans to make a thread-safe allocator out of it, allocating blocks from thread-local pages, etc. However, it never got any further than some thinking.
When I implemented the new multisets based on red-black trees, node memory management was actually the most time-consuming part. My implementation separates the nodes for each multiset, and moves them to be able to resize the memory blocks and to compact them to limit internal fragmentation. It always allocates a single block for the whole multiset, regardless of how large it is. It'd probably be better to use multiple blocks when it fills more than one page, but it could still be important to support moving nodes to be able to compact smaller trees.
Note that handling moving nodes has a big effect on the iterators. They need to address nodes by index rather than by direct pointer if they are to handle simultaneous changes in the tree. For multisets there's a "semi-frozen" state while there are active iterators: since iterators address by index, the memory block as a whole may move (and thus be enlarged), but the nodes inside it may not, so compaction is postponed until all iterators are gone. (There's a comment near the top of multiset.h that goes into detail on this.)
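The index-versus-pointer point can be shown in a few lines (an illustrative sketch, not the actual multiset code): if the node block is reallocated and moves, an iterator holding an index still resolves to the right node, whereas a raw pointer would dangle.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a container whose nodes live in one resizable block. */
struct tree {
  int *nodes;     /* the node block; may move on resize */
  size_t size;
};

/* The iterator stores an index, never a pointer into the block. */
struct iterator { size_t index; };

/* Resolve the iterator against the block's *current* address. */
static int *iter_node(struct tree *t, struct iterator *it) {
  return &t->nodes[it->index];
}

/* Growing may relocate the whole block; node indices are unchanged. */
static void tree_grow(struct tree *t, size_t new_size) {
  t->nodes = realloc(t->nodes, new_size * sizeof(int));
  t->size = new_size;
}
```

This is exactly what makes the "semi-frozen" state workable: enlarging (moving the block) is safe under active iterators, while moving individual nodes within the block is not.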
So my point is just that handling object-local storage of the nodes could have quite a big effect on the code.
Interesting, I'll have a read there. What we did in the CritBit implementation is much simpler than that: any modification of the tree while iterators are active forces each iterator to make one extra lookup the next time it is used. That can potentially hurt performance quite a lot, though.
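One common way to get that behaviour, sketched here with hypothetical names (this is the general technique, not the actual CritBit code), is a modification counter: the container bumps it on every change, and an iterator that sees a counter different from the one it remembers re-finds its position by key instead of trusting its cached node.

```c
#include <assert.h>

struct container {
  unsigned generation;   /* bumped on every insert/delete */
  /* ... tree data ... */
};

struct iter {
  unsigned seen_generation;  /* counter value when position was cached */
  int key;                   /* enough information to re-find the position */
};

static void container_modify(struct container *c) {
  c->generation++;           /* invalidates all cached iterator positions */
}

/* Returns 1 if the iterator had to fall back to a fresh lookup. */
static int iter_step(struct container *c, struct iter *it) {
  if (it->seen_generation != c->generation) {
    /* ... look up it->key in the tree again ... */
    it->seen_generation = c->generation;
    return 1;
  }
  return 0;                  /* cached position still valid */
}
```

The cost profile matches what's described above: iteration over an unmodified tree stays cheap, but every modification taxes each live iterator with one extra lookup.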
There are other places where object-local storage might help. Mapping rehashing, for instance, could be made much simpler now that we have power-of-2 hashtables (when the table grows, bucket n gets split into buckets n and n + old_size). However, I don't know of any easy way to make that work properly with iterators.
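The split property follows from the masking: with a power-of-2 table, bucket selection is h & (size - 1), so doubling the table adds one mask bit, and an entry from old bucket n lands in either n or n + old_size. A minimal sketch:

```c
#include <assert.h>

/* Power-of-2 bucket selection: size must be a power of 2, so the
 * modulo reduces to a mask. */
static unsigned bucket_of(unsigned h, unsigned size) {
  return h & (size - 1);
}
```

Because only those two destinations are possible, bucket n can be split on its own when the table grows, without touching any other bucket, which is what makes the incremental rehash simple.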