/.../ We should focus on improving the hash functions in general.
True. Grubba's way of getting rid of periodicities in the pointers sounds reasonable.
/.../ Then whenever it seems necessary to grow the hash table, quickly sum up all the live counters and you'll have a fairly accurate population count for the whole table. If the ratio of live entries to table size is below a certain threshold, don't grow the table.
Yes, that ought to work. The cost is that if the reprobe limit is busted regularly, all threads start summing all those counters more or less every time, which leads to frequent cache line syncing on the counters. But then again, if the hash function is bad, things will get bad anyway. Maybe one could also consider ways to dynamically raise the reprobe limit.
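Roughly what I picture the counting to look like, as a C sketch with made-up names (none of this exists anywhere yet): each thread owns a live-entry counter on its own cache line, and the counters are only summed when growing actually seems necessary, so the cross-thread cache traffic stays occasional:

  #include <stdatomic.h>
  #include <stddef.h>

  #define MAX_THREADS 64

  struct ht_stats {
    struct {
      _Alignas(64) atomic_long live;   /* one counter per cache line */
    } per_thread[MAX_THREADS];
  };

  /* Fairly accurate population count; only computed when the reprobe
   * limit has been hit, so reads of foreign cache lines are rare. */
  static long approx_population(struct ht_stats *st)
  {
    long sum = 0;
    for (size_t i = 0; i < MAX_THREADS; i++)
      sum += atomic_load_explicit(&st->per_thread[i].live,
                                  memory_order_relaxed);
    return sum;
  }

  /* Grow only if the table is genuinely getting full; otherwise the
   * reprobe failures are more likely clustering from a bad hash
   * function, and raising the reprobe limit might be the better move. */
  static int should_grow(struct ht_stats *st, size_t table_size)
  {
    return approx_population(st) > (long)(table_size * 3 / 4);  /* example threshold */
  }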
We'd need a somewhat accurate sizeof() anyway if this is to replace the standard mappings(*), and to implement shrinking.
Speaking of counters, that brings me to another issue: Mr Click very conveniently avoids the problem of freeing the old table after a resize, since he leaves it to the Java GC. It's not that simple for us. I guess we'll have to keep similar concurrent access counters for the references to the hash table blocks. Even so, it's still not trivial to free the old table in a safe way.
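To illustrate what I mean (deliberately naive, all names made up, and with the counters stuffed into the block itself just for brevity): threads bump a per-thread access counter around any use of a table block, and whoever resized waits for the summed count on the old block to reach zero before freeing it:

  #include <stdatomic.h>
  #include <stdlib.h>

  #define MAX_THREADS 64

  struct ht_block {
    atomic_long access[MAX_THREADS];   /* per-thread "currently using" counts */
    size_t size;
    /* ... bucket array follows ... */
  };

  static void free_old_block(struct ht_block *old)
  {
    for (;;) {
      long sum = 0;
      for (int t = 0; t < MAX_THREADS; t++)
        sum += atomic_load(&old->access[t]);
      if (sum == 0)
        break;
      /* Busy-waiting is only for illustration; a real version would put
       * the block on a to-be-freed list and retry later. */
    }

    /* This is still not safe on its own: a thread can have loaded the
     * pointer to 'old' but not yet incremented its access counter, so a
     * zero sum doesn't prove nobody is about to enter the block.
     * Closing that window is the non-trivial part. */
    free(old);
  }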
That concurrent access counter is also something that will see use in many places (although the general refcounting will probably be disabled for most shared data), so it's something that should be implemented with care in its own right. I'm thinking something like this:
An array of counters is allocated for every thread, and there is a function that allocates a counter by returning a free index into those arrays. That way every thread has its own counter array in its local cache in r/w mode, while the other threads only need (more or less) occasional read access. The allocation and freeing of counters should also be lock-free, although that isn't strictly necessary.
That should work, I think. Another piece of wisdom from Mr Click is that one should never put concurrent access counters in the same memory blocks as the data they govern. I don't think we can afford to follow that advice for all refcounters, though.
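As a rough sketch of what I picture (hypothetical names, nothing of this is implemented): counter "handles" are just indices into the per-thread arrays, handed out lock-free over a shared bitmap; each thread only ever writes its own row, and a total is taken by summing a column. The arrays also live in their own allocation, away from the data the counters govern, which at least follows that advice for these counters:

  #include <stdatomic.h>
  #include <stddef.h>

  #define MAX_THREADS   64
  #define MAX_COUNTERS  1024

  /* One counter array per thread; only that thread writes to its row. */
  static atomic_long counters[MAX_THREADS][MAX_COUNTERS];

  /* Shared bitmap of which counter indices are in use. */
  static atomic_ulong counter_used[MAX_COUNTERS / (8 * sizeof(unsigned long))];

  /* Lock-free allocation of a counter index: find a clear bit and claim
   * it with a compare-and-swap. Returns -1 if none are free.
   * (__builtin_ctzl is a GCC/Clang builtin.) */
  int counter_alloc(void)
  {
    for (size_t w = 0; w < sizeof(counter_used) / sizeof(counter_used[0]); w++) {
      unsigned long word = atomic_load(&counter_used[w]);
      while (~word) {
        int bit = __builtin_ctzl(~word);
        if (atomic_compare_exchange_weak(&counter_used[w], &word,
                                         word | (1UL << bit)))
          return (int)(w * 8 * sizeof(unsigned long) + bit);
        /* word was reloaded by the failed CAS; retry within this word. */
      }
    }
    return -1;
  }

  void counter_free(int idx)
  {
    size_t w = (size_t)idx / (8 * sizeof(unsigned long));
    atomic_fetch_and(&counter_used[w],
                     ~(1UL << (idx % (8 * sizeof(unsigned long)))));
  }

  /* Each thread only touches its own row, so updates stay cheap. */
  void counter_add(int my_thread, int idx, long delta)
  {
    atomic_fetch_add_explicit(&counters[my_thread][idx], delta,
                              memory_order_relaxed);
  }

  /* Occasional read access from any thread: sum the column. */
  long counter_sum(int idx)
  {
    long sum = 0;
    for (int t = 0; t < MAX_THREADS; t++)
      sum += atomic_load_explicit(&counters[t][idx], memory_order_relaxed);
    return sum;
  }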
*) Something I actually consider somewhat optional - the plan is first and foremost to implement an efficient concurrent hash table class that can be used instead of the standard mapping in places where it is necessary. If the standard mappings can be made lock-free too, then all the better, but that depends on whether their semantics can be kept. The most important hash tables are actually the string hash table and the various other internal hash tables that have to allow concurrent use.