There will still be a lot of locking operations,
The intention is that the locking strategy would keep that at the same level as the current interpreter lock. Have I missed something?
and there will be a *lot* of places where the local/global flag would have to be checked.
Everywhere a memory object is changed or freed, but not where it is only read. Trivial per-object locking would have to lock for both operations, which I think is a big difference. It's hard to estimate the read-to-write ratio (not counting the stack, which is thread local and therefore unlocked in any case), but I'd guess it's at least around 10:1.
In fact, I'm not entirely sure that it's much faster than having one mutex per object, because locking an unlocked mutex takes very little time. (Similar to checking a flag.)
I don't know what you mean by "similar", but it should be at least four times as expensive; it's a read-and-write operation and it's necessary to do an unlock operation afterwards. Furthermore, the lock is atomic, which means cache syncing etc. that ought to be a fair bit more expensive (but only on SMP systems) - the flag check need not be atomic.
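To make the comparison concrete, here is a minimal C sketch of the scheme as I understand it. The struct layout and all names (`struct obj`, `is_global`, `obj_write`, `locks_taken`) are made up for illustration and are not taken from the Pike source. It assumes the flag is set by the owning thread before the object becomes reachable from other threads, so a plain non-atomic read of the flag is enough.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical object header; not Pike's actual layout. */
struct obj {
    bool is_global;            /* plain flag, checked without atomics */
    pthread_mutex_t lock;      /* only taken once the object is global */
    int value;
    unsigned long locks_taken; /* instrumentation for this example only */
};

static void obj_init(struct obj *o, bool is_global)
{
    o->is_global = is_global;
    pthread_mutex_init(&o->lock, NULL);
    o->value = 0;
    o->locks_taken = 0;
}

/* Reads never lock under this scheme (assumption from the discussion). */
static int obj_read(const struct obj *o)
{
    return o->value;
}

/* Writes check the flag first and pay for lock + unlock only if global. */
static void obj_write(struct obj *o, int v)
{
    if (o->is_global) {
        pthread_mutex_lock(&o->lock);   /* atomic read-modify-write */
        o->locks_taken++;
        o->value = v;
        pthread_mutex_unlock(&o->lock); /* second atomic operation */
    } else {
        o->value = v;                   /* thread local: no locking at all */
    }
}
```

With trivial per-object locking, `obj_read` would need the same lock/unlock pair as the global branch of `obj_write`, which is where the read-to-write ratio comes in.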
The necessary atomic refcounting on global data is a similar expense, though. Hmm, maybe one could relax the refcount garbing on global data and let it become garbage instead. Assuming that most short-lived objects are thread local, it might not produce that much more garbage after all. It's an interesting thought - the refcounting is also fairly expensive in itself. Pity there's no way of testing it without actually implementing it.
Another problem with that is that it'd be a semantic change, since global stuff would stay around longer. There's a lot of Pike code that relies on timely refcount garbing. :\
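The refcounting cost mentioned above could look roughly like this in C11. Again, this is only a sketch under assumptions: `struct rc_obj`, `rc_add` and `rc_sub` are hypothetical names, and the thread-local path assumes that only the owning thread ever touches the counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted header; not Pike's actual layout. */
struct rc_obj {
    bool is_global;   /* plain flag, set before the object is shared */
    atomic_long refs;
};

static void rc_init(struct rc_obj *o, bool is_global)
{
    o->is_global = is_global;
    atomic_init(&o->refs, 1);
}

static void rc_add(struct rc_obj *o)
{
    if (o->is_global) {
        /* Global: a real atomic read-modify-write. */
        atomic_fetch_add(&o->refs, 1);
    } else {
        /* Thread local: separate relaxed load and store, no atomic RMW. */
        long n = atomic_load_explicit(&o->refs, memory_order_relaxed);
        atomic_store_explicit(&o->refs, n + 1, memory_order_relaxed);
    }
}

/* Returns true when the last reference was dropped. */
static bool rc_sub(struct rc_obj *o)
{
    if (o->is_global)
        return atomic_fetch_sub(&o->refs, 1) == 1;

    long n = atomic_load_explicit(&o->refs, memory_order_relaxed) - 1;
    atomic_store_explicit(&o->refs, n, memory_order_relaxed);
    return n == 0;
}
```

Relaxing the refcount garbing on global data would amount to dropping the global branches here and leaving those objects to gc(), which is exactly what would delay their destruction.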
/ Martin Stjernholm, Roxen IS
Previous text:
2004-02-03 07:22: Subject: Re: Default backend and thread backends?
It's a good plan, and I think it has a good chance of working. I would combine it with an addition to gc() that automatically restores the thread-local flag on things which are no longer global.
However, I'm still not entirely sure that this will be fast enough to be worth it. There will still be a lot of locking operations, and there will be a *lot* of places where the local/global flag would have to be checked.
In fact, I'm not entirely sure that it's much faster than having one mutex per object, because locking an unlocked mutex takes very little time. (Similar to checking a flag.)
/ Fredrik (Naranek) Hubinette (Real Build Master)