And speaking of Concurrent, since a MutexKey is destructed as soon as it goes out of scope, I think many of the key=0 assignments can be removed from the code.
I've not looked at this specific case, but you have to be careful with tail calls and mutex keys: tail call optimization can clear the key, and so release the lock, before the tail call is made.
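
To make the concern concrete, here is a minimal sketch, assuming the code under discussion is Pike and uses Thread.Mutex. The names mtx, counter, update(), bump(), bump_tail() and bump_safe() are made up for illustration and are not taken from the Concurrent module. Whether a given return is actually compiled as a tail call depends on the compiler, so treat bump_tail() as a pattern to review rather than a confirmed bug.

```pike
// Hypothetical example, not code from the Concurrent module.
Thread.Mutex mtx = Thread.Mutex();
int counter;

// Assumed to require that the caller holds mtx.
int update()
{
  return ++counter;
}

// Plain case: key is destructed when it goes out of scope,
// so a trailing key = 0 adds nothing here.
void bump()
{
  Thread.MutexKey key = mtx->lock();
  counter++;
}

// Pattern to review: if this return is optimized into a tail call,
// the locals (including key) may be cleared before update() runs,
// which would release the lock too early.
int bump_tail()
{
  Thread.MutexKey key = mtx->lock();
  return update();
}

// Conservative form: the call is no longer in tail position, and the
// explicit key = 0 after it documents where the lock is released.
int bump_safe()
{
  Thread.MutexKey key = mtx->lock();
  int res = update();
  key = 0;
  return res;
}

int main()
{
  bump();
  bump_tail();
  write("%d\n", bump_safe());
  return 0;
}
```

So a key=0 that directly follows the last call in a function is worth a second look before removing it, since removing it may move that call into tail position.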