Tobias S. Josefowitz @ Pike developers forum wrote:
>> Am I reading those correctly that both are upon thread creation?
> First, nice backtrace. I think it's more that any thread started ever will have been started by thread creation, and that's what we're seeing here, not that this necessarily immediately follows thread creation.
That was the other explanation, but since I don't know enough about the innards of Pike in this respect, I wasn't quite sure.
> #ifdef PIKE_USE_MACHINE_CODE
>   call_check_threads_etc();
> #endif
>
> I assume you have machine code enabled. call_check_threads_etc()
Normally I do. However, to get to the bottom of this, I temporarily compiled with:

  gcc-9 -g -O1 -pipe -DPIKE_DEBUG=1
  --with-cdebug --with-rtldebug --with-valgrind
  --with-double-precision --with-long-int \
  --disable-noopty-retry \
  --without-machine-code \
  --with-poll \
  --with-portable-bytecode \
> indeed may schedule other threads and stuff and we may return from it with the object we just looked up the identifier in destructed. When we then call the identifier, we will indeed call into a destructed object.
I already had the distinct impression this was happening. In the pgsql driver I've had numerous issues over the years where I had to cover for methods running in destructed objects (because the driver is asynchronous to the bone). In most cases this did not result in segfaults, though, so I'm not sure whether the rest of the Pike system is naturally more robust against this. I do have one unexplained segfault there from about six months ago, but it was so hard to reproduce that it may well have been the same problem I'm chasing now.
> Now, what to do about it... indeed check at every function entry that we're not destructed? I don't know, but that just doesn't feel cool.
Actually, it seems Pike already does this check before *every* access of a local variable: in those numerous cases I had to deal with in pgsql, I invariably got an exception like "lookup in destructed object". With that in mind, a check upon function entry does not sound so bad.
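
To make the window concrete, here is a toy model in C of "look up the identifier, yield, then call", with a re-check on entry. This is only a sketch under my own assumptions, not the interpreter's code: toy_object, toy_yield() and toy_call() are made-up names, toy_yield() stands in for call_check_threads_etc(), and a NULL prog pointer stands in for a destructed object.

```c
#include <stddef.h>

typedef int (*toy_fun)(int);

/* Hypothetical stand-ins: a "program" with one function, and an
 * object whose prog pointer is NULL once it has been destructed. */
struct toy_program { toy_fun fun; };
struct toy_object { const struct toy_program *prog; };

static int toy_twice(int x) { return 2 * x; }
static const struct toy_program toy_prog = { toy_twice };

/* Models the yield point: another thread may run here and
 * destruct the object before we get control back. */
static void toy_yield(struct toy_object *o, int destruct_now)
{
    if (destruct_now)
        o->prog = NULL;
}

/* Returns 0 and stores the result on success, -1 if the object
 * is (or has become) destructed. */
static int toy_call(struct toy_object *o, int arg, int destruct_now,
                    int *result)
{
    toy_fun fun;

    if (o->prog == NULL)
        return -1;
    fun = o->prog->fun;          /* identifier lookup */

    toy_yield(o, destruct_now);  /* other threads may run here */

    if (o->prog == NULL)         /* entry check: re-validate after the yield */
        return -1;

    *result = fun(arg);
    return 0;
}
```

Without the second check, fun() would be entered even though the object went away during the yield, which is the situation in the backtrace as I understand it. The cost of the check in this model is a single pointer comparison per call.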
P.S. Speaking of asynchronous destructs: one of the "fun" things I discovered about three months ago is that the actual destruct() method can be called from inside a totally random stack frame, long after the stack frame in which the object's scope ended. That makes it quite hazardous to acquire any kind of mutex from within a destruct() method, since it can lead to very random and very rare deadlocks.
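
One way this can bite, sketched in C with plain POSIX threads rather than Pike's Thread.Mutex (which may well behave differently on relocking): if the destructor happens to run inside a stack frame that already holds the lock it wants, a blocking lock would never return. All names here (toy_lock, toy_destruct_probe(), toy_unrelated_work()) are hypothetical, and the probe uses pthread_mutex_trylock() so that the sketch reports the problem instead of hanging.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical destructor body: probes whether it could take the lock.
 * Returns 0 if the lock was free (it is taken and released again),
 * or EBUSY if a blocking pthread_mutex_lock() would have waited. */
static int toy_destruct_probe(void)
{
    int rc = pthread_mutex_trylock(&toy_lock);
    if (rc == 0)
        pthread_mutex_unlock(&toy_lock);
    return rc;
}

/* Models unrelated code that holds the lock at the moment the runtime
 * decides to run a pending destructor inside its stack frame. */
static int toy_unrelated_work(void)
{
    int rc;

    pthread_mutex_lock(&toy_lock);
    rc = toy_destruct_probe();   /* destructor runs "here", lock is held */
    pthread_mutex_unlock(&toy_lock);
    return rc;
}
```

Called on its own, the probe succeeds; called from within toy_unrelated_work() it finds the lock taken by the very thread it is running on. Lock-order inversions between two threads are another route to the same kind of deadlock, and I don't know which of the two I was actually hitting.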