I’ve been thinking about adding support for named threads and would like to get some feedback on my proposed solution.
Named threads can be useful when, for example, debugging or printing a thread dump. There are varying levels of OS support for named threads, ranging from none at all (older versions of illumos, and Solaris before 11.3) to being able to name any thread at any time (glibc, some of the BSDs, etc).
I’d like to shoot for a level of functionality that should be possible for all cases:
- Ability to set a thread’s name at creation time
- Ability to get a thread’s name any time
- Add thread name to thread dump, if set
A brief survey of the API provided by other languages suggests that most seem to accept a thread name following the function to be run. Since the existing Pike thread API accepts varargs after the function, there’s no way to determine when a thread name has been provided. With this in mind, I suggest the opposite:
variant protected void create(string thread_name, function f, mixed ... args);
public string name();
Assuming `()() can have variants, a similar variant would be created in Thread.Thread as well.
On systems that don’t support naming threads, a parameter would be added either to the pike thread object itself or to the underlying OS thread’s thread specific data. In this situation, the thread name would be available through pike, but not in gdb, etc (but it wouldn’t have been anyway, so no big loss there).
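To make the fallback concrete, here is a minimal C sketch of keeping the name in a per-thread record when the OS offers no naming call. The struct and function names (named_thread, named_thread_init, and so on) are invented for illustration and are not Pike's actual thread state or API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread record; NOT Pike's real thread state struct. */
struct named_thread {
    char *name;   /* private heap copy of the name, or NULL if unnamed */
};

/* Store a private copy of `name` (which may be NULL for "no name").
 * Returns 0 on success, -1 if the allocation failed. */
int named_thread_init(struct named_thread *t, const char *name)
{
    assert(t != NULL);
    t->name = NULL;
    if (name == NULL)
        return 0;
    size_t len = strlen(name) + 1;      /* include the terminating NUL */
    t->name = malloc(len);
    if (t->name == NULL)
        return -1;
    memcpy(t->name, name, len);
    return 0;
}

/* Return the stored name, or NULL if none was set. */
const char *named_thread_name(const struct named_thread *t)
{
    assert(t != NULL);
    return t->name;
}

/* Release the stored copy; the record reads as unnamed afterwards. */
void named_thread_free(struct named_thread *t)
{
    assert(t != NULL);
    free(t->name);
    t->name = NULL;
}
```

Because the copy lives in the runtime's own record rather than in the kernel, it has no length limit and can be read from any thread, but external tools will not see it.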
Some questions:
Each OS has its own limit on the length of the thread name. What should the behavior be? Fail if the name is too long, truncate it silently, or pass the OS a name truncated to its maximum length but store the full name somewhere so Pike code can still see it? There seems to be some difference in behavior between platforms, with Solaris silently truncating and the others indicating errors.
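The options above could be sketched in C roughly as follows. This is only an illustration: fit_thread_name and the name_policy enum are hypothetical names, and the 16-byte buffer mentioned afterwards is the Linux limit (including the terminating NUL), used here as one example of an OS limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy switch for names that exceed the OS limit. */
enum name_policy { NAME_REJECT, NAME_TRUNCATE };

/* Copy `name` into `os_buf`, whose capacity `os_cap` counts the NUL.
 * Returns 0 if the name fit, 1 if it was truncated, -1 if rejected.
 * Truncation is byte-wise, so it may split a multi-byte character. */
int fit_thread_name(const char *name, char *os_buf, size_t os_cap,
                    enum name_policy policy)
{
    assert(name != NULL && os_buf != NULL && os_cap > 0);
    size_t len = strlen(name);
    if (len < os_cap) {                 /* fits, NUL included */
        memcpy(os_buf, name, len + 1);
        return 0;
    }
    if (policy == NAME_REJECT) {        /* "fail if too long" */
        os_buf[0] = '\0';
        return -1;
    }
    memcpy(os_buf, name, os_cap - 1);   /* "truncate" */
    os_buf[os_cap - 1] = '\0';
    return 1;
}
```

With a 16-byte buffer, "a-very-long-thread-name" is either rejected or cut to "a-very-long-thr". The third option is NAME_TRUNCATE plus the caller keeping the untruncated string on the Pike side, using the return value of 1 to know the two differ.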
Being able to set thread names after creation seems like a possibly useful feature. Darwin permits setting the thread name only on the current thread. So, if we wanted to allow setting the thread name outside of thread creation while keeping behavior consistent, we would have to impose that restriction on every platform. Is that a desirable behavior, and is such a limitation warranted? Or do we emulate the capability by inserting code somewhere that records that the name has been changed and, when the affected thread is about to run, changes the name at that time?
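The emulation idea might look something like the C sketch below: any thread records a pending name, and the target thread applies it to itself the next time it is about to run. All names here (thread_state, request_rename, apply_pending_name) are made up for illustration. The sketch assumes both functions are called while holding whatever lock protects the thread record, and record_name merely stands in for the platform's "name the current thread" call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAP 64   /* illustrative capacity, not an OS limit */

/* Hypothetical per-thread record holding a not-yet-applied name. */
struct thread_state {
    char pending[NAME_CAP];
    int  has_pending;
};

/* Signature of a "name the calling thread" primitive. */
typedef void (*set_current_name_fn)(const char *name);

/* Callable from any thread: only records the request. */
void request_rename(struct thread_state *t, const char *name)
{
    assert(t != NULL && name != NULL);
    strncpy(t->pending, name, NAME_CAP - 1);
    t->pending[NAME_CAP - 1] = '\0';
    t->has_pending = 1;
}

/* Called by the thread that owns `self`, just before it resumes.
 * Returns 1 if a pending name was applied, 0 if there was none. */
int apply_pending_name(struct thread_state *self, set_current_name_fn set_name)
{
    assert(self != NULL && set_name != NULL);
    if (!self->has_pending)
        return 0;
    set_name(self->pending);
    self->has_pending = 0;
    return 1;
}

/* Stand-in for the OS call: remembers the last name it was given. */
char last_applied_name[NAME_CAP];

void record_name(const char *name)
{
    strncpy(last_applied_name, name, NAME_CAP - 1);
    last_applied_name[NAME_CAP - 1] = '\0';
}
```

One consequence of this design is that a rename is not visible to OS-level tools until the target thread next runs, and if several renames are requested in between, only the last one is applied.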
What else have I missed?
As always, comments and suggestions are welcome!
Bill
H. William Welliver III wrote:
On systems that don’t support naming threads, a parameter would be added either to the pike thread object itself or to the underlying OS thread’s thread specific data. In this situation, the thread name would be available through pike, but not in gdb, etc (but it wouldn’t have been anyway, so no big loss there).
Your thread naming solution sounds good.
It is my understanding (even though I have never delved deep into the threading implementation of Pike myself), that most Pike code runs in the equivalent of a single OS thread. Yes, there is some parallelism, but in a very limited way during execution of "native" C-code.
So, it makes me wonder in what way there even is a one-to-one mapping of threads inside Pike and threads at the OS level?
My experience with threading in Pike is that there are at least as many OS-level threads as Pike threads, though there may be additional OS-level threads created outside of Pike itself (by libraries and such). Only one thread may be running code that operates on Pike data structures at any given time (coordinated by the Global Interpreter Lock). Running Pike code obviously falls into this category. A thread holding the lock may release it if it does not need to work with Pike data structures for an extended time, such as when running a database query. If an "external" thread needs to run Pike code, it needs to be registered with the runtime (so that the GIL can operate properly).
Bill
On Wed, 29 May 2019, Stephen R. van den Berg wrote:
H. William Welliver III wrote:
On systems that don’t support naming threads, a parameter would be added either to the pike thread object itself or to the underlying OS thread’s thread specific data. In this situation, the thread name would be available through pike, but not in gdb, etc (but it wouldn’t have been anyway, so no big loss there).
Your thread naming solution sounds good.
It is my understanding (even though I have never delved deep into the threading implementation of Pike myself), that most Pike code runs in the equivalent of a single OS thread. Yes, there is some parallelism, but in a very limited way during execution of "native" C-code.
So, it makes me wonder in what way there even is a one-to-one mapping of threads inside Pike and threads at the OS level? -- Stephen.
pike-devel@lists.lysator.liu.se