The lightweight process pool

Unix Insider –

Let's start by revisiting what we know.

We know that Solaris implements a two-layer threads model. The model defines thread abstractions at both the kernel and user layers, and user threads are created using thread library interfaces.

In the kernel, there are two entities abstracted: the kernel thread and the LWP. We think of kernel threads and LWPs as a single entity because every kernel thread that exists on behalf of a user process has a corresponding LWP, and vice versa. We know from past columns that the kernel thread/LWP combination is what's scheduled by the kernel on an available processor.

Lastly, we know that the scheduling of an unbound user thread requires that the user thread be linked to an available LWP. This happens at the threads library level, through well-defined preemption points in the library code.

This brings us to the subject at hand, the implementation of the pool of available LWPs for unbound threads. (Remember that by definition bound threads always have an available LWP.) If there are more runnable user threads in a process than available LWPs, user threads must wait until the pool of LWPs is replenished before they can be scheduled. Conversely, more LWPs than user threads require maintenance and consume resources (a kernel stack) without serving as an execution resource of threads. Thus, the system must continually find a balance between keeping enough LWPs available for runnable user threads and keeping the pool small enough so as not to waste kernel resources.

The original solution to this problem in Solaris was built on the signals model. Basically, if all the LWPs in the process were blocked, a

<font face="Courier">SIGWAITING</font>
signal was sent from the kernel to the threads library, where a
<font face="Courier">SIGWAITING</font>
signal handler would create a new LWP. This system works as long as user threads are issuing system calls and entering the kernel. If a user thread is compute bound in user code, it's possible for the runnable user threads to wait indefinite periods of time for an available LWP, as the kernel has no way of knowing if a particular process has runnable user threads and needs more LWPs.

The method of detection and replenishment of LWPs for the threads library went through some changes beginning in Solaris 2.6. A new subsystem built on doors was introduced. However, the original mechanism is still in place, and, under certain conditions,

<font face="Courier">SIGWAITING</font>
is still used. Let's take a closer look at the original implementation, and then look at the doors-based code.

A multithreaded program is any program that's been linked with the threads library at compile time. The program itself doesn't have to include any explicit calls to threads application programming interfaces (APIs); if the threads library is linked in, Solaris considers it to be a multithreaded program. For multithreaded programs, a couple of threads created on behalf of the threads library appear in the running process.

The following example shows a

<font face="Courier">dbx(1)</font>
session in a dummy program that was linked at compile time to
<font face="Courier">libpthread</font>
but doesn't make any threads API calls (in fact, it doesn't issue any system calls at all). The program, bd (big dummy), is shown in the
<font face="Courier">dbx</font>
list command below.

I added line numbers to the output to simplify subsequent annotations. Have a look.

<font face="Courier">
01 pae1> cc -g -o bd bd.c -lpthread
02 pae1> dbx bd
03 Reading bd
04 Reading
05 Reading
06 Reading
07 Reading
08 Reading
09 Reading
10 detected a multithreaded program
11 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) list 1,20 
12     1   #define _REENTRANT
13     2   
14     3   #include <stdio.h>
15     4   #include <stdlib.h>
16     5   #include <pthread.h>
17     6   
18     7   main(int argc, char *argv[])
19     8   {
20     9           int i= 0x7fffffff;
21    10   
22    11           while(i) {
23    12                   --i;
24    13                   if (i < 2)
25    14                           i=0x7fffffff;
26    15           }
27    16   }
28 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) stop at 13
29 (2) stop at "bd.c":13
30 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) run       
31 Running: bd 
32 (process id 1813)
33 t@1 (l@1) stopped in main at line 13 in file "bd.c"
34     13                   if (i < 2)
35 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) threads
36 *>    t@1  a l@1  ?()   breakpoint              in main()
37       t@2  b l@2  ?()   running                 in __signotifywait()
38       t@3  b l@3  ?()   running                 in __lwp_sema_wait()
39       t@4         ?()   sleep on (unknown)      in _reap_wait()
40 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) lwps  
41 *>l@1 breakpoint       in main()
42   l@2 running          in __signotifywait()
43   l@3 running          in __lwp_sema_wait()
44   l@4 running          in __door_return()
45 (/opt/SUNWspro/bin/../WS6/bin/sparcv9/dbx) 
</font>

The sample

<font face="Courier">bd</font>
program is compiled and started in
<font face="Courier">dbx</font>
control (lines 01 and 02), with the source dumped inside
<font face="Courier">dbx</font>
(lines 11 through 27). With a breakpoint set (line 28), the program is run (line 30) and the threads and LWPs are dumped (lines 35 through 44). Note again that this sample program was linked to
<font face="Courier">libpthread</font>
at compile time. As such, when it runs, it runs as any multithreaded program would run, even though the
<font face="Courier">bd</font>
program doesn't make any explicit thread API calls.

We can see that there are four threads in the process's context when we run

<font face="Courier">bd</font>
. Thread 1 (
<font face="Courier">t@1</font>
) is the main thread -- that is, it's the thread running the main program function. Threads 2, 3, and 4 are all blocked in a library function (lines 37, 38, and 39). Each thread has a corresponding LWP, as we see in lines 41 through 44. Threads created by the threads library on behalf of a process are bound threads, and because the test program didn't do a
<font face="Courier">pthread_create(3THR)</font>
call, we know that all the threads we're seeing in the
<font face="Courier">bd</font>
program are threads created by the library. Finally, we know all the threads are bound, because there's an LWP (lines 40 through 44) for every user thread (lines 35 through 39).

Let's turn our attention to the thread running in the

<font face="Courier">__signotifywait()</font>
routine (
<font face="Courier">t@2</font>
, line 37), and its corresponding LWP, also shown running in
<font face="Courier">__signotifywait()</font>
(
<font face="Courier">l@2</font>
, line 42). This is the
<font face="Courier">dynamiclwps()</font>
thread that gets created by the library when program execution begins.
<font face="Courier">dynamiclwps()</font>
sits in a tight loop, blocking in
<font face="Courier">signotifywait()</font>
and waiting for a
<font face="Courier">SIGWAITING</font>
signal. When a
<font face="Courier">SIGWAITING</font>
signal is received, the code will create a new LWP for the process.
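
To make the shape of that loop concrete, here's a minimal sketch in portable C. It is not the library's code: SIGUSR1 stands in for the Solaris-specific SIGWAITING, and the lwp_pool structure and both function names are hypothetical. "Creating an LWP" is reduced to bumping a counter.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Assumption: SIGUSR1 stands in for SIGWAITING, which only exists on Solaris. */
#define POOL_SIGNAL SIGUSR1

/* Hypothetical bookkeeping; the real library would create an LWP instead. */
struct lwp_pool {
    int nlwps;
};

/* Block POOL_SIGNAL so it is only consumed synchronously by sigwait(). */
static int pool_signal_block(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, POOL_SIGNAL);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* One pass of the loop: sleep until the signal arrives, then grow the pool. */
static int pool_wait_and_grow(struct lwp_pool *pool)
{
    sigset_t set;
    int signo = 0;
    int rc;

    assert(pool != NULL);
    sigemptyset(&set);
    sigaddset(&set, POOL_SIGNAL);

    rc = sigwait(&set, &signo);
    if (rc != 0)
        return rc;          /* sigwait() reports failure as an error number */
    if (signo == POOL_SIGNAL)
        pool->nlwps++;      /* placeholder for "create a new LWP" */
    return 0;
}
```

A dedicated thread would call pool_signal_block() once and then call pool_wait_and_grow() forever, which mirrors the block-then-create pattern described above.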

There's a bit more to it than that. The library maintains counters for the number of runnable threads on the queue, the number of idle threads (idle threads are used to park idle LWPs that aren't currently linked to a thread), and the number of aging threads (aging is the mechanism used to time out and clean up unused LWPs). A new LWP is only created if there are runnable threads and a minimum of idle or aging LWPs.

The other side of this coin is the source of the

<font face="Courier">SIGWAITING</font>
signal. We know now what the library does upon receipt of a
<font face="Courier">SIGWAITING</font>
signal (create a new LWP), and why it exists (to replenish the LWP pool). But how does the
<font face="Courier">SIGWAITING</font>
signal get sent?

It's really quite simple. Given our goal of keeping LWPs available for the user threads so they have an execution resource, it seems intuitive that the kernel code that puts LWPs to sleep would be a reasonable place to check to see if we need more. This is basically what happens. When an LWP is placed on a sleep queue, one of several condition variable routines is used. Condition variables are synchronization primitives that allow an LWP to block waiting for a specific condition (e.g., an I/O completion; see my July 1999 article for an explanation of the kernel sleep/wakeup mechanism). When any one of several kernel condition variable routines is called to put an LWP to sleep, a

<font face="Courier">SIGWAITING</font>
signal may get sent from the kernel to the process, resulting in the
<font face="Courier">dynamiclwps()</font>
thread creating a new LWP for the process.

The operative word in the previous sentence is may. In older versions of Solaris (everything prior to Solaris 2.6), the

<font face="Courier">SIGWAITING</font>
mechanism was used exclusively, and the kernel always generated the
<font face="Courier">SIGWAITING</font>
signal when an LWP was put to sleep. Beginning in Solaris 2.6, a new subsystem was added to the kernel, referred to as scheduler activations or scheduler controls. This subsystem added several innovative features to the kernel, not the least of which was the improvement of the communication path between the kernel and the threads library.

The communication mechanism is built on the Solaris doors interprocess communication facility and a small pool of memory pages shared between the kernel and the threads library. A pool of LWPs is maintained to generate upcalls from the kernel into the threads library. When an LWP blocks, the condition variable code calls into the scheduler controls subsystem. The code determines if the LWPs in the process are blocked by testing a counter maintained in the process (

<font face="Courier">proc</font>
) structure. The
<font face="Courier">p_sc_unblocked</font>
field maintains the number of unblocked (running or idle) LWPs. An LWP will only be added to the pool for the process if that count reaches zero, meaning that all LWPs in the process are blocked.

Assuming that's the case and that there are upcall LWPs available in the pool, an LWP from the door pool is selected and made runnable. When the executing LWP has completed the transition to the

<font face="Courier">SLEEP</font>
state, the kernel immediately switches to the new LWP from the pool, which will execute the threads library idle loop, looking for a runnable user thread. If the door pool is empty, the system will fall back on the traditional
<font face="Courier">SIGWAITING</font>
mechanism.
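
The choice between the two replenishment paths boils down to a three-way decision, sketched below. The enum and function names are hypothetical, and the two integer arguments are simplified stand-ins for state the kernel tracks internally.

```c
#include <assert.h>

/* Hypothetical labels for the three possible outcomes. */
enum replenish_action {
    REPLENISH_NONE,         /* some LWP can still run; do nothing */
    REPLENISH_DOOR_UPCALL,  /* hand the process an LWP from the door pool */
    REPLENISH_SIGWAITING    /* door pool empty; send SIGWAITING instead */
};

static enum replenish_action
choose_replenish(int unblocked_lwps, int door_pool_lwps)
{
    assert(unblocked_lwps >= 0 && door_pool_lwps >= 0);
    if (unblocked_lwps > 0)
        return REPLENISH_NONE;
    if (door_pool_lwps > 0)
        return REPLENISH_DOOR_UPCALL;
    return REPLENISH_SIGWAITING;
}
```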

In either case, we have a new LWP, which must now look for work. This makes a great segue into the other side of LWP pool maintenance: aging unused LWPs and cleaning up resources. The implementation is really quite simple. A new LWP will execute a specific function in the threads library that looks for runnable threads in the library's queue. If none exist, the LWP blocks on a timed condition variable. If the timer expires (which takes five minutes by default) and the LWP has not executed a user thread, the LWP will terminate.
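
The aging logic maps neatly onto a timed condition variable wait. The sketch below uses the POSIX pthread_cond_timedwait() call to show the pattern; the run_queue structure and idle_wait() are hypothetical names, and the timeout is a parameter in milliseconds so it can be tried with something shorter than five minutes.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical run queue shared by the workers standing in for LWPs. */
struct run_queue {
    pthread_mutex_t lock;
    pthread_cond_t  work_ready;
    int             nrunnable;
};

/*
 * Wait up to timeout_ms for a runnable thread to appear.
 * Returns true if work was claimed, false if the wait aged out,
 * in which case the caller would terminate.
 */
static bool idle_wait(struct run_queue *q, long timeout_ms)
{
    struct timespec deadline;
    bool got_work;

    assert(q != NULL && timeout_ms >= 0);

    /* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&q->lock);
    while (q->nrunnable == 0) {
        /* Any nonzero return, including ETIMEDOUT, ends the wait. */
        if (pthread_cond_timedwait(&q->work_ready, &q->lock, &deadline) != 0)
            break;
    }
    got_work = q->nrunnable > 0;
    if (got_work)
        q->nrunnable--;
    pthread_mutex_unlock(&q->lock);
    return got_work;
}
```

A producer would increment nrunnable under the lock and signal work_ready; a worker that gets false back from idle_wait() simply returns from its start routine.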

So within Solaris, we have a means of replenishing the LWP pool for a process and cleaning up unneeded LWPs that have been lying around. The goal is to provide relatively balanced and scalable performance for applications that use unbound user threads. That said, it's worth noting that the threads library provides an interface for giving the kernel a hint as to how many LWPs you'd like in the pool. The

<font face="Courier">thr_setconcurrency(3THR)</font>
and
<font face="Courier">pthread_setconcurrency(3THR)</font>
calls take an integer value as an argument, and result in the library creating n new LWPs, where n is the number requested as an argument in the concurrency call. It is generally recommended that an application using unbound threads use the concurrency interfaces.
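
Here's a small sketch of the POSIX flavor of that call. The wrapper name request_concurrency() is hypothetical; pthread_setconcurrency() and pthread_getconcurrency() are the standard interfaces. Keep in mind that the value is only a hint, and threads implementations that don't multiplex user threads onto LWPs may simply record it or decline it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Ask the threads library for a concurrency level and report back the
 * level it currently has on record. Returns 0 on success or an error
 * number such as EINVAL (negative level) or EAGAIN.
 */
static int request_concurrency(int level, int *recorded)
{
    int rc;

    assert(recorded != NULL);
    rc = pthread_setconcurrency(level);
    *recorded = pthread_getconcurrency();
    return rc;
}
```

An application with, say, four compute-bound unbound threads might call request_concurrency(4, &level) early in main(), before creating the threads.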

Also, Solaris threads provide a

<font face="Courier">THR_NEW_LWP</font>
flag option for the
<font face="Courier">thr_create(3THR)</font>
call. This allows the programmer to instruct the system that a new LWP should be created with the user thread. This isn't the same thing as a bound thread. Using the
<font face="Courier">THR_NEW_LWP</font>
flag in
<font face="Courier">thr_create(3THR)</font>
does not result in the newly created LWP being bound to the newly created thread. It has the same net effect as doing a thread create call, followed by a set concurrency call with an argument of 1 -- it adds one more LWP to the pool.

That's it for this month. There's more interesting stuff to write about in the coming months; I just need to decide what's next. A number of requests have come in for articles on the socket and door filesystems (which I did promise a long time ago). There are also some changes to the threads library in Solaris 8 that would track with our current thread of columns (no pun intended).

Stay tuned.
