From: www.itworld.com
March 16, 2001 —
Last month, we saw that the threads library implements a relatively simple queue of runnable threads, in which threads at the same user-thread priority are maintained on a linked list. Each list of threads is rooted in an array, and there's one array location for each of the 128 (0 through 127) possible thread priorities. User-thread scheduling is accomplished by the unbound threads calling into preemption and dispatcher routines at various points in the code path, such as when a thread blocks on a user-defined synchronization object.
A user thread can inherit its priority from the thread that creates it. The default priority is zero, though it can be explicitly raised with a pthread_attr_setschedparam(3thr) or pthread_setschedparam(3thr) call. The difference between the two interfaces can be summarized as follows:
pthread_attr_setschedparam(3thr) makes use of an attribute object (see last month's column for a list of thread attributes), which can be set and passed as an argument to a pthread_create(3thr) call. Once the thread has been created, attribute changes can't be made, because there's no runtime linkage between a user thread and an attribute object. For that reason, the desired attributes of a thread must be determined prior to the thread's creation, as most of them cannot be changed once the thread has been created.
pthread_setschedparam(3thr) takes a thread ID as an argument, and can alter the priority of a running thread.
The priorities of Solaris threads can be altered using the thr_setprio(3thr) interface; thr_getprio(3thr) retrieves a thread's priority.
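As a minimal sketch of the two POSIX interfaces described above, the following sets a priority through an attribute object before the thread exists, then changes it on the live thread by thread ID. The helper name priority_two_ways and the gate mutex are illustrative assumptions, not part of any library. It stays within the SCHED_OTHER priority range, which is implementation-specific, so on some systems the two values will be identical.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Gate that keeps the worker alive until the creator is done with it. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *gated_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate);      /* blocks until the creator unlocks */
    pthread_mutex_unlock(&gate);
    return NULL;
}

/*
 * Hypothetical helper. Returns 0 on success, else an error number.
 * On success *seen holds the priority reported for the running thread.
 */
int priority_two_ways(int *seen)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct sched_param sp;
    int policy, rc;

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = sched_get_priority_min(SCHED_OTHER);

    /* Way 1: attribute object, consulted only at creation time. */
    if ((rc = pthread_attr_init(&attr)) != 0)
        return rc;
    rc = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(&attr, SCHED_OTHER);
    if (rc == 0)
        rc = pthread_attr_setschedparam(&attr, &sp);
    if (rc == 0) {
        pthread_mutex_lock(&gate);
        rc = pthread_create(&tid, &attr, gated_worker, NULL);
        if (rc != 0)
            pthread_mutex_unlock(&gate);
    }
    pthread_attr_destroy(&attr);    /* the thread keeps no link to attr */
    if (rc != 0)
        return rc;

    /* Way 2: thread ID, applied to a thread that is already running. */
    sp.sched_priority = sched_get_priority_max(SCHED_OTHER);
    rc = pthread_setschedparam(tid, SCHED_OTHER, &sp);
    if (rc == 0)
        rc = pthread_getschedparam(tid, &policy, &sp);
    if (rc == 0)
        *seen = sp.sched_priority;

    pthread_mutex_unlock(&gate);    /* let the worker finish */
    pthread_join(tid, NULL);
    return rc;
}
```

Destroying the attribute object right after pthread_create(3thr) is deliberate: it shows that nothing done to the attribute afterward could reach the thread.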
Three possible scheduling policies are available for POSIX threads. The default policy is SCHED_OTHER, which is defined as implementation-specific. In Solaris, it provides for a new thread inheriting the scheduling policy of the creator thread. By default, it provides timeshare or interactive scheduling behavior. Note that POSIX threads provide an attribute allowing for a thread to ignore inherited scheduling policy. (See pthread_attr_setinheritsched(3thr).) We'll revisit that idea in a moment. For now, assume the default behavior of inheriting scheduling policy.
POSIX also provides for SCHED_FIFO (first in, first out) and SCHED_RR (round robin) policies. Solaris support for these policies first appeared in Solaris 7. Both scheduling policies will cause the newly created thread to be placed in the realtime scheduling class if the thread is bound. In other words, if the contentionscope attribute is PTHREAD_SCOPE_SYSTEM (which is how one creates bound threads using POSIX interfaces), the threads library issues a priocntl(2) system call internally to place the bound pthread in the realtime scheduling class. As such, the effective user identification (UID) of the user executing the code must be root, as only root can place processes and threads in the realtime scheduling class.
The difference between the two policies when dealing with bound threads is somewhat subtle. The documentation indicates that a SCHED_FIFO thread will execute to completion unless it's preempted by a higher priority thread. A SCHED_RR thread will execute for a time period determined by the system, which translates to the time quantum assigned to the kernel thread, based on its global priority.
The documentation effectively says that bound SCHED_FIFO threads are not held to the time quantum defined for their global priority, whereas SCHED_RR threads are. In the library code, SCHED_RR threads have their time quantum set to the default quantum for the thread's global priority. SCHED_FIFO threads have their time quantum set to infinity -- essentially, no time quantum. This gets back to understanding the kernel implementation of scheduling classes, which is discussed in the back issue mentioned above.
The key points to note in getting your arms around these two POSIX thread scheduling policies (SCHED_FIFO, SCHED_RR) are:
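Both policies place the thread in the realtime scheduling class, but only when the thread is bound (contentionscope set to PTHREAD_SCOPE_SYSTEM) and the effective UID of the caller is root.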
If you reference the priocntl(2) man page, you'll see that placing an LWP in the realtime class requires an associated rtparms structure, which is defined in /usr/include/sys/rtpriocntl.h:
/*
 * Real-time class specific structures for the priocntl system call.
 */
typedef struct rtparms {
        pri_t   rt_pri;         /* real-time priority */
        uint_t  rt_tqsecs;      /* seconds in time quantum */
        int     rt_tqnsecs;     /* additional nanosecs in time quant */
} rtparms_t;
The sys/rtpriocntl.h header file defines several special values that can be assigned to the rt_tqnsecs field of the rtparms structure: RT_TQDEF (realtime default time quantum) and RT_TQINF (realtime infinite time quantum). These special values make it relatively straightforward to programmatically establish either default or infinite time quanta for realtime threads. Within the POSIX thread library code, a policy of SCHED_FIFO results in rt_tqnsecs getting set to RT_TQINF. The SCHED_RR policy will result in (you guessed it!) rt_tqnsecs getting set to RT_TQDEF.
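The policy-to-quantum mapping just described can be sketched in portable form. This is only an illustration of the decision, not the library's code: the enum and the function name quantum_for_policy are hypothetical stand-ins for the RT_TQINF and RT_TQDEF values that sys/rtpriocntl.h supplies on Solaris.

```c
#include <sched.h>

/* Hypothetical stand-ins for the header's special quantum values. */
enum quantum_kind {
    QUANTUM_DEFAULT,    /* plays the role of RT_TQDEF */
    QUANTUM_INFINITE,   /* plays the role of RT_TQINF */
    QUANTUM_NONE        /* policy does not select the realtime class */
};

/* Map a POSIX scheduling policy to the kind of realtime time quantum. */
enum quantum_kind quantum_for_policy(int policy)
{
    switch (policy) {
    case SCHED_FIFO:
        return QUANTUM_INFINITE;    /* runs until it blocks or is preempted */
    case SCHED_RR:
        return QUANTUM_DEFAULT;     /* quantum taken from the dispatch table */
    default:
        return QUANTUM_NONE;        /* SCHED_OTHER and anything else */
    }
}
```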
So, the net-net of POSIX scheduling policies is that they provide an easy way to place user threads in the realtime scheduling class. In a multithreaded program where it's neither required nor desirable to have all the threads in the process in the realtime class, but rather preferable to have only a select few there, this can be a very useful feature. Note that the exact same behavior can be obtained using bound threads and the priocntl(2) system call from code. We illustrate this in the following code segments. Note that these are not complete programs; they can be compiled and executed as they stand, but they only serve to illustrate this point. Also note that, for readability and space, some system calls are shown without testing for return codes. This is certainly not something anyone would do with production code. The code segments below were tested on a Solaris 8 system, using the Forte C compiler, version 6.0.
Both programs have the thread do a simple forever loop, so we can examine the thread/LWP class and priority without the use of a debugger.
/*
 * Creating realtime class threads using the POSIX
 * defined scheduling policies.
 *
 * Compile as: cc -o output_filename source_file.c -lpthread
 */
#define _REENTRANT
#include <stdio.h>
#include <pthread.h>
#include <sched.h>

void *pt1(void *);

int
main(int argc, char *argv[])
{
        int policy;
        pthread_t p1id;
        pthread_attr_t attr;

        /* Request a bound thread with the round-robin policy. */
        pthread_attr_init(&attr);
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
        policy = SCHED_RR;
        pthread_attr_setschedpolicy(&attr, policy);
        pthread_create(&p1id, &attr, pt1, NULL);
        pthread_join(p1id, (void **)NULL);
        .
        .
}

/*
 * The thread
 */
void *
pt1(void *arg)
{
        pthread_t tid;
        long l;

        tid = pthread_self();
        printf("Pthread ID: %d\n", (int)tid);
loop:
        /* Spin forever so the LWP can be inspected with ps(1). */
        l = 1;
        while (l < 0x7fffffff)
                l = l * 2;
        goto loop;
}
The code sample above creates a bound thread with a SCHED_RR scheduling policy, resulting in a realtime class thread. The setup is done prior to the actual thread create, and in the main part of the program.
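Because the first code sample skips return-code checks, here is a hedged sketch of the same attribute setup with every call checked. The helper name run_bound_thread is hypothetical, and the thread body returns immediately rather than looping. The sketch also requests PTHREAD_EXPLICIT_SCHED and a priority inside the policy's range, which some implementations require before they will honor the policy in the attribute object.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

static void *short_task(void *arg)
{
    (void)arg;
    return NULL;        /* returns at once so the creator can join */
}

/*
 * Hypothetical helper: create and join a bound thread under `policy`
 * (SCHED_FIFO or SCHED_RR). Returns 0 on success or the first error
 * number reported by a pthread call.
 */
int run_bound_thread(int policy)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct sched_param sp;
    int rc;

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = sched_get_priority_min(policy);

    if ((rc = pthread_attr_init(&attr)) != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(&attr, policy);
    if (rc == 0)
        rc = pthread_attr_setschedparam(&attr, &sp);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, short_task, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;      /* typically EPERM without sufficient privilege */

    return pthread_join(tid, NULL);
}
```

A caller that is not allowed to use the realtime policies should expect a nonzero result, most likely EPERM, instead of a thread that silently runs in the timeshare class.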
The code sample below provides the same net behavior, but uses the priocntl(2) system call instead of the POSIX interfaces.
/*
 * Create a bound POSIX thread and put it in the
 * RT scheduling class.
 *
 * Compile as: cc -o output_filename source_file.c -lpthread
 */
#define _REENTRANT
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/lwp.h>
#include <sys/procset.h>
#include <sys/priocntl.h>
#include <sys/rtpriocntl.h>

void *pt1(void *);

int
main(int argc, char *argv[])
{
        pthread_t p1id;
        pthread_attr_t attr;

        pthread_attr_init(&attr);
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
        pthread_create(&p1id, &attr, pt1, NULL);
        pthread_join(p1id, (void **)NULL);
        printf("Main exiting...\n");
        exit(0);
}

void *
pt1(void *arg)
{
        lwpid_t lwpid;
        pthread_t ptid;
        pcinfo_t pcinfo;
        pcparms_t pcparm;
        long l;

        ptid = pthread_self();
        lwpid = _lwp_self();
        printf("Pthread ID: %d, LWP ID: %d\n", (int)ptid, (int)lwpid);

        /* Look up the class ID of the realtime (RT) class. */
        (void) strcpy(pcinfo.pc_clname, "RT");
        pcparm.pc_cid = PC_CLNULL;
        if (priocntl(0, 0, PC_GETCID, (caddr_t)&pcinfo) < 0) {
                perror("priocntl getcid");
                pthread_exit((void *)NULL);
        }

        /* Move this LWP into the RT class with the default time quantum. */
        pcparm.pc_cid = pcinfo.pc_cid;
        /* Lowest realtime priority, so the field is not left uninitialized. */
        ((rtparms_t *)pcparm.pc_clparms)->rt_pri = 0;
        ((rtparms_t *)pcparm.pc_clparms)->rt_tqnsecs = RT_TQDEF;
        if ((priocntl(P_LWPID, lwpid, PC_SETPARMS, (caddr_t)&pcparm)) == -1) {
                perror("priocntl setparms");
                pthread_exit((void *)NULL);
        }
loop:
        /* Spin forever so the LWP can be inspected with ps(1). */
        l = 1;
        while (l < 0x7fffffff)
                l = l * 2;
        goto loop;
}
The most salient difference between the two sample programs is that in the second example, setting up the thread/LWP to run in the realtime class is done within the actual user thread (as opposed to the main thread, which was the case with the first example). In this code segment, the ID of the LWP that the user thread is bound to is needed, because it must be passed as an argument to the second priocntl(2) system call in order to set the new scheduling parameters.
The code itself is pretty straightforward, and relatively easy to follow if you have ready access to the priocntl(2) man page and the sys/priocntl.h and sys/rtpriocntl.h header files. It involves first setting the pc_clname character string to RT, and then getting the class ID for the realtime class via the first priocntl(2) call (using the PC_GETCID command). The class ID is set in the pc_cid member of the pcparm structure, and the desired behavior is selected by setting the rt_tqnsecs member of the rtparms structure to RT_TQDEF. This indicates that the system is to use the default time quantum from the dispatch table entry, and is the equivalent of the POSIX SCHED_RR behavior. If the SCHED_FIFO behavior is desired, set the rt_tqnsecs value to RT_TQINF.
We can take a quick look at the sample program by running it in the background and using the appropriate ps(1) flags to get the scheduling class, priority, and LWP ID for a process listing.
# ./ptlwp &
13042
# Pthread ID: 4, LWP ID: 1
# ps -Lc
   PID   LWP  CLS PRI TTY      LTIME CMD
 13043     1   TS  48 pts/3     0:00 ps
   616     1   TS  48 pts/3     0:00 sh
 13042     1   RT 100 pts/3     0:03 ptlwp
 13042     2   TS  58 pts/3     0:00 ptlwp
 13042     3   TS  58 pts/3     0:00 ptlwp
# ps -Lc
   PID   LWP  CLS PRI TTY      LTIME CMD
 13044     1   TS  58 pts/3     0:00 ps
   616     1   TS  48 pts/3     0:00 sh
 13042     1   RT 100 pts/3     0:06 ptlwp
 13042     2   TS  58 pts/3     0:00 ptlwp
 13042     3   TS  58 pts/3     0:00 ptlwp
# kill 13042
13042 Terminated
#
The above example is the second sample program compiled as a program named ptlwp. We run the program as root, see the printf(3s) statement output the thread ID and LWP ID, then run ps(1) a couple of times. As you can see, LWP ID 1 for our process (PID 13042) is in the realtime class. In the interest of space, we're not showing the output from the first sample program. It will be essentially the same, as our whole purpose here has been to describe how these POSIX scheduling policies actually behave by providing two methods for doing the same thing.
Under POSIX guidelines, support for the SCHED_FIFO and SCHED_RR scheduling policies is optional. Furthermore, where they're implemented, they're supported for bound threads only. As we've seen, Solaris does implement these scheduling policies for POSIX threads that have their contentionscope attribute set to PTHREAD_SCOPE_SYSTEM (i.e., bound threads). This makes sense, as any realtime application should use bound (versus unbound) threads. This warrants mentioning, because the second test code segment above could be modified to change the contentionscope of the user threads to PTHREAD_SCOPE_PROCESS, and thus create an unbound thread instead of a bound thread.
Because the code to put the LWP in the realtime class exists within the thread itself, the priocntl(2) calls will work when the thread executes. However, because the thread is not bound, there's no guarantee that it will always execute on the same LWP. Therefore, using priocntl(2) to alter LWP scheduling class or priorities from an unbound thread can cause unexpected and unpredictable behavior. It is not supported in Solaris.
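One way a program might guard against this is to check the contention scope on the attribute object before creating a thread that will manipulate its own LWP. The following is a small sketch; the helper name attr_is_bound is hypothetical.

```c
#include <pthread.h>

/*
 * Hypothetical helper: returns 1 if the attribute object requests a
 * bound thread (PTHREAD_SCOPE_SYSTEM), 0 if it requests an unbound
 * one, and -1 if the scope cannot be read.
 */
int attr_is_bound(const pthread_attr_t *attr)
{
    int scope;

    if (pthread_attr_getscope(attr, &scope) != 0)
        return -1;
    return scope == PTHREAD_SCOPE_SYSTEM;
}
```

A caller would refuse to start a thread that issues LWP-level scheduling calls unless attr_is_bound() returns 1 for the attributes it is about to pass to pthread_create(3thr).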
The only scheduling policy that remains is the aforementioned SCHED_OTHER, which is the default policy, and which will result in the underlying LWP executing in either the timeshare or interactive class, depending on the scheduling class of the calling thread.
Scheduling parameters inherit attribute
This brings us to the scheduling parameters inherit attribute. This attribute can have one of two possible values: PTHREAD_INHERIT_SCHED or PTHREAD_EXPLICIT_SCHED. The default value is PTHREAD_EXPLICIT_SCHED, which allows the scheduling attributes set in the attributes structure and passed in pthread_create(3thr) to be used. PTHREAD_INHERIT_SCHED, which must be set explicitly by using pthread_attr_setinheritsched(3thr), instructs pthread_create(3thr) to ignore the scheduling values in the attributes structure and use those of the calling thread instead.
The implementation in the library code is pretty straightforward. In the pthread_create() code, a test is made on the inherit member of the attributes structure. If it's been set for PTHREAD_INHERIT_SCHED, the newly created thread's priority and policy are set based on the values from the calling thread. Otherwise, PTHREAD_EXPLICIT_SCHED behavior is implemented, and the code sets the policy and priority based on the values in the attributes structure, after validating priority ranges and policy values.
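The inherit case can be observed from user code. In this minimal sketch (the struct and the helper name child_inherits_sched are illustrative assumptions), a thread created with PTHREAD_INHERIT_SCHED reports its own policy and priority, and the creator compares them with its own.

```c
#include <pthread.h>
#include <sched.h>

/* What the new thread reports about itself. */
struct sched_snapshot {
    int rc;
    int policy;
    int priority;
};

static void *report_self(void *arg)
{
    struct sched_snapshot *s = arg;
    struct sched_param sp;

    s->rc = pthread_getschedparam(pthread_self(), &s->policy, &sp);
    s->priority = sp.sched_priority;
    return NULL;
}

/*
 * Hypothetical helper: returns 1 if a thread created with
 * PTHREAD_INHERIT_SCHED reports the creator's policy and priority,
 * 0 if it does not, and -1 if any call fails.
 */
int child_inherits_sched(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    struct sched_snapshot child;
    struct sched_param mine;
    int my_policy, rc;

    if (pthread_getschedparam(pthread_self(), &my_policy, &mine) != 0)
        return -1;
    if (pthread_attr_init(&attr) != 0)
        return -1;

    /* Scheduling values in attr are ignored; the creator's are used. */
    rc = pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, report_self, &child);
    pthread_attr_destroy(&attr);

    if (rc != 0 || pthread_join(tid, NULL) != 0 || child.rc != 0)
        return -1;
    return child.policy == my_policy &&
           child.priority == mine.sched_priority;
}
```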
That's a wrap for October. Next month, we'll continue our discussion of pthreads with a look at how the dynamic LWP pool is maintained, in addition to some other odds and ends.
Happy Halloween!
Unix Insider