CIS 307: Spinlocks and Semaphores

[TestAndSet], [Spinlocks] [Definition and initial implementation of Semaphores], [Comments], [Nutt's Implementation], [An implementation for testing programs involving semaphores],

TestAndSet

We consider a machine instruction that supports concurrency. Always be conscious of the hardware support for Operating Systems functions. There are a number of reasons for this support. It may be the only way to solve problem and, almost always, it allows solutions that are more efficient than purely software solutions.

We can think of the TestAndSet instruction as a function implemented atomically in hardware. Here is this function:

	int TestAndSet(int *x){
	   register int temp = *x;
	   *x = 1;
	   return temp;
	}

The hardware can achieve this effect as follows: the instruction involves two Processor/Memory bus interactions:

Read from x into a register temp
Write one to x

no access to that memory location x must be allowed (this can be achieved, for instance, by maintaining bus mastery) in between those two interactions.

SpinLocks

Spinlocks are abstract data types, or classes, that support a Busy-Wait solution for the Mutual Exclusion Problem. The operations on spinlocks are InitLock, Lock, and UnLock. If mutex is a spinlock that has been initialized using InitLock, then Critical Regions S1, S2, .., Sn can be protected from each other by preceding each region by a call to Lock and following it by a call to UnLock. The region Si becomes:

	Lock(mutex);
	Si;
	UnLock(mutex);

The pseudo code for implementing SpinLocks follows:


	typedef int SpinLock;

	void InitLock(SpinLock *L)
	{   *L = 0; }

	void Lock(SpinLock *L)
	{   while (TestAndSet(L)) 
		;
	}

	void UnLock(SpinLock *L)
	{   *L = 0; }

As you can see, the implementation is quite simple and with no system overhead since TestAndSet is a user mode instruction. It works because of the properties of the TestAndSet instruction. Notice that this solution involves Busy Waiting.
Here is code for this kind of spinlock that works on the i386 architecture.

A big problem of the spinlock is that by repeatedly testandsetting a location in memory it can use a high percentage of the memory bus bandwidth, thus reducing the utilization of other procesors sharing the memory bus. This location that is repeatedly accessed is called a Hot Spot in the memory. A partial solution (in the case of cache-coherent shared memories) is to change the Lock operation as follows:

	void Lock(SpinLock *L)
	{   while (TestAndSet(L))
               while (*L == 1)
		  ;
	}

which uses less of the memory bandwidth since the TestAndSet instruction involves both a read and a write transaction on the memory while (*L==1) involves only a read operation, and this read operation is normally carried out on a cached copy. [The inner While Loop operates on the cached copy without affecting the shared memory. It assumes that the hardware supports some form of cache coherence algorithm which invalidates the cache copy when the shared memory copy is updated.]

There is no guaranty of Fairness for spinlocks. That is, if two processes P and Q are trying to enter a critical region protected by a locked spinlock mutex, then there is no guaranty that they will be serviced in FIFO order. Their order of entry will be random. More serious is the problem that the process P may use the spinlock within a loop and use the lock repeatedly while Q is unable to acquire it. [How can this happen?] This behavior could be avoided by giving a "Nice" behavior to the UnLock operation: when the Unlock is executed it is checked if another process is waiting and if yes the current process defers (gives up control of CPU) to it. Of course, in most situation this is an unwarranted precaution.

Spinlocks also are dangerous to use in uniprocessor systems because of the Priority Inversion Problem. [A low priority process is in the critical region. A high priority process tries to enter the region. The low priority process cannot get the CPU which is held by the high priority process. The high priority process spins in the spinlock trying to enter the region being held by the low priority process. We have a deadlock. ]

Two possible solutions to the priority inversion problem:

Determine the maximum priority among the processes that can execute these critical regions, then make sure that all processes execute the critical regions at this priority. Since now competing processes have the same priority, they cannot preempt each other.
Execute the critical regions with masked interrupts. Thus the processor of the current process cannot be preempted.

A related problem is the formation of scheduling convoys. This occurs when a process ceases to run in the middle of a critical region [because of page fault, quantum expiration, external interrupt] and then all other processes sharing this region have to wait for its completion. A number of techniques are being introduced to deal with this problem, in particular Preemption-safe Locking and Non-blocking Algorithms. [Preemption-safe locking tries to avoid preemption during a critical region by having the kernel extend the quantum of a process in the critical region or by increasing its priority. Non-blocking algorithms, expecially for implementing shared counters and queues, make sure that if a process blocks within a critical region, it can be brought back to the beginning of the critical region and another process is allowed to enter the region.]

In general spinlocks with their busy waiting mechanism are desirable (as compared to non busy waiting mechanisms like semaphores) when the scheduling and context switching time are greater than the time required to execute the critical region, or when there are processors not required to do other tasks.

The efficient implementation of spinlocks is of great concern because spinlocks are the usual foundation for synchronization, expecially in multiprocessor systems. They are used everywhere and very frequently. As the use of multiprocessor systems increases so will the interest in efficient spinlocks and in alternative synchronization mechanisms

Definition and initial implementation of Semaphores

People use the term "Semaphore" to refer to a variety of synchronization mechanisms. Here by "Semaphore" we mean the "non-busy-waiting" analog of SpinLocks. We show a possible implementation for semaphores. For now assume that semaphore's operations are atomic.


	type semaphore = record
			    count : integer;
  1.			    q     : queue of processes;
			 end record;
	procedure initsem (s : in out semaphore; n : integer) is
	begin
	   s.count := n; initialize s.q to the empty queue;
	end;
	procedure P(s : in out semaphore) is
	begin
  2.	   s.count := s.count - 1;
  3.	   if s.count < 0 then 
  4.	      Remove this process from Running, add it 
	              to s.q, and reschedule;
	end;
	procedure V(s : in out semaphore) is
	begin
  5.	   s.count := s.count + 1;
  6.	   if s.count <= 0 and s.q is not empty then
  7.	      Remove a process from s.q, add it to the Ready 
		      Queue, and reschedule;
	end;

Comments

On line 1. we talk of a "queue of processes". A process is usually represented by a process id, a pointer or an index that refers to a Process Control Block (PCB). A PCB contains the relevant information about a process state.
Lines 2., 3., and 4. form a critical region. The same is true for lines 5., 6., and 7.. As such they need to be enforced either by masking interrupts (in single processor systems) or by using spinlocks.
The P operation is also called Down or Wait. The V operation is also called Up or Signal.
Reschedule operates as follows: say A is the highest priority process in the Ready queue (you may assume that there is always one such process, at least a Null process).
- If A has higher priority than the currently running process B, move B to the Ready queue and A to Running.
- If A does not have higher priority than the currently running process B, do nothing.
- If there is no running process, move A to Running.
In all cases, resume the running process.
If initsem is called with the parameter n set to 0, we say that we have a Blocking Semaphore; if with 1, we say that we have a Boolean or Mutual Exclusion Semaphore; and if n is greater than 1, we say that we have a Counting Semaphore.
Boolean semaphores are the non-busy-waiting analogs of spinlocks.

Using semaphores we can achieve a number of different behaviors:

If S is a boolean semaphore, then we can implement related critical regions by prefixing them with the statement P(S); and affixing them with V(S);
If S is a counting semaphore initialized to n (say, 10), then up to n (10) different processes could be executing a region enclosed by P(S)..V(S) statements.
If S is a blocking semaphore, then we can make sure that a process completes execution of statement A before another process starts execution a statement B, by following A with the statement V(S) and preceding B with the statement P(S). This is for example one way of implementing the arc semantics in Precedence Graphs.

Semaphores do not suffer the same form of the Priority Inversion Problem as Spinlocks. You may see the kind of Priority Inversion Problem that can arise for Semaphores in the following note about the Mars Rover. We have a high priority process communicating with a low priority process using a small critical region. While the low priority process is in the critical region the high priority process waits in the semaphore protecting the region. If an intermediate, independent process starts running at this point, it preempts the low priority process and prevents the higher priority job to get on its important way.

Nutt's Implementation

Here we show code similar to the code used by G.J.Nutt in his Operating Systems book. This code implements N semaphores for a multiprocessing system.

#define	 N   ..         /* The number of semaphores */

int      value[N];	/* Keeps the count for the semaphores   */
boolean  sem[N];	/* Spinlocks to protect critical region */
			/*   of semaphore */
boolean  hold[N];	/* Used to temporarily "enqueue" tasks  */
			/* waiting on sem */

void initsem(int s, m) {
	value[s] = m; sem[s] = FALSE; hold[s] = TRUE;}
void P(int s) {
	while (TS(sem[s]));	/* Lock */
	value[s]--;
	if (value[s] < 0) {
	   sem[s] = FALSE;	/* Unlock */
	   while(TS(hold[s])) yield(Q); /* Here the process enters */
			/* a new Critical Region where it gives    */
			/* control to Q, a scheduler. It is a loop */
			/* since we may repeatedly be given control*/
			/* back from yield and find a blocked semaphore.*/
	}else
	   sem[s] = FALSE;}	/* Unlock */
void V(int s){
	while (TS(sem[s]));	/* Lock */
	value[s]++;
	if (value[s] <= 0){	/* Somebody is waiting in hold to enter CR*/
	   while (!hold[s]);	/* We check if there is a concurrent V. */
				/* It is a loop */
				/* because we might otherwise have two or more */
				/* consecutive V operations all setting (see next */
				/* statement) hold[s] to false. */
	   hold[s] = FALSE;	/* Here we free a process waiting in P,*/
				/* namely on hold[s].  */
	}
	sem[s] = FALSE;}	/* Unlock */

An implementation for testing programs that use semaphores

Here is code that implements semaphores as pipes. Of course nobody would ever think to use such an implementation in practice. This code is shown here for two reasons. Firstly, it can be used to test concurrent programs with semaphores. Second, it reminds us that IPC primitives can be implemented in terms of each other. That is, IPC primitives are chosen not just on the basis of functionality, but also on the basis of performance (things like data rate and latency).

/* sem.h - CreateSem, P, V, DelSem */

typedef int *semaphore;  

semaphore CreateSem (int count);  /* Returns a new semaphore */

void P(semaphore s);              /* P, or wait, or down on semaphore */

void V(semaphore s);              /* V, or signal, or up on semaphore */

int DelSem(semaphore s);          /* Deletes an existing semaphore */

/* sem.c - CreateSem, P, V, DelSem */

/*
 This implementation of semaphores is strictly for kicks since
 it assumes the availability of pipes!
 A semaphore is represented by a pipe. The P operation is simulated
 by reading a character from the pipe. The V operation is simulated
 by writing a character to the pipe. The kind of the semaphore is
 specified when the semaphore is created. If the argument is 0, we get a
 blocking semaphore, if 1, a boolean semaphore, and if n>1, a counting
 semaphore.
 */

#include <stdio.h>
#include <sys/types.h>
#include "sem.h"

semaphore CreateSem (int count){
/* Create and return a new semaphore */
   int *fd;
   int i;

   fd = (int *)malloc(8); /* Allocating space for two filedescriptors */
   if (pipe(fd) < 0) {
      printf("Cannot create pipe.\n");
      exit(0);}
   for(i=0;i < count;i++)
      write(fd[1], "x", 1);
   return(fd);
}

void P(semaphore s){
   char c;
   read(s[0], &c, 1); 
}

void V(semaphore s){
   char c;
   write(s[1], &c, 1); 
}

int DelSem(semaphore s){
/* Delete an existing semaphore */
   close(s[0]);
   close(s[1]);
   free(s);
}

ingargio@joda.cis.temple.edu