Module 2 Detailed Multi-Processor Scheduling Ch-10

Chapter 10 discusses multiprocessor and real-time scheduling, focusing on classifications of multiprocessor systems based on synchronization granularity and scheduling issues. It outlines various approaches to process and thread scheduling, including load sharing, gang scheduling, dedicated processor assignment, and dynamic scheduling, emphasizing their advantages and disadvantages. Additionally, it covers the characteristics and requirements of real-time systems, including determinism, responsiveness, user control, reliability, and fail-soft operation, along with real-time scheduling algorithms.


Chapter 10: Multiprocessor and Real-Time Scheduling Module: 2/2

Classifications of Multiprocessor Systems

Granularity:
A good way of characterizing multiprocessors and placing them in context with other architectures is to
consider the synchronization granularity, or frequency of synchronization, between processes in a system. We
can distinguish five categories of parallelism that differ in the degree of granularity. These are summarized in
Table 10.1 , which is adapted from [GEHR87] and [WOOD89].
Table 10.1: Synchronization Granularity and Processes

Grain size: Independent parallelism
Description: No explicit synchronization among processes; each represents a separate, independent application or job. Typical use is in a time-sharing system.

Design Issues:
Scheduling on a multiprocessor involves three interrelated issues:
• The assignment of processes to processors
• The use of multiprogramming on individual processors
• The actual dispatching of a process
It is important to keep in mind that the approach taken will depend, in general, on the degree of granularity
of the applications and on the number of processors available.
Assignment of Processes to Processors:
• Assuming all processors are equal, it is simplest to treat processors as a pooled resource and assign
processes to processors on demand
• whether the assignment should be static or dynamic needs to be determined
• If a process is permanently assigned to one processor from activation until its completion, then a
dedicated short-term queue is maintained for each processor
• advantage is that there may be less overhead in the scheduling function
• allows group or gang scheduling
Assignment of Processes to Processors:
Regardless of whether processes are dedicated to processors, some means is needed to assign processes to
processors. Two approaches have been used: master/slave and peer.
Master/Slave Architecture
• Key kernel functions always run on a particular processor
• Master is responsible for scheduling
• Slave sends service request to the master
• Advantages: it is simple and requires little enhancement to a uniprocessor multiprogramming operating system; conflict resolution is simplified because one processor has control of all memory and I/O resources
• Disadvantages: a failure of the master brings down the whole system, and the master can become a performance bottleneck

Peer Architecture
• Kernel can execute on any processor
• Each processor does self-scheduling from the pool of available processes
• Complicates the operating system: it must ensure that two processors do not choose the same process and that processes are not somehow lost from the queue

Process Scheduling:
 Usually processes are not dedicated to processors
 A single queue is used for all processors
 if some sort of priority scheme is used, there are multiple queues based on priority
 System is viewed as being a multi-server queuing architecture
 With static assignment: should individual processors be multiprogrammed or should each be
dedicated to a single process?
 Often it is best to have one process per processor; particularly in the case of multithreaded programs
where it is advantageous to have all threads of a single process executing at the same time.
Thread Scheduling:
 Thread execution is separated from the rest of the definition of a process
 An application can be a set of threads that cooperate and execute concurrently in the same address
space
 On a uniprocessor, threads can be used as a program structuring aid and to overlap I/O with
processing
 In a multiprocessor system kernel-level threads can be used to exploit true parallelism in an
application
 Dramatic gains in performance are possible in multi-processor systems
 Small differences in thread management and scheduling can have an impact on applications that
require significant interaction among threads
Approaches to Thread Scheduling:
Among the many proposals for multiprocessor thread scheduling and processor assignment, four general
approaches stand out:
1. Load sharing : Processes are not assigned to a particular processor. A global queue of ready threads is
maintained, and each processor, when idle, selects a thread from the queue. The term load sharing is used to
distinguish this strategy from load-balancing schemes in which work is allocated on a more permanent basis
(e.g., see [FEIT90a]).

2. Gang scheduling: A set of related threads is scheduled to run on a set of processors at the same time, on a one-to-one basis.
3. Dedicated processor assignment: This is the opposite of the load-sharing approach and provides implicit
scheduling defined by the assignment of threads to processors. Each program, for the duration of its
execution, is allocated a number of processors equal to the number of threads in the program. When the
program terminates, the processors return to the general pool for possible allocation to another program.
4. Dynamic scheduling: The number of threads in a process can be altered during the course of execution.
Load Sharing: Simplest approach and carries over most directly from a uniprocessor environment.
Advantages:
• load is distributed evenly across the processors
• no centralized scheduler required
• the global queue can be organized and accessed using any of the uniprocessor scheduling schemes discussed earlier
Versions of load sharing:
• first-come-first-served
• smallest number of threads first
• preemptive smallest number of threads first
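As a rough sketch of these variants (not from the text; the names and data layout are illustrative only), the following Python fragment shows a single global ready queue shared by all processors, with first-come-first-served and smallest-number-of-threads-first selection:

```python
from collections import deque

class GlobalReadyQueue:
    """A single ready queue shared by all processors (illustrative sketch)."""

    def __init__(self):
        self.queue = deque()          # ready threads, oldest first

    def add(self, thread):
        self.queue.append(thread)

    def pick_fcfs(self):
        # first-come-first-served: an idle processor takes the oldest ready thread
        return self.queue.popleft() if self.queue else None

    def pick_fewest_threads(self):
        # smallest number of threads first: favour the job with the fewest
        # unscheduled threads (the preemptive variant would additionally preempt
        # a running thread when such a job arrives)
        if not self.queue:
            return None
        chosen = min(self.queue, key=lambda th: th["job_threads"])
        self.queue.remove(chosen)
        return chosen

q = GlobalReadyQueue()
q.add({"name": "A1", "job_threads": 4})
q.add({"name": "B1", "job_threads": 1})
print(q.pick_fewest_threads()["name"])   # B1: its job has the fewer threads
```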
Disadvantages of Load Sharing:
 Central queue occupies a region of memory that must be accessed in a manner that enforces mutual
exclusion
 can lead to bottlenecks
 Preempted threads are unlikely to resume execution on the same processor
 caching can become less efficient
 If all threads are treated as a common pool of threads, it is unlikely that all of the threads of a program
will gain access to processors at the same time
 the process switches involved may seriously compromise performance
Gang Scheduling: Simultaneous scheduling of the threads that make up a single process
 Useful for medium-grained to fine-grained parallel applications whose performance severely degrades
when any part of the application is not running while other parts are ready to run
 Also beneficial for any parallel application
Benefits:
 synchronization blocking may be reduced, less process switching may be necessary, and
performance will increase
 scheduling overhead may be reduced
Figure 10.2: Example of Scheduling Groups with Four and One Threads
The use of gang scheduling creates a requirement for processor allocation. One possibility is the following.
Suppose that we have N processors and M applications, each of which has N or fewer threads. Then each
application could be given 1/ M of the available time on the N processors, using time slicing. [FEIT90a] notes
that this strategy can be inefficient.
Consider an example in which there are two applications, one with four threads and one with one thread.
Using uniform time allocation wastes 37.5% of the processing resource, because when the single-thread
application runs, three processors are left idle (see Figure 10.2 ). If there are several one-thread applications,
these could all be fit together to increase processor utilization.
If that option is not available, an alternative to uniform scheduling is scheduling that is weighted by the
number of threads. Thus, the four-thread application could be given four-fifths of the time and the one-
thread application given only one-fifth of the time, reducing the processor waste to 15%.
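The two waste figures can be checked with a short calculation. The sketch below (Python; the function is ours, not from the text) computes the idle fraction of four processors under uniform and thread-weighted time slicing for a four-thread and a one-thread application:

```python
def idle_fraction(num_processors, thread_counts, time_shares):
    """Fraction of total processor time left idle under gang scheduling."""
    idle = 0.0
    for threads, share in zip(thread_counts, time_shares):
        unused = max(num_processors - threads, 0)     # processors idle while this app runs
        idle += share * unused / num_processors
    return idle

apps = [4, 1]                                         # a 4-thread app and a 1-thread app
print(idle_fraction(4, apps, [1/2, 1/2]))             # uniform slicing -> 0.375 (37.5%)
print(idle_fraction(4, apps, [4/5, 1/5]))             # weighted by threads -> 0.15 (15%)
```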

Dedicated Processor Assignment:


An extreme form of gang scheduling, suggested in [TUCK89], is to dedicate a group of processors to an
application for the duration of the application. That is, when an application is scheduled, each of its threads is
assigned a processor that remains dedicated to that thread until the application runs to completion.
This approach would appear to be extremely wasteful of processor time. If a thread of an application is
blocked waiting for I/O or for synchronization with another thread, then that thread’s processor remains idle:
there is no multiprogramming of processors.
Two observations can be made in defense of this strategy:
1. In a highly parallel system, with tens or hundreds of processors, each of which represents a small fraction of the cost of the system, processor utilization is no longer so important as a metric for effectiveness or performance.
2. The total avoidance of process switching during the lifetime of a program should result in a substantial speedup of that program.

Figure 10.4 shows the speedup for two applications, matrix multiplication and FFT, as the number of threads executing the tasks in each
application is varied from 1 to 24. For example, we see that when both applications are started
simultaneously with 24 threads each, the speedup obtained, compared to using a single thread for each
application, is 2.8 for matrix multiplication and 2.4 for FFT. The figure shows that the performance of both
applications worsens considerably when the number of threads in each application exceeds eight and thus
the total number of processes in the system exceeds the number of processors. Furthermore, the larger the
number of threads the worse the performance gets, because there is a greater frequency of thread
preemption and rescheduling. This excessive preemption results in inefficiency from many sources, including
time spent waiting for a suspended thread to leave a critical section, time wasted in process switching, and
inefficient cache behavior.
The authors conclude that an effective strategy is to limit the number of active threads to the number of
processors in the system. If most of the applications are either single thread or can use the task-queue
structure, this will provide an effective and reasonably efficient use of the processor resources. Both
dedicated processor assignment and gang scheduling attack the scheduling problem by addressing the issue
of processor allocation.
Dynamic Scheduling:
 For some applications it is possible to provide language and system tools that permit the number of
threads in the process to be altered dynamically
 this would allow the operating system to adjust the load to improve utilization
 Both the operating system and the application are involved in making scheduling decisions
 The scheduling responsibility of the operating system is primarily limited to processor
allocation
 This approach is superior to gang scheduling or dedicated processor assignment for
applications that can take advantage of it

Real-Time Systems
Real-time computing is becoming an increasingly important discipline. The operating system, and in particular
the scheduler, is perhaps the most important component of a real-time system.
Real-time computing may be defined as that type of computing in which the correctness of the system
depends not only on the logical result of the computation but also on the time at which the results are
produced. We can define a real-time system by defining what is meant by a real-time process, or task.
In general, in a real-time system, some of the tasks are real-time tasks, and these have a certain degree of
urgency to them. Such tasks are attempting to control or react to events that take place in the outside world.
Because these events occur in “real time,” a real-time task must be able to keep up with the events with
which it is concerned.
• Examples:
• control of laboratory experiments
• process control in industrial plants
• robotics
• air traffic control
• telecommunications
• military command and control systems
Hard and Soft Real-Time Tasks
• Hard real-time task: one that must meet its deadline; otherwise it will cause unacceptable damage or a fatal error to the system.
• Soft real-time task: has an associated deadline that is desirable but not mandatory; it still makes sense to schedule and complete the task even if it has passed its deadline.
Periodic and Aperiodic Tasks:
Another characteristic of real-time tasks is whether they are periodic or aperiodic.
An aperiodic task has a deadline by which it must finish or start, or it may have a constraint on both start
and finish time. In the case of a periodic task , the requirement may be stated as “once per period T ” or
“exactly T units apart.”
Characteristics of Real Time Systems
Real-time operating systems can be characterized as having unique requirements in five general areas:
• Determinism
• Responsiveness
• User control
• Reliability
• Fail-soft operation
Determinism:
An operating system is deterministic to the extent that it performs operations at fixed, predetermined times
or within predetermined time intervals. When multiple processes are competing for resources and processor
time, no system will be fully deterministic.
In a real-time operating system, process requests for service are dictated by external events and timings. The
extent to which an operating system can deterministically satisfy requests depends first on the speed with
which it can respond to interrupts and, second, on whether the system has sufficient capacity to handle all
requests within the required time.
One useful measure of the ability of an operating system to function deterministically is the maximum delay
from the arrival of a high-priority device interrupt to when servicing begins. In non-real-time operating
systems, this delay may be in the range of tens to hundreds of milliseconds, while in real-time operating
systems that delay may have an upper bound of anywhere from a few microseconds to a millisecond.

Responsiveness:
A related but distinct characteristic is responsiveness . Determinism is concerned with how long an operating
system delays before acknowledging an interrupt. Responsiveness is concerned with how long, after
acknowledgment, it takes an operating system to service the interrupt. Aspects of responsiveness include the
following:
• amount of time required to initially handle the interrupt and begin execution of the interrupt
service routine (ISR)
• amount of time required to perform the ISR
• effect of interrupt nesting
User Control:
User control is generally much broader in a real-time operating system than
in ordinary operating systems. In a typical non-real-time operating system, the user either has no control over
the scheduling function of the operating system or can only provide broad guidance, such as grouping users
into more than one priority class. In a real-time system, however, it is essential to allow the user fine-grained
control over task priority. The user should be able to distinguish between hard and soft tasks and to specify
relative priorities within each class.
A real-time system may also allow the user to specify such characteristics as the use of paging or process
swapping, what processes must always be resident in main memory, what disk transfer algorithms are to be
used, what rights the processes in various priority bands have, and so on.
Reliability:
 More important for real-time systems than non-real time systems
 Real-time systems respond to and control events in real time so loss or degradation of performance
may have catastrophic consequences such as:
 financial loss
 major equipment damage
 loss of life
Fail-Soft Operation:
 A characteristic that refers to the ability of a system to fail in such a way as to preserve as much
capability and data as possible
 Important aspect is stability
 a real-time system is stable if the system will meet the deadlines of its most critical,
highest-priority tasks even if some less critical task deadlines are not always met

Real Time Scheduling:


Figure 10.4 illustrates a spectrum of possibilities. In a preemptive scheduler that uses simple round-robin
scheduling, a real-time task would be added to the ready queue to await its next time slice, as illustrated in
Figure 10.4a . In this case, the scheduling time will generally be unacceptable for real-time applications.
Alternatively, in a nonpreemptive scheduler, we could use a priority scheduling mechanism, giving real-time
tasks higher priority. In this case, a real-time task that is ready would be scheduled as soon as the current
process blocks or runs to completion ( Figure 10.4b ). This could lead to a delay of several seconds if a slow,
low-priority task were executing at a critical time. Again, this approach is not acceptable.
A more promising approach is to combine priorities with clock-based interrupts. Preemption points
occur at regular intervals. When a preemption point occurs, the currently running task is preempted if a
higher-priority task is waiting. This would include the preemption of tasks that are part of the operating
system kernel. Such a delay may be on the order of several milliseconds ( Figure 10.4c ). While this last
approach may be adequate for some real-time applications, it will not suffice for more demanding
applications. In those cases, the approach that has been taken is sometimes referred to as immediate
preemption. In this case, the operating system responds to an interrupt almost immediately, unless the
system is in a critical-code lockout section. Scheduling delays for a real-time task can then be reduced to 100
μs or less.
• Scheduling approaches depend on:
• whether a system performs schedulability analysis
• if it does, whether it is done statically or dynamically
• whether the result of the analysis itself produces a scheduler plan according to which tasks are
dispatched at run time
Classes of Real-Time Scheduling Algorithms:
• Static table-driven approaches
• performs a static analysis of feasible schedules of dispatching
• result is a schedule that determines, at run time, when a task must begin execution
• Static priority-driven preemptive approaches
• a static analysis is performed but no schedule is drawn up
• analysis is used to assign priorities to tasks so that a traditional priority-driven preemptive
scheduler can be used
• Dynamic planning-based approaches
• feasibility is determined at run time rather than offline prior to the start of execution
• one result of the analysis is a schedule or plan that is used to decide when to dispatch this task
• Dynamic best effort approaches
• no feasibility analysis is performed
• system tries to meet all deadlines and aborts any started process whose deadline is missed

Deadline Scheduling
 Real-time operating systems are designed with the objective of starting real-time tasks as rapidly as
possible and emphasize rapid interrupt handling and task dispatching
 Real-time applications are generally not concerned with sheer speed but rather with completing (or
starting) tasks at the most valuable times
 Priorities provide a crude tool and do not capture the requirement of completion (or initiation) at the
most valuable time
Information Used for Deadline Scheduling
• Ready time: time at which the task becomes ready for execution
• Starting deadline: time by which the task must begin
• Completion deadline: time by which the task must be completed
• Processing time: time required to execute the task to completion
• Resource requirements: resources required by the task while it is executing
• Priority: measures the relative importance of the task
• Subtask structure: a task may be decomposed into a mandatory subtask and an optional subtask
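These items can be gathered into a single per-task record, as in the sketch below (the field names are illustrative, not a defined API):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class RealTimeTask:
    name: str
    ready_time: float                        # time at which the task becomes ready
    starting_deadline: Optional[float]       # time by which the task must begin
    completion_deadline: Optional[float]     # time by which the task must be completed
    processing_time: float                   # execution time required to completion
    resources: Tuple[str, ...] = ()          # resources required while executing
    priority: int = 0                        # relative importance
    mandatory_subtask: Optional[str] = None  # optional decomposition into a mandatory
    optional_subtask: Optional[str] = None   # part and an optional part
```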

Table 10.2: Execution Profile of Two Periodic Tasks


Figure 10.5 compares three scheduling techniques using the execution profile of Table 10.2 . The first row of
Figure 10.5 repeats the information in Table 10.2 ; the remaining three rows illustrate three scheduling
techniques. The computer is capable of making a scheduling decision every 10 ms. Suppose that, under
these circumstances, we attempted to use a priority scheduling scheme. The first two timing diagrams in
Figure 10.5 show the result. If A has higher priority, the first instance of task B is only given 20 ms of
processing time, in two 10-ms chunks, by the time its deadline is reached, and thus fails. If B is given higher
priority, then A will miss its first deadline.

The final timing diagram shows the use of earliest-deadline scheduling. At time t = 0 , both A1 and B1 arrive.
Because A1 has the earliest deadline, it is scheduled first. When A1 completes, B1 is given the processor.
At t = 20 , A2 arrives. Because A2 has an earlier deadline than B1, B1 is interrupted so that A2 can execute to
completion. Then B1 is resumed at t = 30 . At t = 40 , A3 arrives. However, B1 has an earlier ending deadline
and is allowed to execute to completion at t = 45 . A3 is then given the processor and finishes at t = 55 .
In this example, by scheduling to give priority at any preemption point to the task with the nearest deadline,
all system requirements can be met. Because the tasks are periodic and predictable, a static table-driven
scheduling approach is used.
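The earliest-deadline behaviour described above can be reproduced with a small simulation. The sketch below assumes the execution profile implied by the narrative (task A: period 20 ms, execution time 10 ms; task B: period 50 ms, execution time 25 ms, consistent with the timings given): at every millisecond it runs the ready task instance with the nearest deadline.

```python
def edf_schedule(tasks, horizon):
    """tasks: name -> (period, execution time). Returns (time, running) pairs."""
    remaining, deadline, timeline = {}, {}, []
    for t in range(horizon):
        # release a new instance of each periodic task at the start of its period
        for name, (period, exec_time) in tasks.items():
            if t % period == 0:
                inst = f"{name}{t // period + 1}"
                remaining[inst] = exec_time
                deadline[inst] = t + period       # deadline = end of the period
        ready = [i for i, r in remaining.items() if r > 0]
        if ready:
            running = min(ready, key=lambda i: deadline[i])   # earliest deadline wins
            remaining[running] -= 1
        else:
            running = "idle"
        timeline.append((t, running))
    return timeline

tasks = {"A": (20, 10), "B": (50, 25)}
for t, running in edf_schedule(tasks, 60):
    if t % 5 == 0:
        print(t, running)   # A1, A1, B1, B1, A2, A2, B1, B1, B1, A3, A3, B2
```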
Now consider a scheme for dealing with aperiodic tasks with starting deadlines. The top part of Figure 10.6
shows the arrival times and starting deadlines for an example consisting of five tasks each of which has an
execution time of 20 ms.

Table 10.3: Execution Profile of Five Aperiodic Tasks

Table 10.3 summarizes the execution profile of the five tasks.


A straightforward scheme is to always schedule the ready task with the earliest deadline and let that task run
to completion. When this approach is used in the example of Figure 10.6 , note that although task B requires
immediate service, the service is denied. This is the risk in dealing with aperiodic tasks, especially with starting
deadlines. A refinement of the policy will improve performance if deadlines can be known in advance of the
time that a task is ready.
This policy, referred to as earliest deadline with unforced idle times, operates as follows: Always
schedule the eligible task with the earliest deadline and let that task run to completion. An eligible task may
not be ready, and this may result in the processor remaining idle even though there are ready tasks. Note that
in our example the system refrains from scheduling task A even though that is the only ready task. The result
is that, even though the processor is not used to maximum efficiency, all scheduling requirements are met.
Finally, for comparison, the FCFS policy is shown. In this case, tasks B and E do not meet their deadlines.
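A minimal sketch of the earliest-deadline-with-unforced-idle-times rule is shown below; the driver values are illustrative placeholders, not the Table 10.3 profile. The selection function deliberately reports "stay idle" when the task with the earliest starting deadline has not yet arrived:

```python
def next_to_run(now, tasks, completed):
    """Pick the not-yet-run task with the earliest starting deadline, even if it
    has not arrived; if it is not yet ready, leave the processor idle."""
    pending = [t for t in tasks if t["name"] not in completed]
    if not pending:
        return None
    earliest = min(pending, key=lambda t: t["deadline"])
    return earliest["name"] if earliest["arrival"] <= now else None

# Illustrative values: at time 10 task X is ready, but task Y has the earlier
# starting deadline and arrives at 20, so the processor is left idle until then.
tasks = [{"name": "X", "arrival": 10, "deadline": 110},
         {"name": "Y", "arrival": 20, "deadline": 20}]
print(next_to_run(10, tasks, completed=set()))   # None -> remain idle
print(next_to_run(20, tasks, completed=set()))   # 'Y'
```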
Rate Monotonic Scheduling

One of the more promising methods of resolving multitask scheduling conflicts for periodic tasks is rate
monotonic scheduling (RMS). The scheme was first proposed in [LIU73] but has only recently gained
popularity [BRIA99, SHA94].
RMS assigns priorities to tasks on the basis of their periods. For RMS, the highest-priority task is the one with
the shortest period, the second highest-priority task is the one with the second shortest period, and so on.
When more than one task is available for execution, the one with the shortest period is serviced first. If we
plot the priority of tasks as a function of their rate, the result is a monotonically increasing function
( Figure 10.7 ); hence the name, rate monotonic scheduling.
Figure 10.8 illustrates the relevant parameters for periodic tasks. The task’s period, T , is the amount of time
between the arrival of one instance of the task and the arrival of the next instance of the task. A task’s rate (in
hertz) is simply the inverse of its period (in seconds). For example, a task with a period of 50 ms occurs at a
rate of 20 Hz.
Typically, the end of a task’s period is also the task’s hard deadline, although some tasks may have earlier
deadlines. The execution (or computation) time, C , is the amount of processing time required for each
occurrence of the task. It should be clear that in a uniprocessor system, the execution time must be no greater
than the period (we must have C ≤ T). If a periodic task is always run to completion, that is, if no instance of the
task is ever denied service because of insufficient resources, then the utilization of the processor by this task
is U = C/T . For example, if a task has a period of 80 ms and an execution time of 55 ms, its processor
utilization is 55/80 = 0.6875 .

Table 10.4: Value of the RMS Upper Bound


One measure of the effectiveness of a periodic scheduling algorithm is whether or not it guarantees that all
hard deadlines are met. Suppose that we have n tasks, each with a fixed period and execution time. Then for
it to be possible to meet all deadlines, the following inequality must hold:

C1/T1 + C2/T2 + ... + Cn/Tn ≤ 1     (10.1)

The sum of the processor utilizations of the individual tasks cannot exceed a value of 1, which corresponds to total utilization of the processor.
Equation (10.1) provides a bound on the number of tasks that a perfect scheduling algorithm can successfully schedule. For any particular algorithm, the bound may be lower. For RMS, it can be shown that the following inequality holds:

C1/T1 + C2/T2 + ... + Cn/Tn ≤ n(2^(1/n) − 1)     (10.2)
Table 10.4 gives some values for this upper bound. As the number of tasks increases, the scheduling bound converges to ln 2 ≈ 0.693.
As an example, consider the case of three periodic tasks, where Ui = Ci/Ti:
• Task P1: C1 = 20; T1 = 100; U1 = 0.2
• Task P2: C2 = 40; T2 = 150; U2 = 0.267
• Task P3: C3 = 100; T3 = 350; U3 = 0.286
The total utilization of these three tasks is 0.753, which is below the RMS upper bound for three tasks, 3(2^(1/3) − 1) ≈ 0.780, so RMS can successfully schedule them.
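As a quick check, the sketch below recomputes the total utilization of these three tasks and compares it with the RMS bound n(2^(1/n) − 1) of Equation (10.2):

```python
def rms_bound(n):
    """RMS schedulability upper bound for n tasks: n(2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)

tasks = [(20, 100), (40, 150), (100, 350)]        # (C, T) for P1, P2, P3
utilization = sum(c / t for c, t in tasks)
n = len(tasks)
print(f"total utilization = {utilization:.3f}")   # 0.752 (0.753 with the rounded values)
print(f"RMS bound for n={n}: {rms_bound(n):.3f}") # 0.780
print("schedulable under RMS" if utilization <= rms_bound(n) else "no RMS guarantee")
```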

Priority Inversion
 Can occur in any priority-based preemptive scheduling scheme
 Particularly relevant in the context of real-time scheduling
 Best-known instance involved the Mars Pathfinder mission
 Occurs when circumstances within the system force a higher priority task to wait for a lower priority
task
Unbounded Priority Inversion

 the duration of a priority inversion depends not only on the time required to handle a shared resource,
but also on the unpredictable actions of other unrelated tasks

Our discussion follows that of [TIME02]. The Pathfinder software included the following three tasks, in
decreasing order of priority:
T1: Periodically checks the health of the spacecraft systems and software
T2: Processes image data
T3: Performs an occasional test on equipment status
After T1 executes, it reinitializes a timer to its maximum value. If this timer ever expires, it is assumed that the integrity of the lander software has somehow been compromised. The processor is halted, all devices are reset, the software is completely reloaded, the spacecraft systems are tested, and the system starts over. This recovery sequence does not complete until the next day. T1 and T3 share a common data structure, protected by a binary semaphore s. Figure 10.9a shows the sequence that caused the priority inversion:
t1: T3 begins executing.
t2: T3 locks semaphore s and enters its critical section.
t3: T1, which has a higher priority than T3, preempts T3 and begins executing.
t4: T1 attempts to enter its critical section but is blocked because the semaphore is locked by T3; T3 resumes execution in its critical section.
t5: T2, which has a higher priority than T3, preempts T3 and begins executing.
t6: T2 is suspended for some reason unrelated to T1 and T3; T3 resumes.
t7: T3 leaves its critical section and unlocks the semaphore. T1 preempts T3, locks the semaphore and enters its critical section.
In this set of circumstances, T1 must wait for both T3 and T2 to complete and fails to reset the timer before it expires.
In practical systems, two alternative approaches are used to avoid unbounded priority inversion: priority
inheritance protocol and priority ceiling protocol.
Priority Inheritance
The basic idea of priority inheritance is that a lower-priority task inherits the priority of any higher-priority
task pending on a resource they share. This priority change takes place as soon as the higher-priority task
blocks on the resource; it should end when the resource is released by the lower-priority task.
Figure 10.9b shows that priority inheritance resolves the problem of unbounded priority inversion illustrated
in Figure 10.9a . The relevant sequence of events is as follows:
t1: T3 begins executing.
t2: T3 locks semaphore s and enters its critical section.
t3: T1, which has a higher priority than T3, preempts T3 and begins executing.
t4: T1 attempts to enter its critical section but is blocked because the semaphore is locked by T3. T3 is immediately and temporarily assigned the same priority as T1. T3 resumes execution in its critical section.
t5: T2 is ready to execute but, because T3 now has a higher priority, T2 is unable to preempt T3.
t6: T3 leaves its critical section and unlocks the semaphore: its priority level is downgraded to its previous default level. T1 preempts T3, locks the semaphore, and enters its critical section.
t7: T1 is suspended for some reason unrelated to T2, and T2 begins executing.
Priority inheritance was the approach taken to solve the Pathfinder problem.
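The mechanism can be sketched as a lock that boosts its holder's priority to that of the highest-priority waiter and restores the base priority on release. The class and method names below are illustrative, not an actual RTOS API:

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority
        self.priority = priority            # current (possibly inherited) priority

class InheritanceLock:
    def __init__(self):
        self.holder = None

    def acquire(self, task):
        if self.holder is None:
            self.holder = task
            return True                     # lock taken, critical section entered
        # blocked: the current holder inherits the waiter's priority if higher
        if task.priority > self.holder.priority:
            self.holder.priority = task.priority
        return False

    def release(self, task):
        task.priority = task.base_priority  # drop back to the default level
        self.holder = None

t1, t3 = Task("T1", priority=3), Task("T3", priority=1)
s = InheritanceLock()
s.acquire(t3)        # T3 enters its critical section
s.acquire(t1)        # T1 blocks; T3 now runs at priority 3, so T2 cannot preempt it
s.release(t3)        # T3 leaves; its priority returns to 1 and T1 can proceed
```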
In the priority ceiling approach, a priority is associated with each resource.
The priority assigned to a resource is one level higher than the priority of its highest priority user. The
scheduler then dynamically assigns this priority to any task that accesses the resource. Once the task finishes
with the resource, its priority returns to normal.
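A corresponding sketch for the priority ceiling idea (again with illustrative names): the resource is created with a ceiling one level above the priority of its highest-priority user, and any task that acquires it runs at that ceiling until release.

```python
class CeilingTask:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority
        self.priority = priority

class CeilingLock:
    def __init__(self, highest_user_priority):
        self.ceiling = highest_user_priority + 1   # one level above the highest user
        self.holder = None

    def acquire(self, task):
        self.holder = task
        task.priority = self.ceiling               # boosted as soon as the resource is taken

    def release(self, task):
        task.priority = task.base_priority         # back to its normal level
        self.holder = None

t3 = CeilingTask("T3", priority=1)
s = CeilingLock(highest_user_priority=3)           # suppose T1 (priority 3) also uses s
s.acquire(t3)
print(t3.priority)                                 # 4: no other user of s can preempt T3 here
s.release(t3)
print(t3.priority)                                 # 1
```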
