OS Chapter 4: Threads
4.1 Overview
● Thread Definition: A thread is a unit of CPU utilization consisting of a thread ID, program counter (PC),
register set, and a stack. It shares resources like code, data, and open files with other threads in the same
process.
● Single-threaded vs. Multithreaded Process: A single-threaded process has one thread, while a
multithreaded process has multiple threads, enabling it to perform more tasks simultaneously.
4.1.1 Motivation
● Multithreaded Applications: Most modern applications use multiple threads to perform tasks like:
○ A web browser running tasks like displaying images and fetching data in parallel.
○ A word processor using threads for UI, spelling checks, and background operations.
● Multicore Systems: Multithreading helps leverage multicore systems by splitting tasks across multiple
CPUs.
● Web Servers: Instead of creating new processes for each request, a multithreaded server creates threads
to handle multiple requests simultaneously.
4.1.2 Benefits
1. Responsiveness: Multithreading allows applications to remain responsive, even when performing lengthy
operations in the background (e.g., responsive UIs).
2. Resource Sharing: Threads within a process share the same memory and resources, making
communication easier and more efficient.
3. Economy: Threads are cheaper to create and switch between than processes, saving memory and time.
4. Scalability: Multithreading allows applications to run parallel tasks on multiple processors, improving
performance on multicore systems.
4.2 Multicore Programming
● Multicore Systems: These systems have multiple processing cores on a single chip. Each core appears as
a separate CPU to the operating system, making them ideal for multithreaded programming, which
maximizes the use of these cores.
○ Concurrency: Multiple tasks make progress over time but not simultaneously (e.g., on a
single-core system).
○ Parallelism: Tasks are performed simultaneously on multiple cores (e.g., multicore system).
Programming Challenges: Designing for multicore systems raises five challenges:
1. Identifying Tasks: Find independent tasks in the program that can be divided and executed in parallel.
2. Balance: Ensure tasks perform equal work to avoid inefficient use of resources.
3. Data Splitting: Data must also be divided across cores to ensure efficient parallel processing.
4. Data Dependency: Tasks that depend on shared data must be synchronized to avoid conflicts.
5. Testing and Debugging: Parallel programs are harder to test due to numerous possible execution paths.
These challenges make it necessary for software development to adapt to parallel programming and multicore
architectures.
Types of Parallelism:
1. Data Parallelism: Involves distributing subsets of the same data across multiple cores, with each core
performing the same operation on its own subset.
2. Task Parallelism: Involves dividing tasks (not data) across multiple cores. Each thread performs a unique
operation, and the tasks may operate on the same or different data.
While data and task parallelism are different, they can be combined to create a hybrid approach.
Summary
Multicore programming takes advantage of multiple processing cores to perform tasks concurrently or in parallel. The
shift from single-core to multicore systems presents programming challenges such as identifying tasks, balancing
workload, and managing data dependencies. Two main types of parallelism—data and task parallelism—help
distribute workload efficiently across cores. Despite the benefits, programming for multicore systems remains
complex, requiring new approaches to software design and debugging.
4.3 Multithreading Models
Threads can be supported in two main ways: user threads (managed at the user level without kernel support) and
kernel threads (managed directly by the operating system). The relationship between user threads and kernel
threads can follow different models: many-to-one, one-to-one, and many-to-many.
● Many-to-One Model: Maps many user threads to a single kernel thread.
● Advantages:
○ Thread management is handled by the thread library in user space, so it is efficient.
● Disadvantages:
○ If one thread blocks (e.g., making a system call), the entire process blocks.
○ Cannot utilize multiple cores in a multicore system, as only one thread can access the kernel at a
time.
● One-to-One Model: Maps each user thread to a separate kernel thread.
● Advantages:
○ Better concurrency as other threads can continue when one makes a blocking system call.
● Disadvantages:
○ Too many threads may degrade system performance due to overhead in kernel thread
management.
● Many-to-Many Model: Multiplexes many user threads onto a smaller or equal number of kernel threads.
● Advantages:
○ Can create many user threads without the limitations of the many-to-one model.
○ Supports parallelism on multiple cores and can schedule different kernel threads for execution
when one thread makes a blocking system call.
● Disadvantages:
○ Difficult to implement in practice.
● Variation: The two-level model is a variation where user threads can be bound to specific kernel threads,
enhancing flexibility.
Summary
● The many-to-one model is rarely used today due to its limitations with blocking and multicore support.
● The one-to-one model provides better concurrency and parallelism but can be limited by the overhead of
creating too many kernel threads.
● The many-to-many model is the most flexible, allowing many user threads to be managed by a smaller
number of kernel threads and running in parallel on multiple cores, though it is complex to implement.
● Most modern systems prefer the one-to-one model for simplicity and performance, though some
concurrency libraries still utilize the many-to-many model for specialized tasks.
4.4 Thread Libraries
Thread libraries provide an API for creating and managing threads. There are two primary ways to implement these
libraries:
1. User-level thread libraries: Managed entirely in user space, with no kernel support.
2. Kernel-level thread libraries: Supported and managed by the operating system, typically requiring system
calls.
● POSIX Pthreads: The thread library of the POSIX standard, which may be provided as either a user-level or
a kernel-level library.
● Windows Threads: The kernel-level thread library of the Windows operating system.
● Java Threads: The Java thread API, which relies on the underlying OS thread library.
Threads in all these libraries can share data and resources, but Java requires explicit management of shared data.
4.4.1 Pthreads
Pthreads is a POSIX standard that provides a set of functions for thread creation and synchronization. Pthreads can
be implemented at either the user level or kernel level.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
int sum; /* shared data between threads */
● Attributes: Threads can be configured using attributes like stack size and scheduling policy.
4.4.2 Windows Threads
The Windows thread API is a kernel-level library; its examples begin with the Windows headers:
#include <windows.h>
#include <stdio.h>
4.4.3 Java Threads
● Thread Creation: Create a new thread with new Thread(new Task()), and call start() to execute.
4.4.3.1 Java Executor Framework
import java.util.concurrent.*;
● ExecutorService: Manages thread pool and tasks, using submit() for task execution.
● Callable and Future: Callable allows tasks to return results, and Future.get() retrieves the result once
it's available.
Summary
Thread libraries provide essential tools for managing and synchronizing threads. Pthreads and Windows Threads
allow direct control over thread creation and management, while Java simplifies this with its built-in thread model and
the Executor Framework. Java's Callable and Future classes extend the flexibility of thread management,
especially for concurrent tasks that return results. These thread libraries provide the foundation for creating
multithreaded programs across different operating systems and environments.
4.5 Implicit Threading
Implicit threading simplifies the creation and management of threads by transferring the responsibility to compilers or
runtime libraries. Instead of directly managing threads, developers focus on identifying parallel tasks—functions that
can run concurrently. The library or framework then manages thread creation and execution, often using a
many-to-many model. This approach alleviates the complexity of managing hundreds or thousands of threads,
particularly in multicore systems.
4.5.1 Thread Pools
A thread pool creates a number of threads at start-up and places them in a pool, where they wait for work. Benefits:
1. Faster Execution: Reusing existing threads is faster than creating new ones.
2. Thread Limitation: Limits the number of concurrent threads, preventing resource exhaustion.
3. Task Scheduling: Tasks can be scheduled for delayed or periodic execution.
● Windows: QueueUserWorkItem() adds a task to the thread pool; the function is then executed by a thread
from the pool.
Example (Java):
ExecutorService pool = Executors.newCachedThreadPool(); // pool grows on demand
pool.execute(new Task());  // Task implements Runnable
pool.shutdown();           // stop accepting tasks; workers finish and exit
4.5.2 Fork-Join
The fork-join model involves breaking tasks into smaller subtasks (forking), executing them in parallel, and then
combining the results (joining). This model is synchronous, where the parent thread waits for all children to complete
before continuing.
Fork-Join in Java:
● Introduced in Java 1.7, the ForkJoinPool handles divide-and-conquer tasks, such as sorting algorithms.
● Tasks are split recursively, and when the problem size is small enough, the task is solved directly.
● Work stealing: Idle threads can steal tasks from other threads to balance the workload.
4.5.3 OpenMP
OpenMP is a set of compiler directives for parallel programming in C, C++, and Fortran, primarily for shared-memory
systems. Developers insert directives into the code to specify parallel regions, and OpenMP takes care of creating
and managing threads.
● Parallelizing loops: OpenMP can parallelize for loops automatically, dividing iterations among available
threads.
4.5.4 Grand Central Dispatch (GCD)
Grand Central Dispatch (GCD) is Apple's technology for macOS and iOS: developers submit tasks to dispatch
queues, and GCD manages the threads that run them.
GCD Features:
1. Serial Queues: Execute tasks one at a time, ensuring sequential order.
2. Concurrent Queues: Execute multiple tasks in parallel.
3. Quality of Service (QoS): Prioritizes tasks into categories like user-interactive, utility, and background
tasks.
4.5.5 Intel Threading Building Blocks (TBB)
Intel TBB is a template library for designing parallel C++ programs. For a parallel loop, TBB divides the iteration
space into chunks and assigns them to threads, similar to the fork-join model. The library manages tasks and
ensures optimal performance.
Summary
Implicit threading frameworks like thread pools, fork-join, OpenMP, GCD, and Intel TBB abstract away thread
creation and management, allowing developers to focus on identifying parallel tasks. These models help optimize
task execution, reduce thread overhead, and balance workload across multiple cores. By using implicit threading,
developers can create efficient parallel applications without dealing directly with the complexities of thread
management.
4.6 Threading Issues
4.6.1 The fork() and exec() System Calls
If one thread in a multithreaded program calls fork(), should the new process duplicate all threads or only the
calling thread? Some UNIX systems offer two versions of fork():
1. Duplicate all threads: Some systems provide a version of fork() that duplicates all threads.
2. Duplicate only the calling thread: This is a version where only the thread calling fork() is
duplicated.
● exec(): If exec() is called after fork(), the new process replaces the entire process, including all
threads.
● If exec() is not called, it is better to duplicate all threads to maintain the multithreaded state.
4.6.2 Signal Handling
In a multithreaded program, a signal may be delivered in several ways:
1. Deliver to the specific thread causing the issue (for synchronous signals).
2. Deliver the signal to every thread in the process.
3. Deliver the signal to certain threads in the process.
4. Assign a designated thread to receive all signals for the process.
● Windows: Emulates signals using Asynchronous Procedure Calls (APCs), which are delivered to specific
threads.
4.6.3 Thread Cancellation
Thread cancellation means terminating a thread (the target thread) before it has completed. It can occur in two
ways:
○ Asynchronous Cancellation: The target thread is terminated immediately.
○ Deferred Cancellation: The thread checks periodically for a cancellation request and terminates at
a safe point.
● Deferred Cancellation: Preferred as it ensures threads can terminate safely at appropriate points.
● Java Example: Uses Thread.interrupt() to set an interrupt flag for the target thread, which can check
using isInterrupted().
4.6.4 Thread-Local Storage (TLS)
● TLS: Allows each thread to have its own copy of certain data, preventing shared access issues.
○ Use Case: For example, each transaction in a system may require its own transaction ID stored in
TLS.
4.6.5 Scheduler Activations
Scheduler activations are a scheme for communication between the kernel and the user-level thread library:
○ Lightweight Process (LWP): A virtual processor for user threads, which is mapped to a kernel
thread.
○ Upcalls: Used to inform the user-level thread library when a thread is about to block, allowing it to
manage available threads and efficiently schedule tasks.
Summary
● fork() and exec() in multithreaded programs behave differently, with the need to duplicate either all
threads or only the calling thread, depending on the situation.
● Signal handling in multithreaded programs requires careful delivery decisions, as signals can be sent to
specific threads, all threads, or one designated thread.
● Thread cancellation can be done asynchronously (immediate termination) or deferred (safe, periodic
checks).
● Thread-local storage (TLS) ensures that each thread has its own copy of certain data to avoid conflicts,
particularly in systems like Java, Pthreads, and C#.
● Scheduler activations provide coordination between the kernel and user-level threads to ensure efficient
use of resources and smooth execution, especially in complex threading models.
4.7 Operating-System Examples
4.7.1 Windows Threads
Windows uses a one-to-one mapping: each user-level thread maps to a kernel thread. The components of a thread
include:
● Thread ID and Register Set: Uniquely identify the thread and represent the state of the processor.
● User and Kernel Stacks: Used when the thread is running in user mode (user stack) or kernel mode (kernel
stack).
● Private Storage Area: Used by runtime libraries and dynamic link libraries (DLLs).
1. ETHREAD (Executive Thread Block): Holds pointers to the process and the starting address of the thread's
routine. It also contains a pointer to the KTHREAD.
2. KTHREAD (Kernel Thread Block): Contains kernel-specific scheduling and synchronization information,
including the kernel stack.
3. TEB (Thread Environment Block): A user-space data structure holding the thread identifier, user stack, and
thread-local storage when the thread is in user mode.
Thread Structure:
● The ETHREAD and KTHREAD exist in kernel space, and only the kernel can access them; the TEB exists
in user space and manages the thread's execution environment when the thread runs in user mode.
4.7.2 Linux Threads
Linux does not distinguish between processes and threads; both are called tasks, created with fork() or clone().
When clone() is called with certain flags, the parent and child tasks share various resources, making it equivalent to
thread creation. If no flags are set, the child task is like a new process created by fork().
● Each task in Linux has a unique kernel data structure: struct task_struct. This structure holds pointers to
other data structures (e.g., memory, open files, signals).
● When fork() is used, a new task is created along with copies of the parent process's data structures.
However, clone() allows selective sharing of data structures, depending on the flags passed.
● The clone() system call can be extended to create Linux containers, which isolate tasks (Linux systems)
under a single kernel, providing virtualization. Containers are discussed further in Chapter 18.
Summary
● Windows uses a one-to-one thread mapping where each user-level thread maps directly to a kernel
thread, with specific thread data structures like ETHREAD, KTHREAD, and TEB for managing execution in
user and kernel space.
● Linux treats processes and threads as tasks, using clone() to create tasks with varying degrees of
resource sharing between parent and child tasks. The task_struct structure is used to manage these tasks,
and clone() can also be used to create containers for virtualization.
4.8 Summary
● Thread Definition: A thread is a basic unit of CPU utilization. Threads within the same process share
resources like code and data.
● Benefits of Multithreading:
○ Responsiveness: Allows continued operation even if part of the program is blocked or performing
lengthy operations.
○ Resource Sharing: Threads share resources, making it easier to communicate and share data.
○ Economy: Threads are more lightweight compared to processes, reducing the overhead of
creation and switching.
○ Scalability: Multithreaded applications can leverage multiple processors for parallel execution.
● Concurrency vs. Parallelism:
○ Concurrency: Multiple threads making progress over time (possible on single CPU systems).
○ Parallelism: Multiple threads executing simultaneously, which requires multiple processing cores.
● Challenges in Multithreading: Identifying tasks, balancing workload, splitting data, handling data
dependencies, and testing/debugging parallel programs.
● Parallelism Types:
○ Data Parallelism: Distributes data subsets across cores, performing the same operation on each
core.
○ Task Parallelism: Distributes tasks (threads) across cores, with each thread performing a unique
operation.
● Thread Models:
○ Many-to-One: Many user threads mapped to a single kernel thread.
○ One-to-One: Each user thread mapped to a separate kernel thread.
○ Many-to-Many: Multiple user threads mapped to a smaller or equal number of kernel threads.
● Thread Libraries:
○ Windows, Pthreads (for POSIX systems like UNIX, Linux, macOS), and Java provide APIs for
thread creation and management.
● Implicit Threading: Developers identify tasks instead of threads, and the system (or libraries) handles
thread creation and management. Examples include thread pools, fork-join, and Grand Central Dispatch.
● Thread Cancellation: Terminating a thread before it completes, either asynchronously (immediately) or
deferred (the thread checks for cancellation requests at safe points).
● Linux Tasks:
○ Unlike other OSs, Linux doesn't distinguish between processes and threads, using the term task.
The clone() system call can create tasks that behave either like processes or threads based on
resource sharing settings.
This chapter highlights how multithreading provides performance improvements through parallel execution, how
various models and libraries manage thread behavior, and the issues that arise in creating multithreaded applications.