Quick Sort

Quick sort using Windows threads

unsigned __stdcall QuickSort(void *arg)
{
    qSortIndex *m = (qSortIndex *)arg;  // _beginthreadex passes the range as void*
    int p = m->lo;
    int r = m->hi;
    if (p < r)
    {
        qSortIndex s, t;
        HANDLE tH[2];
        int q = Partition(p, r);
        // Sort the left partition on a new thread
        s.lo = p; s.hi = q - 1;
        tH[0] = (HANDLE)_beginthreadex(NULL, 0, QuickSort, &s, 0, NULL);
        // Sort the right partition on a new thread
        t.lo = q + 1; t.hi = r;
        tH[1] = (HANDLE)_beginthreadex(NULL, 0, QuickSort, &t, 0, NULL);
        // Wait for both child threads, then release their handles
        WaitForMultipleObjects(2, tH, TRUE, INFINITE);
        CloseHandle(tH[0]);
        CloseHandle(tH[1]);
    }
    return 0;
}
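
For completeness, a hypothetical driver for this version might look like the sketch below. It is not part of the original listing: the global array A, its length N, the qSortIndex struct, and Partition are assumed to be defined as in the OpenMP program later in this document.

#include <windows.h>
#include <process.h>
#include <stdio.h>

// Hypothetical driver (a sketch): start the root QuickSort on its own
// thread and wait for the entire, recursively threaded, sort to finish.
int main(void)
{
    qSortIndex root = {0, N - 1};  // whole array A[0..N-1], assumed global
    HANDLE h = (HANDLE)_beginthreadex(NULL, 0, QuickSort, &root, 0, NULL);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);

    for (int i = 0; i < N; i++)
        printf("%d ", A[i]);
    printf("\n");
    return 0;
}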

To convert the provided QuickSort function (which uses Windows threading via
_beginthreadex) to OpenMP, we need to focus on parallelizing the recursive calls to
QuickSort while respecting the structure of the algorithm.

Here's a step-by-step explanation of how to convert the code:

Key Changes:

1. Thread Management:
   o The original program uses _beginthreadex and WaitForMultipleObjects to create and synchronize threads. OpenMP abstracts away much of this complexity with #pragma omp parallel and #pragma omp task.
2. Recursive Parallelism:
   o In OpenMP, recursive functions are parallelized with tasks: the #pragma omp task directive creates a task (which may execute in parallel) for each recursive call.
3. Synchronization:
   o OpenMP's tasking model replaces explicit thread synchronization such as WaitForMultipleObjects with a single #pragma omp taskwait, which blocks the current task until its child tasks have finished.
OpenMP program
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
int lo, hi;
} qSortIndex;

// Global array to be sorted, shared by Partition, QuickSort, and main
int A[] = {9, 7, 5, 11, 12, 2, 14, 3, 10, 6};
int N = sizeof(A) / sizeof(A[0]);

// Partition function (same as in the original code)
int Partition(int lo, int hi) {
    int pivot = A[hi]; // the last element of the range is the pivot
    int i = lo - 1;
    for (int j = lo; j < hi; j++) {
        if (A[j] < pivot) {
            i++;
            // Swap A[i] and A[j]
            int temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        }
    }
    // Swap A[i+1] and A[hi] to move the pivot into place
    int temp = A[i + 1];
    A[i + 1] = A[hi];
    A[hi] = temp;
    return i + 1;
}

// OpenMP QuickSort function
void QuickSort(qSortIndex *m) {
    int p = m->lo;
    int r = m->hi;

    if (p < r) {
        int q = Partition(p, r);

        qSortIndex s1 = {p, q - 1};
        qSortIndex s2 = {q + 1, r};

        // Create tasks for the two recursive calls; only spawn a task
        // when the range is large enough to be worth the overhead
        #pragma omp task shared(s1) if (r - p > 100)
        QuickSort(&s1);

        #pragma omp task shared(s2) if (r - p > 100)
        QuickSort(&s2);

        // Wait for both tasks to finish (synchronization)
        #pragma omp taskwait
    }
}

int main() {
    // The global array A (N elements) is sorted in place

    // Set the OpenMP thread count (optional; the runtime picks a default)
    omp_set_num_threads(4);

    // Sort the whole array
    qSortIndex index = {0, N - 1};

    // Open a parallel region; a single thread starts the recursion and
    // the rest of the team executes the tasks it spawns
    #pragma omp parallel
    {
        #pragma omp single
        QuickSort(&index);
    }

    // Print the sorted array
    for (int i = 0; i < N; i++) {
        printf("%d ", A[i]);
    }
    printf("\n");
    return 0;
}

Explanation of Key Changes:

1. #pragma omp parallel:
   o The #pragma omp parallel directive opens a parallel region. QuickSort is called within that region, and only a single thread begins the recursion while the other threads in the team execute the tasks it creates.
2. Task Parallelism with #pragma omp task:
   o The recursive calls to QuickSort are wrapped in #pragma omp task directives, which allow the two sub-problems (left and right partitions) to execute in parallel. The clause if (r - p > 100) avoids creating tasks for small sub-arrays, where the overhead of managing parallel tasks would outweigh the benefit.
3. #pragma omp taskwait:
   o After creating the recursive tasks, #pragma omp taskwait makes the current task wait for the completion of both children (left and right sub-problems) before returning. This synchronizes the parallel tasks and ensures the sorting completes correctly.
4. Shared vs. Private Variables:
   o The qSortIndex structures s1 and s2 are listed in shared(s1) and shared(s2) so that each task reads the parent's copy directly rather than receiving a private snapshot; the taskwait above guarantees the parent's stack frame outlives the tasks that reference it.
5. Setting the Number of Threads:
   o The number of threads used by OpenMP can be set with omp_set_num_threads(), but this is optional; the runtime otherwise chooses a default and manages thread pool creation and task scheduling automatically (see the sketch after this list).
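
For instance, the team size the runtime actually granted can be checked from inside the parallel region. The sketch below assumes the QuickSort and qSortIndex definitions above; SortWithReport is an illustrative helper name. The standard OMP_NUM_THREADS environment variable sets the default, and omp_set_num_threads() in code overrides it.

#include <omp.h>
#include <stdio.h>

// A minimal sketch: report the team size before starting the sort.
// Replaces the parallel region in main above.
void SortWithReport(qSortIndex *index) {
    #pragma omp parallel
    {
        #pragma omp single
        {
            printf("sorting with %d threads\n", omp_get_num_threads());
            QuickSort(index);
        }
    }
}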

Partition Function:

The Partition function in the OpenMP version is the same as in the original code: it partitions the array around a pivot element (the last element of the range) and returns the pivot's final index. A worked trace is shown below.
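
To make the pivot placement concrete, here is a hand-checked trace of one call on the sample data, as a small demo main (a sketch assuming the global A, N, and Partition defined above):

#include <stdio.h>

// One Partition call on A = {9, 7, 5, 11, 12, 2, 14, 3, 10, 6}:
// pivot = A[9] = 6; the elements smaller than 6 (5, 2, 3) are swapped
// to the front as j scans lo..hi-1, then the pivot lands at index 3.
int main(void) {
    int q = Partition(0, N - 1);
    printf("pivot index: %d\n", q);    // prints: pivot index: 3
    for (int i = 0; i < N; i++)
        printf("%d ", A[i]);           // prints: 5 2 3 6 12 7 14 9 10 11
    printf("\n");
    return 0;
}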

Performance Considerations:

- The condition if (r - p > 100) prevents excessive task-creation overhead for small sub-arrays. QuickSort benefits most from parallelism when the sub-arrays are sufficiently large; for very small ranges, task overhead can hurt performance.
- OpenMP works best with larger datasets. For very small arrays, the overhead of managing threads may make the parallel version slower than a sequential QuickSort. (Note that the 10-element demo array never exceeds the threshold, so it is effectively sorted serially.)
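
A common refinement, shown as a sketch below rather than as part of the original code, is to fall back to a simple serial sort below the cutoff instead of merely suppressing task creation. InsertionSort and the CUTOFF constant are illustrative names:

#define CUTOFF 100

// Serial fallback for small ranges: insertion sort on A[lo..hi]
static void InsertionSort(int lo, int hi) {
    for (int j = lo + 1; j <= hi; j++) {
        int key = A[j];
        int i = j - 1;
        while (i >= lo && A[i] > key) {
            A[i + 1] = A[i];
            i--;
        }
        A[i + 1] = key;
    }
}

// Variant of the task-based QuickSort with an explicit serial cutoff.
// p, q, r are firstprivate in the tasks by default, so each task
// captures its own copy of the range bounds.
void QuickSortCutoff(int p, int r) {
    if (r - p <= CUTOFF) {
        InsertionSort(p, r);   // no pivoting or tasking below the cutoff
        return;
    }
    int q = Partition(p, r);
    #pragma omp task
    QuickSortCutoff(p, q - 1);
    #pragma omp task
    QuickSortCutoff(q + 1, r);
    #pragma omp taskwait
}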

Concurrent iterative version

To convert the given Windows-threaded QuickSort program to OpenMP, several aspects must be handled differently, because OpenMP abstracts thread management, synchronization, and communication between threads in a way that differs from Windows threading. Specifically, OpenMP provides constructs for parallel loops (#pragma omp parallel for) and task parallelism (#pragma omp task); we use the tasking constructs to handle the recursive calls and their synchronization.

Key Points:

1. Thread Pooling and Queue Management: In the original code, a thread pool is managed with a semaphore and a queue of pending recursive calls. In OpenMP there is no need to manage a thread pool or signal threads explicitly; the runtime handles thread management and synchronization.
2. Recursive Parallelism: The recursive calls of QuickSort are parallelized using OpenMP's tasking model (#pragma omp task).
3. Shared Variables: OpenMP handles data sharing between tasks, so we only need to choose the proper sharing or privatization for each variable.
4. Termination Conditions: The semaphore and event signaling are replaced by the natural base case of the recursion, which halts when sorting is done.

OpenMP program

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
int lo, hi;
} qSortIndex;

int A[] = {9, 7, 5, 11, 12, 2, 14, 3, 10, 6}; // Example array to sort
int N = sizeof(A) / sizeof(A[0]);

// Partition function for QuickSort (same as the original version)
int Partition(int lo, int hi) {
    int pivot = A[hi];
    int i = lo - 1;
    for (int j = lo; j < hi; j++) {
        if (A[j] < pivot) {
            i++;
            int temp = A[i];
            A[i] = A[j];
            A[j] = temp;
        }
    }
    int temp = A[i + 1];
    A[i + 1] = A[hi];
    A[hi] = temp;
    return i + 1;
}

// OpenMP QuickSort function
void QuickSort(qSortIndex *m) {
    int p = m->lo;
    int r = m->hi;

    if (p < r) {
        int q = Partition(p, r);

        qSortIndex s1 = {p, q - 1};
        qSortIndex s2 = {q + 1, r};

        // Create parallel tasks for the two recursive calls;
        // only spawn tasks for ranges large enough to be worthwhile
        #pragma omp task shared(s1) if (r - p > 100)
        QuickSort(&s1);

        #pragma omp task shared(s2) if (r - p > 100)
        QuickSort(&s2);

        // Ensure that both tasks are finished before continuing
        #pragma omp taskwait
    }
}

int main() {
    // Sort the whole global array in place
    qSortIndex index = {0, N - 1};

    // Set the OpenMP thread count (optional; the runtime picks a default)
    omp_set_num_threads(4);

    // Start QuickSort in parallel: one thread begins the recursion,
    // the rest of the team executes the tasks it spawns
    #pragma omp parallel
    {
        #pragma omp single
        QuickSort(&index);
    }

    // Print the sorted array
    for (int i = 0; i < N; i++) {
        printf("%d ", A[i]);
    }
    printf("\n");

    return 0;
}

Explanation of Key Changes:

1. OpenMP Parallel Region:
   o #pragma omp parallel opens a parallel region so that multiple threads are available to execute the recursive work.
   o #pragma omp single ensures that only one thread makes the first call to QuickSort; the rest of the team picks up the tasks it creates.
2. Task Parallelism:
   o The recursive calls to QuickSort are parallelized using #pragma omp task, which creates two tasks for the left and right sub-arrays. Each task may run in parallel when the range is sufficiently large (r - p > 100); for small ranges, task creation is suppressed to reduce overhead.
3. #pragma omp taskwait:
   o #pragma omp taskwait makes the parent task (the current instance of QuickSort) wait for the completion of both recursive tasks before continuing, so each partition is completely sorted before the function returns.
4. Termination Condition:
   o In the original Windows code, termination is signaled with a semaphore and an event. In OpenMP, the recursion terminates naturally when the base case is reached (p >= r), so no explicit synchronization or signaling is necessary.
5. Shared and Private Variables:
   o The qSortIndex structures are passed to the tasks with shared(s1) and shared(s2) so that each task accesses the parent's copy of the range to be sorted; the taskwait keeps those stack variables alive until the tasks finish.
6. Control of Thread Count:
   o The number of threads can be set with omp_set_num_threads(), but this is optional: the runtime otherwise chooses a default based on the available hardware resources.

Performance Considerations:

- Recursive Task Overhead: Creating OpenMP tasks for recursion has a cost, especially for small sub-arrays. The clause if (r - p > 100) mitigates this by preventing parallelization of very small ranges.
- Task Granularity: The threshold in if (r - p > 100) is a tuning knob. In practice, choose a value that balances task-creation overhead against available parallelism; a sketch of an alternative cutoff mechanism follows.
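
One alternative worth knowing is the final clause (a sketch assuming the same qSortIndex and Partition as above; CUTOFF is an illustrative constant). Unlike if, which only makes the single task undeferred, final marks the task so that it and all of its descendant tasks execute serially, pruning task bookkeeping for the whole subtree below the cutoff:

#define CUTOFF 100

// Variant using final(): once a sub-range is small, this task and every
// task created inside it become "included" tasks that run immediately
// and serially, so no further task-queue traffic occurs below the cutoff.
void QuickSortFinal(qSortIndex *m) {
    int p = m->lo;
    int r = m->hi;
    if (p < r) {
        int q = Partition(p, r);
        qSortIndex s1 = {p, q - 1};
        qSortIndex s2 = {q + 1, r};

        #pragma omp task shared(s1) final(r - p <= CUTOFF) mergeable
        QuickSortFinal(&s1);

        #pragma omp task shared(s2) final(r - p <= CUTOFF) mergeable
        QuickSortFinal(&s2);

        #pragma omp taskwait
    }
}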
