Question Bank: Parallel Computing
Module 1
1. Explain the four classifications of computer architectures according to Flynn's taxonomy.
2. Outline the cache coherence problem in shared-memory systems.
3. Explain the interconnection networks in MIMD systems. What are the trade-offs between a
shared bus and a crossbar switch?
4. Compare and contrast shared-memory and distributed-memory parallel systems.
5. Identify the condition under which an MIMD parallel program is considered scalable.
6. Explain the significance of cache coherence protocols (snooping and directory-based).
7. Explain how non-determinism is handled in parallel programs.
8. Explain Amdahl's law with a suitable example (a worked example follows this list).
9. Explain the steps in parallel program design.
10. Explain the performance parameters of a parallel system: speedup, efficiency, and scalability.
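
For question 8, a worked example of the kind expected: if a fraction p of a program's work can be parallelized across n cores, Amdahl's law bounds the speedup by

    S(n) = 1 / ((1 - p) + p/n)

With p = 0.9 and n = 10, S = 1 / (0.1 + 0.09) ≈ 5.3, so even ten cores give barely five-fold speedup. Efficiency, E(n) = S(n)/n, is then about 0.53, which also illustrates the parameters in question 10.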
Module 2
1. Differentiate UMA and NUMA multicore systems with a diagram.
2. Differentiate between shared-memory and distributed-memory parallel systems.
3. Explain the types of cache mappings.
4. Why is GPU programming considered heterogeneous programming, and what roles
do the CPU and GPU play in this model?
5. Given the following GPU code snippet, identify a performance issue and suggest how to
improve it (a possible fix is sketched after this list):
    if (rank_in_gp < 16)
        my_x += 1;
    else
        my_x -= 1;
6. Compare and contrast linear speedup and efficiency for GPU and CPU performance.
7. Analyze the trade-off between speedup and efficiency in MIMD systems using a suitable example.
8. Compare strong scalability and weak scalability in MIMD systems.
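
For question 5, the issue is branch divergence: threads with rank_in_gp < 16 and the remaining threads of the same 32-thread warp take different paths, so the warp executes both paths one after the other. One possible improvement (a sketch, assuming my_x and rank_in_gp are per-thread variables inside a CUDA kernel) is a branch-free rewrite:

    my_x += (rank_in_gp < 16) ? 1 : -1;   /* same result, no divergent branch */

Alternatively, the data can be arranged so that the condition is uniform across each warp.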
Module 3
1. Explain the MPI_Send and MPI_Recv functions in distributed-memory programming.
2. Outline the structure of an MPI program.
3. With pseudo code, explain the steps involved in implementing numerical integration using the
trapezoidal rule in an MPI program (a sketch follows this list).
4. Write an MPI program to demonstrate the broadcast operation.
5. Explain the MPI_Reduce function and its reduction operators.
6. Outline the MPI_Scatter and MPI_Gather functions and their parameters.
7. Develop an MPI matrix-vector multiplication function.
8. Explain the MPI_Barrier function and its usage in MPI programs.
9. Explain the MPI_Ssend function and its usage in MPI programs.
10. Write an MPI program to demonstrate MPI_Send and MPI_Recv (see the sketch after this list).
11. Write an MPI program to demonstrate deadlock using point-to-point communication and
avoidance of deadlock by altering the call sequence.
12. Write an MPI program to demonstrate MPI_Scatter and MPI_Gather.
13. Write an MPI program to demonstrate MPI_Reduce and MPI_Allreduce (MPI_MAX,
MPI_MIN, MPI_SUM, MPI_PROD).
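
For question 3, a pseudo-code-level sketch in C (assumptions: an example integrand f(x) = x*x, hard-coded limits a = 0, b = 1, n = 1024 subintervals, and a process count that evenly divides n):

    #include <stdio.h>
    #include <mpi.h>

    double f(double x) { return x * x; }   /* example integrand */

    /* Serial trapezoidal rule on [left, right] with count subintervals */
    double trap(double left, double right, int count, double h) {
        double sum = (f(left) + f(right)) / 2.0;
        for (int i = 1; i < count; i++)
            sum += f(left + i * h);
        return sum * h;
    }

    int main(int argc, char *argv[]) {
        int rank, size, n = 1024;
        double a = 0.0, b = 1.0, total;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        double h = (b - a) / n;            /* width of each trapezoid */
        int local_n = n / size;            /* subintervals per process */
        double local_a = a + rank * local_n * h;
        double local_int = trap(local_a, local_a + local_n * h, local_n, h);
        /* Combine the per-process integrals on process 0 */
        MPI_Reduce(&local_int, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("Integral of x^2 on [0, 1] ~= %f\n", total);
        MPI_Finalize();
        return 0;
    }

The same MPI_Reduce call pattern is the core of an answer to question 5.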
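
For question 10, a minimal sketch (assuming exactly two processes, e.g. mpiexec -n 2 ./a.out):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int rank, x = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            x = 42;                        /* message payload */
            MPI_Send(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Process 1 received %d from process 0\n", x);
        }
        MPI_Finalize();
        return 0;
    }

Having both processes call MPI_Send first (with sufficiently large messages) is the usual way to demonstrate the deadlock asked for in question 11; reordering so one side receives first avoids it.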
Module 4
1. Develop an OpenMP trapezoidal rule program.
2. Outline the structure of an OpenMP program.
3. Illustrate the usage of the parallel for directive (see the sketch after this list).
4. Write an OpenMP program to sort an array of n elements using both sequential and parallel
mergesort (using sections). Record the difference in execution time.
5. Write an OpenMP program that divides the iterations into chunks of 2 iterations each
(OMP_SCHEDULE=static,2). Its input should be the number of iterations, and its output
should show which iterations of a parallelized for loop are executed by which thread.
Example: with two threads and four iterations, the output might be:
   a. Thread 0: Iterations 0 - 1
   b. Thread 1: Iterations 2 - 3
6. Write an OpenMP program to calculate the first n Fibonacci numbers using tasks.
7. Write an OpenMP program to find the prime numbers from 1 to n employing the parallel for
directive. Record both serial and parallel execution times.
8. Explain the reduction clause and the schedule types (see the sketch after this list).
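
For questions 3 and 8, a minimal sketch combining the parallel for directive, a static schedule with chunk size 2, and the reduction clause (a hypothetical array sum; compile with e.g. gcc -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        const int n = 16;
        double a[16], sum = 0.0;
        for (int i = 0; i < n; i++)
            a[i] = i;                      /* fill with 0..15 */

        /* Chunks of 2 iterations are handed out round-robin (static), and
           each thread accumulates into a private copy of sum; the copies
           are combined when the loop ends. */
        #pragma omp parallel for schedule(static, 2) reduction(+: sum)
        for (int i = 0; i < n; i++) {
            printf("Thread %d: iteration %d\n", omp_get_thread_num(), i);
            sum += a[i];
        }
        printf("sum = %.1f\n", sum);       /* expect 120.0 */
        return 0;
    }

The printed thread/iteration pairs are also the kind of output asked for in question 5.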
Module 5
1. Draw the block diagram of a heterogeneous computing architecture.
2. Develop the kernel and main function of a CUDA program that adds two vectors (see the sketch after this list).
3. Explain threads, blocks, and grids in CUDA programming with an illustration.
4. Develop a CUDA kernel and wrapper function for implementing the trapezoidal rule.
5. Explain the tree-structured sum using warp shuffle.
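
For question 2, a minimal sketch of CUDA vector addition (assumptions: n is a multiple of the block size; error checking omitted for brevity):

    #include <stdio.h>
    #include <cuda_runtime.h>

    /* Each thread adds one pair of elements. */
    __global__ void vec_add(const float *x, const float *y, float *z, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            z[i] = x[i] + y[i];
    }

    int main(void) {
        const int n = 1024;
        size_t bytes = n * sizeof(float);
        float h_x[1024], h_y[1024], h_z[1024];
        for (int i = 0; i < n; i++) { h_x[i] = i; h_y[i] = 2.0f * i; }

        float *d_x, *d_y, *d_z;
        cudaMalloc(&d_x, bytes);
        cudaMalloc(&d_y, bytes);
        cudaMalloc(&d_z, bytes);
        cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_y, h_y, bytes, cudaMemcpyHostToDevice);

        vec_add<<<n / 256, 256>>>(d_x, d_y, d_z, n);   /* 4 blocks x 256 threads */

        cudaMemcpy(h_z, d_z, bytes, cudaMemcpyDeviceToHost);
        printf("z[10] = %.1f (expect 30.0)\n", h_z[10]);
        cudaFree(d_x); cudaFree(d_y); cudaFree(d_z);
        return 0;
    }

The host (CPU) allocates, copies, and launches, while the device (GPU) executes the kernel.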