The document contains 6 multiple choice questions about processing images with CUDA kernels and calculating the number of warps that have control divergence. It provides the questions, possible answers, and explanations for each answer. The questions cover topics like block size, grid size, image dimensions, number of warps generated, and number of warps with control divergence for different kernel launch configurations and image sizes.

Uploaded by

amin minshaf

Quiz Questions

1. We are to process a 600X800 picture (800 pixels in the x or horizontal direction, 600 pixels in the y or
vertical direction) with PictureKernel(). That is, m's value is 600 and n's value is 800.

__global__ void PictureKernel(float* d_Pin, float* d_Pout, int n, int m) {
    // Calculate the row # of the d_Pin and d_Pout element to process
    int Row = blockIdx.y*blockDim.y + threadIdx.y;
    // Calculate the column # of the d_Pin and d_Pout element to process
    int Col = blockIdx.x*blockDim.x + threadIdx.x;
    // each thread computes one element of d_Pout if in range
    if ((Row < m) && (Col < n)) {
        d_Pout[Row*n+Col] = 2*d_Pin[Row*n+Col];
    }
}

Assume that we decided to use a grid of 16X16 thread blocks. That is, each block is organized as a 2D
16X16 array of threads, and the grid has just enough blocks to cover the picture. How many warps will
be generated during the execution of the kernel?

(A) 37*16
(B) 38*50
(C) 38*8*50
(D) 38*50*2

Answer: (C)
Explanation: There are ceil(800/16.0) = 50 blocks in the x direction and ceil(600/16.0) = 38
blocks in the y direction. Each block contributes (16*16)/32 = 8 warps. So there are 38*50*8
warps.
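
The arithmetic in this explanation can be checked with a small host-side sketch. The helper names blocksNeeded and totalWarps are hypothetical (they do not appear in the quiz), and the code assumes a warp size of 32.

```cpp
#include <cassert>

// Number of blocks needed to cover `extent` elements with blocks of
// `blockDim` threads along one axis (integer ceiling division).
int blocksNeeded(int extent, int blockDim) {
    assert(extent > 0 && blockDim > 0);
    return (extent + blockDim - 1) / blockDim;
}

// Total warps launched for a width x height picture tiled by bx x by blocks.
// Assumes a warp size of 32; a partially filled last warp of a block still
// counts as one warp.
int totalWarps(int width, int height, int bx, int by) {
    const int warpSize = 32;
    int gridX = blocksNeeded(width, bx);   // blocks in the x direction
    int gridY = blocksNeeded(height, by);  // blocks in the y direction
    int warpsPerBlock = (bx * by + warpSize - 1) / warpSize;
    return gridX * gridY * warpsPerBlock;
}
```

For the picture in Question 1, totalWarps(800, 600, 16, 16) evaluates 50 * 38 * 8, which is 15,200 warps.
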

2. In Question 1, how many warps will have control divergence?

(A) 37 + 50*8
(B) 38*16
(C) 50
(D) 0

Answer: (D)
Explanation: The size of the picture in the x dimension is a multiple of 16, so no block in
the x direction has any threads in the invalid range. The size of the picture in the y
dimension is 37.5 times 16. This means that the threads in each block of the last row of
blocks are divided into halves: 128 in the valid range and 128 in the invalid range. Since 128
is a multiple of 32, every warp falls entirely into one range or the other. There is no
control divergence.
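
The reasoning above can be cross-checked by brute force. The following is a minimal host-side sketch, not part of the quiz: the function name countDivergentWarps is hypothetical, and it assumes a warp size of 32 and the usual linearization of a 2D block (threadIdx.y * blockDim.x + threadIdx.x) before threads are split into warps.

```cpp
#include <cassert>

// Counts warps that contain both in-range and out-of-range threads for the
// boundary test (Row < height) && (Col < width).
// Assumes warpSize = 32 and that threads of a block are linearized as
// threadIdx.y * blockDim.x + threadIdx.x before being grouped into warps.
int countDivergentWarps(int width, int height, int bx, int by) {
    assert(width > 0 && height > 0 && bx > 0 && by > 0);
    const int warpSize = 32;
    int gridX = (width + bx - 1) / bx;
    int gridY = (height + by - 1) / by;
    int threadsPerBlock = bx * by;
    int divergent = 0;
    for (int blockY = 0; blockY < gridY; ++blockY) {
        for (int blockX = 0; blockX < gridX; ++blockX) {
            // Walk the block one warp at a time.
            for (int first = 0; first < threadsPerBlock; first += warpSize) {
                bool anyValid = false;
                bool anyInvalid = false;
                for (int t = first; t < first + warpSize && t < threadsPerBlock; ++t) {
                    int row = blockY * by + t / bx;  // global row of thread t
                    int col = blockX * bx + t % bx;  // global column of thread t
                    if (row < height && col < width) {
                        anyValid = true;
                    } else {
                        anyInvalid = true;
                    }
                }
                // A warp diverges only if its threads disagree on the branch.
                if (anyValid && anyInvalid) {
                    ++divergent;
                }
            }
        }
    }
    return divergent;
}
```

Calling countDivergentWarps(800, 600, 16, 16) for the picture in Question 1 returns 0, because every warp in the bottom row of blocks lies entirely above or entirely below row 600.
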

3. In Question 1, if we are to process an 800x600 picture (600 pixels in the x or horizontal direction
and 800 pixels in the y or vertical direction), how many warps will have control
divergence?

(A) 37+50*8
(B) 38*16
(C) 50*8
(D) 0

Answer: (C)
Explanation: The size of the picture in the x dimension is 600, which is 37.5 times 16. This
means that every warp processing the right edge of the picture will have control divergence.
There are 50*8 such warps (50 blocks along the right edge, 8 warps in each block). Since the
size of the picture in the y dimension is a multiple of 16, there is no additional divergence in
the warps that process the lower edge of the picture.

4. In Question 1, if we are to process a 799x600 picture (600 pixels in the x direction and 799 pixels in
the y direction), how many warps will have control divergence?

(A) 37+50*8
(B) (37+50)*8
(C) 50*8
(D) 0

Answer: (A)
Explanation: The number of warps processing the right edge remains 50*8, all of which will have
control divergence. However, the warps processing the lower edge of the picture will also have
control divergence: in each of the 38 bottom blocks, the last warp covers rows 798 and 799, and
row 799 is out of range. There are 38 such warps. One of them is already counted for processing
the right edge. So we have 50*8+38-1 = 50*8+37.

5. Assume the following simple matrix multiplication kernel:

__global__ void MatrixMulKernel(float* M, float* N, float* P, int Width)
{
    int Row = blockIdx.y*blockDim.y+threadIdx.y;
    int Col = blockIdx.x*blockDim.x+threadIdx.x;
    if ((Row < Width) && (Col < Width)) {
        float Pvalue = 0;
        // Dot product of row Row of M and column Col of N
        for (int k = 0; k < Width; ++k) {
            Pvalue += M[Row*Width+k] * N[k*Width+Col];
        }
        P[Row*Width+Col] = Pvalue;
    }
}

If we launch the kernel with a block size of 16X16 on a 1000X1000 matrix, how many warps will have
control divergence?

(A) 1,000
(B) 500
(C) 1,008
(D) 508

Answer: (B)
Explanation: There will be 63 blocks in the horizontal direction. In the last of these, 8 threads
in the x dimension in each row will be in the invalid range. Every two rows form a warp. Therefore,
there are 1000/2 = 500 warps that will straddle the valid and invalid ranges in the horizontal
direction. As for the warps in the bottom blocks, each block has 8 rows (4 warps) in the valid
range and 8 rows (4 warps) in the invalid range. Threads in these warps are either entirely in
the valid range or entirely in the invalid range.
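
The "every two rows form a warp" argument generalizes to a closed-form count. The sketch below is illustrative only: the function name divergentWarpsClosedForm is hypothetical, and it is valid only under the stated assumptions (warp size 32, a block width that divides 32, and a block height that is a multiple of 32 / block width, so that each warp covers whole rows of one block).

```cpp
#include <cassert>

// Closed-form count of divergent warps for the boundary test
// (Row < height) && (Col < width).
// Assumptions (not from the quiz): warp size 32, bx divides 32, and by is a
// multiple of 32 / bx, so every warp covers whole rows of a single block.
int divergentWarpsClosedForm(int width, int height, int bx, int by) {
    const int warpSize = 32;
    assert(width > 0 && height > 0 && bx > 0 && by > 0);
    assert(warpSize % bx == 0);
    int rowsPerWarp = warpSize / bx;
    assert(by % rowsPerWarp == 0);

    int fullColumns = width / bx;                    // block columns entirely in range
    int partialColumn = (width % bx != 0) ? 1 : 0;   // block column cut by the right edge
    int fullWarpRows = height / rowsPerWarp;         // warp rows entirely in range
    int partialWarpRow = (height % rowsPerWarp != 0) ? 1 : 0;  // cut by the bottom edge

    // Right-edge warps diverge whenever they contain at least one valid row.
    // Warps in fully valid columns diverge only if the bottom edge splits them.
    return partialColumn * (fullWarpRows + partialWarpRow)
         + fullColumns * partialWarpRow;
}
```

For the 1000X1000 matrix with 16X16 blocks, rowsPerWarp is 2, the right edge cuts one block column, and the bottom edge falls on a warp boundary, so the function returns 1 * (500 + 0) + 62 * 0 = 500.
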

6. A CUDA device's SM (streaming multiprocessor) can take up to 1,536 threads and up to 8
thread blocks. Which of the following block configurations would result in the largest number of
threads in each SM?

(A) 64 threads per block
(B) 128 threads per block
(C) 512 threads per block
(D) 1,024 threads per block

Answer: (C)
Explanation: (A) and (B) are limited by the number of thread blocks that can be accommodated
by each SM: 8 blocks give only 512 and 1,024 threads, respectively. For (D), 1,024 does not
divide 1,536 evenly, so only one block fits, leaving 1/3 of the thread space open. (C) results
in 3 blocks and fully occupies the capacity of 1,536 threads in each SM.
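
The comparison of the four options can be sketched as a short calculation. The function name residentThreads is hypothetical, and the sketch considers only the two limits stated in the question; real devices also limit residency by registers and shared memory, which the quiz does not mention.

```cpp
#include <algorithm>
#include <cassert>

// Threads resident on one SM for a given block size, limited by both the
// per-SM thread cap and the per-SM block cap. Register and shared memory
// limits are deliberately ignored in this sketch.
int residentThreads(int threadsPerBlock, int maxThreadsPerSM, int maxBlocksPerSM) {
    assert(threadsPerBlock > 0 && maxThreadsPerSM > 0 && maxBlocksPerSM > 0);
    int blocksByThreads = maxThreadsPerSM / threadsPerBlock;  // whole blocks that fit
    int blocks = std::min(blocksByThreads, maxBlocksPerSM);
    return blocks * threadsPerBlock;
}
```

With limits of 1,536 threads and 8 blocks, the four block sizes give 512, 1,024, 1,536 and 1,024 resident threads, so 512 threads per block is the only option that fills the SM.
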
