Introduction To CUDA Programming

The document provides an overview of CUDA, an extension of C/C++ developed by Nvidia for parallel computing on GPUs. It discusses the architecture, execution model, and applications of CUDA, highlighting benefits such as significant speed-ups for suitable workloads. It also outlines the limitations of CUDA, including its restriction to NVIDIA hardware and its one-way interoperability with rendering APIs such as OpenGL.

Uploaded by

xilvenkat


11/3/25, 5:59 PM Introduction to CUDA Programming - GeeksforGeeks


Introduction to CUDA Programming

Last Updated : 23 Jul, 2025

This article gives an overview of CUDA programming. It covers why CUDA is needed, how its architecture and execution model work, and where it is applied.

CUDA stands for Compute Unified Device Architecture. It is a parallel computing platform and API (Application Programming Interface) model developed by Nvidia, and it is programmed through an extension of C/C++ rather than a separate language. CUDA lets computations run in parallel on the Graphics Processing Unit (GPU), which can speed them up considerably. Using CUDA, one can harness the power of an Nvidia GPU for general-purpose computing tasks, such as processing matrices and other linear algebra operations, rather than only graphics calculations.

Why do we need CUDA?

- GPUs are designed to perform high-speed parallel computations, originally to render graphics such as games.
- CUDA-capable hardware is already widely available: more than 100 million GPUs are deployed.
- For some applications, CUDA provides a 30-100x speed-up over conventional microprocessors.
- GPUs have many small Arithmetic Logic Units (ALUs), whereas CPUs have a few larger ones. This allows many calculations to run in parallel, such as computing the color of each pixel on the screen.

https://www.geeksforgeeks.org/electronics-engineering/introduction-to-cuda-programming/ 1/7

Architecture of CUDA

- The architecture diagram (not reproduced here) shows 16 Streaming Multiprocessors (SMs).
- Each Streaming Multiprocessor has 8 Streaming Processors (SPs), i.e., a total of 16 * 8 = 128 SPs.
- Each Streaming Processor has a MAD unit (multiply-add unit) and an additional MU (multiplication unit).
- The GT200 has 30 SMs with 8 SPs each, i.e., a total of 240 SPs, and roughly 1 TFLOP of peak processing power.
- Each Streaming Processor is massively threaded and can run thousands of threads per application.
- The G80 card has 16 SMs with 8 SPs each, i.e., a total of 128 SPs, and it supports 768 threads per Streaming Multiprocessor (note: not per SP).
- Since each SM has 8 SPs, each SP supports a maximum of 768 / 8 = 96 threads. The total number of threads that can run on 128 SPs is therefore 128 * 96 = 12,288.
- This is why these processors are called massively parallel.
- The G80 chips have a memory bandwidth of 86.4 GB/s.
- The G80 also has an 8 GB/s communication channel with the CPU (4 GB/s for uploading to the CPU RAM, and 4 GB/s for downloading from the CPU RAM).
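The thread-capacity arithmetic for the G80 (16 SMs * 768 threads per SM = 12,288 threads) can be reproduced for whatever GPU is installed. The following is a minimal sketch, assuming device 0 is the GPU of interest; the helper names `max_resident_threads` and `print_device_capacity` are made up for illustration, while `cudaGetDeviceProperties` and the `cudaDeviceProp` fields come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Upper bound on resident threads: SM count times max threads per SM.
// With the G80 figures from the text: 16 * 768 = 12,288.
long long max_resident_threads(int sm_count, int threads_per_sm) {
    return static_cast<long long>(sm_count) * threads_per_sm;
}

// Query device 0 (an assumption) and print its SM count and thread capacity.
// Returns false if the device cannot be queried.
bool print_device_capacity() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return false;
    printf("%s: %d SMs, %d threads per SM, %lld resident threads\n",
           prop.name, prop.multiProcessorCount,
           prop.maxThreadsPerMultiProcessor,
           max_resident_threads(prop.multiProcessorCount,
                                prop.maxThreadsPerMultiProcessor));
    return true;
}
```

Calling `print_device_capacity()` from `main` and compiling with `nvcc` should print the figures for the local GPU; the numbers will differ from the G80 values on modern hardware.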

How does CUDA work?

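- A GPU executes kernels, which are functions launched from the CPU that run as many parallel threads. Early GPUs ran one kernel at a time; newer ones can run several concurrently.
- Each kernel launch consists of blocks, which are independent groups of threads.
- Each block contains threads, which are the basic units of computation.
- The threads in each block typically work together to calculate a value.
- Threads in the same block can share memory.
- In CUDA, sending data from the CPU to the GPU is often the most expensive part of the computation.
- For each thread, registers are the fastest storage, followed by on-chip shared memory. Global memory, which lives in off-chip device memory, is the slowest. Constant and texture memory also live in device memory but are cached. Despite its name, CUDA "local memory" is per-thread spill space in device memory and is as slow as global memory.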
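To illustrate how the threads of a block cooperate through shared memory to calculate a value, here is a minimal sketch of a per-block sum. The names `block_sum`, `gpu_sum`, and `kBlockSize` are hypothetical, and the block size of 256 (a power of two, which the reduction loop relies on) is an assumption, not something the article prescribes.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlockSize = 256;  // threads per block; must be a power of two here

// Each block sums up to kBlockSize input elements and writes one partial sum.
__global__ void block_sum(const float* in, float* partial, int n) {
    __shared__ float tile[kBlockSize];      // shared by all threads of this block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                        // wait until the whole tile is loaded

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = tile[0];
}

// Host helper: stores the sum of `data` in *out. Returns false on a CUDA error.
bool gpu_sum(const std::vector<float>& data, float* out) {
    *out = 0.0f;
    int n = static_cast<int>(data.size());
    if (n == 0) return true;
    int blocks = (n + kBlockSize - 1) / kBlockSize;
    float *d_in = nullptr, *d_partial = nullptr;
    std::vector<float> partial(blocks);
    bool ok = cudaMalloc(&d_in, n * sizeof(float)) == cudaSuccess &&
              cudaMalloc(&d_partial, blocks * sizeof(float)) == cudaSuccess &&
              cudaMemcpy(d_in, data.data(), n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        block_sum<<<blocks, kBlockSize>>>(d_in, d_partial, n);
        ok = cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_in);       // cudaFree on a null pointer is a no-op
    cudaFree(d_partial);
    if (!ok) return false;
    for (float p : partial) *out += p;      // finish the reduction on the CPU
    return true;
}
```

The per-block partial sums are combined on the CPU because blocks are independent and cannot see each other's shared memory.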

Typical CUDA Program flow

1. Load data into CPU memory.
2. Copy data from CPU to GPU memory, e.g., cudaMemcpy(..., cudaMemcpyHostToDevice).
3. Call the GPU kernel on the device variable, e.g., kernel<<<blocks, threads>>>(gpuVar).
4. Copy results from GPU to CPU memory, e.g., cudaMemcpy(..., cudaMemcpyDeviceToHost).
5. Use the results on the CPU.
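The five steps can be sketched with element-wise vector addition. This is an illustrative sketch rather than a canonical implementation: the names `vec_add` and `gpu_vec_add` and the choice of 256 threads per block are assumptions, and the GPU buffers are allocated with `cudaMalloc`, a step the list leaves implicit.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel: each thread adds one pair of elements.
__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Runs steps 2-4 on inputs that are already in CPU memory (step 1).
// Assumes a.size() == b.size(). Returns false if any CUDA call fails.
bool gpu_vec_add(const std::vector<float>& a, const std::vector<float>& b,
                 std::vector<float>& c) {
    int n = static_cast<int>(a.size());
    size_t bytes = a.size() * sizeof(float);
    c.assign(a.size(), 0.0f);
    if (n == 0) return true;

    float *d_a = nullptr, *d_b = nullptr, *d_c = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess &&
              cudaMalloc(&d_b, bytes) == cudaSuccess &&
              cudaMalloc(&d_c, bytes) == cudaSuccess;
    if (ok) {
        // Step 2: copy the inputs from CPU to GPU memory.
        ok = cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
             cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    }
    if (ok) {
        // Step 3: launch the kernel on the device variables.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        vec_add<<<blocks, threads>>>(d_a, d_b, d_c, n);
        ok = cudaGetLastError() == cudaSuccess;
    }
    if (ok) {
        // Step 4: copy the result from GPU to CPU memory.
        ok = cudaMemcpy(c.data(), d_c, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);  // cudaFree on a null pointer is a no-op
    cudaFree(d_b);
    cudaFree(d_c);
    return ok;      // Step 5: the caller uses `c` on the CPU.
}
```

The copy in step 4 also waits for the kernel to finish, so no explicit synchronization is needed in this sketch.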

How is work distributed?

- Each thread "knows" the x and y coordinates of the block it is in (blockIdx) and its own coordinates within that block (threadIdx).
- These positions can be used to calculate a unique thread ID for each thread.
- The computational work a thread does depends on the value of its thread ID.
- For example, each thread ID can correspond to one element, or a group of elements, of a matrix.
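The mapping from block and thread coordinates to a unique ID can be sketched as follows. Each thread combines the CUDA built-ins `blockIdx`, `blockDim`, and `threadIdx` into a row and column, then into a row-major ID that selects one matrix element. The names `write_thread_ids` and `fill_with_thread_ids` and the 16 x 16 block shape are assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread writes its unique ID into the matrix element it owns.
__global__ void write_thread_ids(int* out, int width, int height) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // global x coordinate
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // global y coordinate
    if (col < width && row < height) {                // skip threads past the edge
        int id = row * width + col;                   // unique row-major thread ID
        out[id] = id;
    }
}

// Fills a width x height matrix with the IDs of the threads that own each
// element. Returns false if any CUDA call fails.
bool fill_with_thread_ids(std::vector<int>& m, int width, int height) {
    m.assign(static_cast<size_t>(width) * height, -1);
    if (m.empty()) return true;
    size_t bytes = m.size() * sizeof(int);
    int* d_out = nullptr;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) return false;

    dim3 threads(16, 16);                             // 256 threads per block
    dim3 blocks((width + threads.x - 1) / threads.x,
                (height + threads.y - 1) / threads.y);
    write_thread_ids<<<blocks, threads>>>(d_out, width, height);
    bool ok = cudaMemcpy(m.data(), d_out, bytes,
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);
    return ok;
}
```

The bounds check matters because the grid is rounded up to whole blocks, so some threads fall outside the matrix whenever its dimensions are not multiples of 16.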

CUDA Applications

CUDA is best suited to applications that run parallel operations on large amounts of data and are processing-intensive. Typical application areas include:

1. Computational finance
2. Climate, weather, and ocean modeling
3. Data science and analytics
4. Deep learning and machine learning
5. Defence and intelligence
6. Manufacturing/AEC
7. Media and entertainment
8. Medical imaging
9. Oil and gas
10. Research
11. Safety and security
12. Tools and management

Benefits of CUDA

Several advantages give CUDA an edge over traditional general-purpose computing on GPUs (GPGPU) through graphics APIs:

- Unified memory (CUDA 6.0 or later) and unified virtual memory (CUDA 4.0 or later).
- Shared memory: CUDA exposes a fast region of memory that the threads of a block can share. It can be used as a user-managed cache and provides more bandwidth than texture lookups.
- Scattered reads: code can read from any address in memory.
- Faster downloads and readbacks, i.e., transfers to and from the GPU.
- Full support for bitwise and integer operations.
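As a small illustration of unified memory, the sketch below allocates one buffer with `cudaMallocManaged` that both the CPU and the GPU access through the same pointer, so no explicit `cudaMemcpy` is needed. The names `scale` and `unified_memory_demo` are hypothetical, and the sketch assumes `n > 0` and a GPU and driver that support managed memory.

```cuda
#include <cuda_runtime.h>

// Kernel: multiply every element by `factor`.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Writes n ones on the CPU, doubles them on the GPU, and checks the result
// on the CPU. Assumes n > 0. Returns true if every element ends up 2.0f.
bool unified_memory_demo(int n) {
    float* data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) data[i] = 1.0f;        // written by the CPU
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);    // updated by the GPU
    bool ok = cudaDeviceSynchronize() == cudaSuccess;  // wait before the CPU reads

    for (int i = 0; ok && i < n; ++i) ok = (data[i] == 2.0f);
    cudaFree(data);
    return ok;
}
```

The call to `cudaDeviceSynchronize` is essential here: kernel launches are asynchronous, so the CPU must wait before it reads the managed buffer.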

Limitations of CUDA

- All CUDA source code, whether for the host machine or the GPU, is processed according to C++ syntax rules. Early versions of CUDA used C syntax rules, which means that older C-style CUDA source code may fail to compile or may not behave as originally intended.
- CUDA has one-way interoperability (the ability of computer systems or software to exchange and make use of information) with rendering APIs such as OpenGL. OpenGL can access registered CUDA memory, but CUDA cannot access OpenGL memory.
- Later versions of CUDA do not provide emulators or fallback support for older versions.
- CUDA supports only NVIDIA hardware.




@GeeksforGeeks, Sanchhaya Education Private Limited, All rights reserved
