
CUDA Thread Indexing Cheatsheet

If you are a CUDA programmer who, like me, sometimes cannot wrap your head around thread indexing, then you are in the right place.
Many problems are naturally described in a flat, linear style that mimics our mental
model of C's memory layout. Other tasks, however, especially those encountered
in the computational sciences, are naturally embedded in two or three
dimensions. For example, image processing typically imposes a regular 2D
raster over the problem domain, while computational fluid dynamics might be
most naturally expressed by partitioning a volume over a 3D grid.
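
For instance, a minimal sketch of such a 2D mapping (the kernel name and image layout here are hypothetical, not taken from the example code below):

__global__ void invertImage(unsigned char *img, int width, int height){
    int x = blockIdx.x * blockDim.x + threadIdx.x;   // pixel column
    int y = blockIdx.y * blockDim.y + threadIdx.y;   // pixel row
    if (x < width && y < height){                    // guard against threads outside the image
        img[y * width + x] = 255 - img[y * width + x];
    }
}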

[Figure: NVIDIA CUDA thread model]


Sometimes it can be a bit tricky to figure out the global (unique) thread index,
especially if you are working with multi-dimensional grids of multi-dimensional
blocks of threads. I could not find a simple cheatsheet demonstrating exactly
what you need to do to calculate a global thread index for every configuration
you might use. I know that with a little effort anyone can figure it out, but I
thought I would share some of my code with you to make your life easier. At the
end of the day, sharing is caring :)
Download the example code, which you can compile with nvcc simpleIndexing.cu -o
simpleIndexing -arch=sm_20.
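
As a rough illustration (the kernel here is a hypothetical empty placeholder, not part of simpleIndexing.cu), the grid and block shapes used in the cases below are declared on the host with dim3 and passed in the launch configuration:

__global__ void myKernel(){ }        // hypothetical empty kernel

int main(){
    dim3 block(8, 8, 8);             // 3D block: 8 x 8 x 8 = 512 threads per block
    dim3 grid(16, 16, 16);           // 3D grid: 16 x 16 x 16 blocks
    myKernel<<<grid, block>>>();     // launches a 3D grid of 3D blocks
    cudaDeviceSynchronize();
    return 0;
}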

1D grid of 1D blocks
__device__
int getGlobalIdx_1D_1D(){
    return blockIdx.x * blockDim.x + threadIdx.x;   // threads in preceding blocks + offset within this block
}

1D grid of 2D blocks
__device__
int getGlobalIdx_1D_2D(){
    return blockIdx.x * blockDim.x * blockDim.y     // threads in all preceding blocks
         + threadIdx.y * blockDim.x + threadIdx.x;  // row-major offset within this block
}

1D grid of 3D blocks
__device__
int getGlobalIdx_1D_3D(){
    return blockIdx.x * blockDim.x * blockDim.y * blockDim.z  // threads in all preceding blocks
         + threadIdx.z * blockDim.y * blockDim.x              // preceding z-slices within this block
         + threadIdx.y * blockDim.x + threadIdx.x;            // row-major offset within the slice
}

2D grid of 1D blocks
__device__
int getGlobalIdx_2D_1D(){
    int blockId  = blockIdx.y * gridDim.x + blockIdx.x;  // linear block index within the 2D grid
    int threadId = blockId * blockDim.x + threadIdx.x;   // global thread index
    return threadId;
}

2D grid of 2D blocks
__device__
int getGlobalIdx_2D_2D(){
    int blockId  = blockIdx.x + blockIdx.y * gridDim.x;       // linear block index within the 2D grid
    int threadId = blockId * (blockDim.x * blockDim.y)        // threads in all preceding blocks
                 + (threadIdx.y * blockDim.x) + threadIdx.x;  // row-major offset within this block
    return threadId;
}

2D grid of 3D blocks
__device__
int getGlobalIdx_2D_3D(){
    int blockId  = blockIdx.x + blockIdx.y * gridDim.x;              // linear block index within the 2D grid
    int threadId = blockId * (blockDim.x * blockDim.y * blockDim.z)  // threads in all preceding blocks
                 + (threadIdx.z * (blockDim.x * blockDim.y))         // preceding z-slices within this block
                 + (threadIdx.y * blockDim.x) + threadIdx.x;         // row-major offset within the slice
    return threadId;
}

3D grid of 1D blocks
__device__
int getGlobalIdx_3D_1D(){
    int blockId  = blockIdx.x + blockIdx.y * gridDim.x
                 + gridDim.x * gridDim.y * blockIdx.z;   // linear block index within the 3D grid
    int threadId = blockId * blockDim.x + threadIdx.x;   // global thread index
    return threadId;
}

3D grid of 2D blocks
__device__
int getGlobalIdx_3D_2D(){
    int blockId  = blockIdx.x + blockIdx.y * gridDim.x
                 + gridDim.x * gridDim.y * blockIdx.z;        // linear block index within the 3D grid
    int threadId = blockId * (blockDim.x * blockDim.y)        // threads in all preceding blocks
                 + (threadIdx.y * blockDim.x) + threadIdx.x;  // row-major offset within this block
    return threadId;
}

3D grid of 3D blocks
__device__
int getGlobalIdx_3D_3D(){
    int blockId  = blockIdx.x + blockIdx.y * gridDim.x
                 + gridDim.x * gridDim.y * blockIdx.z;               // linear block index within the 3D grid
    int threadId = blockId * (blockDim.x * blockDim.y * blockDim.z)  // threads in all preceding blocks
                 + (threadIdx.z * (blockDim.x * blockDim.y))         // preceding z-slices within this block
                 + (threadIdx.y * blockDim.x) + threadIdx.x;         // row-major offset within the slice
    return threadId;
}
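
As a quick usage sketch (the kernel and buffer names here are hypothetical), a kernel in the same file would typically call one of the helpers above to index a flat output array:

__global__ void writeGlobalIds(int *out){
    int idx = getGlobalIdx_2D_2D();   // helper defined above; pick the one matching your launch shape
    out[idx] = idx;                   // each thread writes its own global index
}

Launched, for example, as writeGlobalIds<<<dim3(4, 4), dim3(8, 8)>>>(d_out); with d_out sized to hold 4 * 4 * 8 * 8 = 1024 ints.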

 
http://www.martinpeniak.com/index.php?option=com_content&view=article&catid=17
:updates&id=288:cuda-­‐thread-­‐indexing-­‐explained  
