GPU_LAB.ipynb - Colaboratory

This notebook compares the performance of matrix operations using NumPy on the CPU against CuPy on the GPU. It generates random matrices A, B, and C of varying sizes and times the expression np.dot(A, A+B) + C, a matrix product plus elementwise additions. For small 50x50 matrices the GPU implementation is about 1.9x faster. For 500x500 matrices the GPU is roughly 350x faster. Finally, for large 2000x2000 matrices the GPU version finishes in 14.7 milliseconds versus 12.9 seconds on the CPU, a speedup of around 880x. This shows that GPUs provide significant performance benefits over CPUs for linear algebra and matrix operations, especially at large problem sizes.


import cupy as cp
import numpy as np
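Before benchmarking, it is worth confirming that the Colab runtime actually has a GPU attached (Runtime > Change runtime type), since the CuPy cells below fail without one. A minimal sanity check, not part of the original notebook:

# Sanity check (not in the original notebook): count the CUDA devices
# CuPy can see. This raises a CUDA runtime error if no GPU is attached.
print("CUDA devices visible:", cp.cuda.runtime.getDeviceCount())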

n = 50
A = np.random.randint(0, 255, size=(n, n))
B = np.random.randint(0, 255, size=(n, n))
C = np.random.randint(0, 255, size=(n, n))

A_gpu = cp.random.randint(0, 255, size=(n, n))
B_gpu = cp.random.randint(0, 255, size=(n, n))
C_gpu = cp.random.randint(0, 255, size=(n, n))
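Note that the CPU and GPU cells draw independent random matrices, so the two timings run on the same shapes but not the same values. For a strictly like-for-like comparison one could instead copy the NumPy arrays to the device; a minimal sketch using cp.asarray (an assumed variant, not what this notebook does):

# Assumed alternative: reuse the host matrices on the GPU so both
# benchmarks operate on identical data. cp.asarray copies host memory
# to device memory.
A_gpu = cp.asarray(A)
B_gpu = cp.asarray(B)
C_gpu = cp.asarray(C)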

%%timeit
A_dash = np.dot(A, A+B) + C

112 µs ± 22.1 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
A_dash_gpu = cp.dot(A_gpu, A_gpu + B_gpu) + C_gpu

59.4 µs ± 13 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
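One caveat when timing CuPy: GPU kernel launches are asynchronous, so a Python-side timer can stop before the GPU has actually finished the work. The figures in this notebook are reported as-is; a more conservative variant (an assumption, not in the original) forces completion before the timer stops:

%%timeit
# Assumed variant: synchronize so the measurement includes actual GPU
# execution time, not just the kernel launch overhead.
A_dash_gpu = cp.dot(A_gpu, A_gpu + B_gpu) + C_gpu
cp.cuda.Device().synchronize()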

n = 500
A = np.random.randint(0, 255, size=(n, n))
B = np.random.randint(0, 255, size=(n, n))
C = np.random.randint(0, 255, size=(n, n))

A_gpu = cp.random.randint(0, 255, size=(n, n))
B_gpu = cp.random.randint(0, 255, size=(n, n))
C_gpu = cp.random.randint(0, 255, size=(n, n))

%%timeit
A_dash = np.dot(A, A+B) + C

167 ms ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit
A_dash_gpu = cp.dot(A_gpu, A_gpu + B_gpu) + C_gpu

479 µs ± 461 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
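For reference, 167 ms / 479 µs ≈ 349x, consistent with the roughly 350x figure in the summary above.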

n = 2000
A = np.random.randint(0, 255, size=(n, n))
B = np.random.randint(0, 255, size=(n, n))
C = np.random.randint(0, 255, size=(n, n))

A_gpu = cp.random.randint(0, 255, size=(n, n))
B_gpu = cp.random.randint(0, 255, size=(n, n))
C_gpu = cp.random.randint(0, 255, size=(n, n))

%%timeit
A_dash = np.dot(A, A+B) + C

12.9 s ± 640 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
A_dash_gpu = cp.dot(A_gpu, A_gpu + B_gpu) + C_gpu

14.7 ms ± 322 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
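To close the loop, a hypothetical wrap-up cell (not in the original notebook) that compares the CPU and GPU results and turns the timings reported above into speedup factors:

# Hypothetical wrap-up: verify agreement and compute speedups from the
# measurements reported above (all times in seconds).
A_dash = np.dot(A, A + B) + C
A_dash_gpu = cp.dot(A_gpu, A_gpu + B_gpu) + C_gpu
# cp.asnumpy copies the result back to the host. This only prints True
# if the CPU and GPU matrices hold identical values (see the cp.asarray
# note earlier); with independently drawn random matrices it will not.
print(np.allclose(A_dash, cp.asnumpy(A_dash_gpu)))

cpu_times = {50: 112e-6, 500: 167e-3, 2000: 12.9}
gpu_times = {50: 59.4e-6, 500: 479e-6, 2000: 14.7e-3}
for size in (50, 500, 2000):
    print(f"{size}x{size}: {cpu_times[size] / gpu_times[size]:.0f}x speedup")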
