Matrix Multiplication
M.Tech. Project, IISc, 2021
Completed as part of the requirements for the course E0243: Computer Architecture
Summary
Optimized Checkered Matrix Multiplication (CMM) on CPU and GPU, using hardware performance counters to analyze the bottlenecks in the regular MM algorithm. The optimized CMM running on a GTX 1650 achieved a 1017x speed-up over the single-threaded CPU version.
The project report can be found here.
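As a generic illustration of the kind of bottleneck the summary refers to, the sketch below contrasts two loop orders for regular matrix multiplication. It is not the project's CMM implementation, which is described in the report. The names `Matrix`, `matmul_ijk`, and `matmul_ikj` are hypothetical, and the row-major flat-vector layout is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: square n x n matrices, row-major, in a flat vector.
using Matrix = std::vector<double>;

// Textbook i-j-k order. The inner loop reads B[k * n + j] with stride n,
// so consecutive iterations touch different cache lines once n is large.
Matrix matmul_ijk(const Matrix& A, const Matrix& B, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    Matrix C(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = sum;
        }
    }
    return C;
}

// i-k-j order. The inner loop walks one row of B and one row of C with
// stride 1, which is usually friendlier to the cache and to vectorization.
Matrix matmul_ikj(const Matrix& A, const Matrix& B, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    Matrix C(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const double a = A[i * n + k];  // reused across the whole inner loop
            for (std::size_t j = 0; j < n; ++j) {
                C[i * n + j] += a * B[k * n + j];
            }
        }
    }
    return C;
}
```

Both functions compute the same product; only the memory access pattern differs. Timing them under a hardware-counter profiler (for example, comparing cache-miss counts) is one way to see how such a bottleneck shows up in practice, though the actual gain depends on the matrix size and the machine.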