Matrix Multiplication

M.Tech. Project, IISc, 2021

Completed as part of the requirements for the course E0243: Computer Architecture

Summary

Optimized Checkered Matrix Multiplication (CMM) on CPU and GPU with the help of hardware counters. By analyzing the bottlenecks in the regular matrix multiplication (MM) algorithm, the optimized CMM running on a GTX 1650 achieved a 1017x speed-up over the single-threaded CPU version.
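
For readers unfamiliar with where the bottlenecks in regular MM tend to come from, here is a minimal sketch in Python. It is not the project's code, and it does not implement CMM, whose definition is given in the report rather than on this page. It only contrasts the textbook triple loop with loop tiling, a standard locality optimization that is assumed here purely for illustration. The function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are hypothetical.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                # b[k][j] walks down a column of b. In a row-major C array
                # this is a strided access, a common source of cache misses.
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_tiled(a, b, tile=2):
    """Same product, computed tile by tile (illustrative assumption,
    not necessarily the optimization used in the project)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, m, tile):
            for j0 in range(0, p, tile):
                # Work on one small block so its operands are reused
                # while they are still close to the processor.
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik = a[i][k]
                        for j in range(j0, min(j0 + tile, p)):
                            c[i][j] += a_ik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_naive(a, b))  # [[58, 64], [139, 154]]
    print(matmul_tiled(a, b))  # [[58, 64], [139, 154]]
```

Both functions return the same product. In pure Python the tiled version will not run faster, since the lists are not contiguous arrays; the sketch shows only the change in access order that a compiled CPU or GPU implementation would exploit.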

The project report can be found here.