TinyTinyTPU: 2×2 Systolic-Array TPU-Style Matrix-Multiply Unit...
A minimal 2×2 systolic-array TPU-style matrix-multiply unit, implemented in SystemVerilog and deployed on an FPGA.
This project implements a complete TPU architecture including:
TinyTinyTPU is an educational implementation of Google's TPU architecture, scaled down to a 2×2 systolic array. It demonstrates:
This is a minimal, educational-scale TPU designed for:
For production workloads, scale up the array size (e.g., to 256×256, as in Google's TPU v1).
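To make the scaling idea concrete, here is a minimal cycle-by-cycle software model of an n×n systolic array. It is a sketch, not a translation of this project's RTL: it assumes a weight-stationary dataflow (the scheme used by TPU v1), with activations moving left to right and partial sums moving top to bottom, and the function name `systolic_matmul` is hypothetical. The array size comes from the weight matrix, so the same model covers 2×2 and larger arrays.

```python
def systolic_matmul(x, w):
    """Model an n×n weight-stationary systolic array computing x @ w.

    x: list of m input rows, each of length n (activations).
    w: n×n weight matrix; PE(i, j) permanently holds w[i][j].
    Assumption: one register stage per PE in each direction.
    """
    n = len(w)
    m = len(x)
    a_reg = [[0] * n for _ in range(n)]  # activation registers (flow right)
    p_reg = [[0] * n for _ in range(n)]  # partial-sum registers (flow down)
    out = [[0] * n for _ in range(m)]

    # The last result (row m-1, column n-1) appears at cycle m + 2n - 3.
    for t in range(m + 2 * n - 2):
        new_a = [[0] * n for _ in range(n)]
        new_p = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    # Input skew: array row i receives x[k][i] delayed by i cycles.
                    k = t - i
                    a_in = x[k][i] if 0 <= k < m else 0
                else:
                    a_in = a_reg[i][j - 1]
                p_in = p_reg[i - 1][j] if i > 0 else 0
                new_a[i][j] = a_in
                new_p[i][j] = p_in + a_in * w[i][j]  # multiply-accumulate
        a_reg, p_reg = new_a, new_p  # clock edge

        # The bottom row of column j now holds the result for input row k.
        for j in range(n):
            k = t - (n - 1) - j
            if 0 <= k < m:
                out[k][j] = p_reg[n - 1][j]
    return out


print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

In this model a batch of m input rows takes m + 2n - 2 cycles, so the pipeline fill and drain cost grows with the array size while steady-state throughput is one result row per cycle.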
All simulation commands must be run from the sim/ directory:
The project includes a Python driver for communicating with the FPGA:
The gesture_demo.py script implements a simple gesture classifier:
See host/tpu_driver.py for the full protocol implementation.
The MLP controller manages sequential layer processing:
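As a rough software analogy for sequential layer processing, the sketch below runs layers one at a time and feeds each layer's output to the next. Everything here is an assumption for illustration: the function names (`run_mlp`, `matmul`, `relu`), the per-layer bias, and the choice of ReLU on all but the last layer are not taken from the controller's RTL.

```python
def matmul(x, w):
    """Plain reference matrix multiply: rows of x times columns of w."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def relu(rows):
    """Element-wise max(0, v)."""
    return [[max(0, v) for v in row] for row in rows]


def run_mlp(x, layers):
    """Process layers in order; `layers` is a list of (weights, bias) pairs.

    Assumption: ReLU follows every layer except the last one.
    """
    act = x
    for idx, (w, b) in enumerate(layers):
        z = matmul(act, w)                                     # matrix-multiply step
        z = [[v + bj for v, bj in zip(row, b)] for row in z]  # add bias per column
        is_last = idx == len(layers) - 1
        act = z if is_last else relu(z)                        # becomes the next layer's input
    return act


layers = [
    ([[1, -1], [2, 0]], [0, -1]),  # hypothetical layer 1 weights and bias
    ([[1, 2], [3, 4]], [1, 1]),    # hypothetical layer 2 weights and bias
]
print(run_mlp([[1, 2]], layers))  # [[6, 11]]
```

The point of the loop is that only one layer's weights need to be resident at a time, which is why a single small matrix-multiply unit can serve a multi-layer network.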
Source: HackerNews