Nvidia has presented a scalable multi-chip module (MCM) inference research chip. The 4.01 TOPS deep neural network (DNN) accelerator scales to 36 chips and 128 TOPS, making it suitable for applications ranging from mobile devices to the data center. The chip is fabricated on a 16nm process and was presented at the 2019 VLSI Symposium.
To cover a wide range of power, performance, application, and form-factor requirements, Nvidia designed its inference research chip as a scalable MCM. A single chiplet has an area of just 6mm² and contains 87 million transistors; the 36-chiplet MCM hence contains 216mm² of silicon.
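As a rough, back-of-the-envelope illustration of that scaling, the Python sketch below extrapolates throughput and silicon area from the per-chiplet figures quoted above. The linear-scaling assumption is ours for illustration only; the reported 128 TOPS for the full 36-chiplet package falls below a purely linear extrapolation (roughly 144 TOPS), reflecting real-world inter-chip overheads.

    # Back-of-the-envelope MCM scaling from the per-chiplet figures above.
    # Assumes perfectly linear scaling, which real packages only approximate.

    CHIPLET_TOPS = 4.01     # peak per-chiplet throughput (TOPS)
    CHIPLET_AREA_MM2 = 6.0  # per-chiplet die area (mm^2)

    def mcm_totals(num_chiplets: int) -> tuple[float, float]:
        """Return (aggregate peak TOPS, total silicon area in mm^2)."""
        return num_chiplets * CHIPLET_TOPS, num_chiplets * CHIPLET_AREA_MM2

    for n in (1, 4, 16, 36):
        tops, area = mcm_totals(n)
        print(f"{n:2d} chiplet(s): ~{tops:6.1f} TOPS peak, {area:5.1f} mm^2 of silicon")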
Each die contains 16 processing elements (PEs) connected via a network-on-chip (NoC); together these account for roughly half of the die area. The rest of the die comprises a network-on-package (NoP) router, a 64KB global buffer that acts as second-level storage, and a RISC-V control processor.
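To make that hierarchy concrete, here is a purely structural Python sketch of the organization described above. All class and field names are hypothetical, chosen for illustration; they are not Nvidia's, and the sketch omits every microarchitectural detail.

    # Hypothetical structural model of the chiplet/package hierarchy described
    # in the article; names are illustrative, not Nvidia's.
    from dataclasses import dataclass, field

    @dataclass
    class ProcessingElement:
        pe_id: int  # one of 16 PEs reached over the on-die network-on-chip (NoC)

    @dataclass
    class Chiplet:
        chiplet_id: int
        global_buffer_kb: int = 64         # second-level storage shared by the PEs
        control_processor: str = "RISC-V"  # orchestrates execution on the die
        has_nop_router: bool = True        # network-on-package link to other dies
        pes: list[ProcessingElement] = field(
            default_factory=lambda: [ProcessingElement(i) for i in range(16)])

    @dataclass
    class MultiChipModule:
        chiplets: list[Chiplet] = field(
            default_factory=lambda: [Chiplet(i) for i in range(36)])

    mcm = MultiChipModule()
    print(f"{len(mcm.chiplets)} chiplets x {len(mcm.chiplets[0].pes)} PEs each")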
To read more, please visit: https://www.tomshardware.com/news/nvidia-msm-inference-chip,39780.html