
Low Power HW Accelerator For FP16 Matrix Multiplications For Tight Integration Within RISC-V Cores | Yvan Tortorella, Luca Bertaccini, Davide Rossi, Luca Benini, Francesco Conti

A new technical paper titled “RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs” was published by researchers at the University of Bologna and ETH Zurich.

According to their abstract:
“One of the key stumbling stones is the need for parallel floating-point operations, which are considered unaffordable on sub-100 mW extreme-edge SoCs. We tackle this problem with RedMulE (Reduced-precision matrix Multiplication Engine), a parametric low-power hardware accelerator for FP16 matrix multiplications – the main kernel of DL training and inference – conceived for tight integration within a cluster of tiny RISC-V cores based on the PULP (Parallel Ultra-Low-Power) architecture.”
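
For readers unfamiliar with the kernel the abstract refers to, the following is a minimal software sketch of an FP16 matrix multiplication, Z = X · W. It is not RedMulE's implementation and says nothing about the accelerator's datapath. The helper names `to_fp16` and `matmul_fp16` are hypothetical, and rounding to half precision after every multiply and every add is an assumption made here for illustration (hardware may instead use fused multiply-add or a wider accumulator).

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (FP16) value."""
    # The "e" format code packs a half-precision float; unpacking it
    # returns the rounded value as an ordinary Python float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def matmul_fp16(x, w):
    """Compute Z = X * W for lists of lists, keeping every intermediate in FP16.

    Assumption (illustrative only): the product and the running sum are each
    rounded to FP16, which models a non-fused multiply followed by an add.
    """
    rows, inner, cols = len(x), len(w), len(w[0])
    z = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for p in range(inner):
                prod = to_fp16(to_fp16(x[i][p]) * to_fp16(w[p][j]))
                acc = to_fp16(acc + prod)
            z[i][j] = acc
    return z


if __name__ == "__main__":
    X = [[1.0, 2.0], [3.0, 4.0]]
    W = [[5.0, 6.0], [7.0, 8.0]]
    print(matmul_fp16(X, W))  # [[19.0, 22.0], [43.0, 50.0]]
```

Small integers are exactly representable in FP16, so the example above matches the exact result. Values such as 0.1 are not, and the rounding error accumulates along the inner dimension of the matrices.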

Find the technical paper here. Published April 2022.