This project aims to design an integer “Multiplication and Division Unit (MDU)” and integrate it with the SERV core as a co-processor. The MDU will support all the instructions specified in “The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA”, chapter 7, “M” Standard Extension for Integer Multiplication and Division, Version 2.0. The MDU is to be designed so that it uses most of the DSP resources on the target FPGA board. It will be generic and parameterized so that it can easily be integrated with any RISC-V core. Read the post on Zeeshan’s blog.
RISC-V is a free and open-source Instruction Set Architecture for 32-, 64-, and 128-bit CPUs. It comes with different extensions that can be implemented in hardware, such as A for “Atomic”, B for “Bit-Manipulation”, C for “Compressed”, and M for “Multiplication & Division”. All of these extensions can be integrated with the base ISA, RV32I (RISC-V 32-bit Integer).
SERV is one such RISC-V CPU. It is the world’s smallest bit-serial 32-bit CPU, famous for its size and for its lumbering pace, as it performs most of its operations in 32 clock cycles. It implements a simplified Wishbone interface to connect to the instruction and data memory. Interestingly, despite its size it can run the Zephyr RTOS. It is formally verified with RISC-V Formal and is freely available under the BSD license.
FuseSoC is an award-winning package manager and a set of build tools for HDL (Hardware Description Language) code. Its main purpose is to increase the reuse of IP (Intellectual Property) cores and be an aid for creating, building, and simulating SoC solutions. It also played an important role in the integration of SERV with MDU.
To simplify low-end implementations, RISC-V keeps integer multiplication and division separate from the base integer ISA. For a 32-bit implementation the extension has eight instructions: four for multiplication (MUL, MULH, MULHSU, MULHU) and four for division and remainder (DIV, DIVU, REM, REMU). The instructions use the R-type format of the base ISA and share its R-type opcode, which helps with integration into RV32I.
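To make the shared R-type encoding concrete, here is a minimal Python sketch that encodes the eight RV32M instructions. The field layout, the opcode `0110011`, the `funct7` value `0000001`, and the `funct3` values follow the RISC-V specification; the helper name `encode_m` is only an illustration and is not part of the MDU or SERV code.

```python
# Sketch: encode RV32M instructions in the R-type format.
# encode_m is a hypothetical helper name used only for illustration.

OPCODE_OP = 0b0110011      # R-type opcode shared with ADD, SUB, ...
FUNCT7_MULDIV = 0b0000001  # funct7 value that selects the M extension

# funct3 selects one of the eight M-extension operations.
FUNCT3 = {
    "mul": 0b000, "mulh": 0b001, "mulhsu": 0b010, "mulhu": 0b011,
    "div": 0b100, "divu": 0b101, "rem": 0b110, "remu": 0b111,
}


def encode_m(op, rd, rs1, rs2):
    """Return the 32-bit encoding of an M-extension instruction."""
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:
            raise ValueError("register index out of range")
    return (
        (FUNCT7_MULDIV << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (FUNCT3[op] << 12)
        | (rd << 7)
        | OPCODE_OP
    )


if __name__ == "__main__":
    # mul x3, x1, x2
    print(hex(encode_m("mul", 3, 1, 2)))
```

Only `funct7` distinguishes these encodings from the base-ISA R-type instructions, which is why a core can route them to an external unit with a small change in its decoder.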
Multiplication & Division Unit:
The MDU is designed so that it can be easily integrated with existing RISC-V cores that do not already implement the M extension in hardware. It is open source and freely available under the Apache 2.0 license. It communicates with the core over a ready/valid interface, and `mdu_op` selects which of the eight instructions the MDU performs.
There are four multiplication instructions, but we use a single hardware multiplier for all of them by aligning the operands (signed/unsigned) according to the instruction in the prep section before the multiplication.
Division hardware is costly if we simply use the “/” operator, so to reduce the hardware cost we chose the division block from picorv32, which implements division using shifters and basic logic operations.
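The idea of sharing one multiplier can be sketched in Python as follows. This is a behavioral model under an assumption: each operand is extended to 33 bits (sign-extended when the instruction treats it as signed, zero-extended otherwise) and then fed to one signed multiplier. The names `PREP`, `extend33`, and `multiply` are hypothetical and are not taken from the MDU source.

```python
# Sketch: one signed multiplier serving MUL, MULH, MULHSU and MULHU.
# Assumption: operands are extended to 33 bits in a "prep" step.

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

# op -> (rs1 is signed, rs2 is signed, result is the upper 32 bits)
PREP = {
    "mul":    (True,  True,  False),
    "mulh":   (True,  True,  True),
    "mulhsu": (True,  False, True),
    "mulhu":  (False, False, True),
}


def extend33(value, is_signed):
    """Model a 33-bit signed operand built from a 32-bit register value."""
    value &= MASK32
    if is_signed and value & 0x80000000:
        return value - (1 << 32)  # sign extension
    return value                  # zero extension


def multiply(op, rs1, rs2):
    rs1_signed, rs2_signed, upper = PREP[op]
    # The single shared multiplier: always a signed multiply.
    product = extend33(rs1, rs1_signed) * extend33(rs2, rs2_signed)
    product &= MASK64  # keep the 64-bit two's-complement result
    return (product >> 32) if upper else (product & MASK32)


if __name__ == "__main__":
    print(hex(multiply("mulhsu", 0x80000000, 0x80000000)))
```

Because the low 32 bits of a product do not depend on signedness, MUL can use either extension; only the three "high" variants need the operand preparation.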
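A shift-and-subtract divider of this kind can be modeled in a few lines of Python. The sketch below is a generic restoring-division model with one loop iteration per clock cycle. It is not a transcription of the picorv32 RTL, and the function name and arguments are assumptions made for illustration.

```python
# Sketch: 32-step shift-and-subtract division for DIV, DIVU, REM, REMU.
# Generic behavioral model; names are hypothetical.

MASK32 = 0xFFFFFFFF


def shift_subtract_divide(rs1, rs2, signed, want_remainder):
    rs1 &= MASK32
    rs2 &= MASK32
    neg1 = signed and bool(rs1 & 0x80000000)
    neg2 = signed and bool(rs2 & 0x80000000)

    # Work on magnitudes; the sign is fixed up at the end.
    dividend = (-rs1) & MASK32 if neg1 else rs1
    divisor = ((-rs2) & MASK32 if neg2 else rs2) << 31
    quotient = 0
    bit = 1 << 31

    for _ in range(32):  # one iteration per clock cycle
        if divisor <= dividend:
            dividend -= divisor
            quotient |= bit
        divisor >>= 1
        bit >>= 1

    if want_remainder:
        # The remainder takes the sign of the dividend.
        result = -dividend if neg1 else dividend
    else:
        # Division by zero keeps the all-ones quotient unnegated.
        negate = (neg1 != neg2) and rs2 != 0
        result = -quotient if negate else quotient
    return result & MASK32


if __name__ == "__main__":
    print(shift_subtract_divide(7, 2, signed=False, want_remainder=False))
```

With a zero divisor the comparison succeeds in every step, so the quotient becomes all ones and the dividend is left untouched, which matches the division-by-zero results required by the M extension without any extra logic.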
MDU integration with SERV:
SERV has two types of instruction flow: one-stage and two-stage. Most operations are one-stage operations that finish in 32 cycles plus fetch overhead. There are four types of two-stage operations: memory, shift, slt, and branch operations. Because the MDU sits outside SERV, an MDU operation can be compared to a load (feeding the operands) and a store (getting the result), so we treat MDU operations as two-stage operations.
Only minor changes to the RTL were needed to integrate the MDU with SERV. Eleven files are affected: 7 RTL files, 2 core files, a Verilator waiver file, and the README.
The MDU takes two 32-bit operands from SERV. Instead of reading them directly from the register file, we reuse the existing signals that were designed for load/store instructions.
First, the incoming instruction has to be decoded, so the decode unit was changed to detect an MDU instruction by checking the opcode and bit 25 of the instruction (the LSB of funct7). The decode unit must also enable the register-file write signal, because the instruction writes its result back to the register file.
The ready/valid signals are handled in the serv_state module. It checks whether the MDU is ready for the next transaction; SERV only starts a transaction when the MDU is in the ready state, and then stalls until the valid signal arrives in response.
Because SERV is a bit-serial CPU, after receiving an MDU instruction it takes 32 cycles to shift the data into a 32-bit register before transmitting it. The same happens when receiving the result from the MDU.
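The decode check described above can be sketched in Python. In RV32I, R-type instructions with the OP opcode only use the funct7 values `0000000` and `0100000`, so bit 25 is 0 for all of them and 1 for the M extension. The function names here are hypothetical and are not taken from the SERV decoder.

```python
# Sketch: detect an M-extension instruction from the opcode and bit 25.
# Function names are hypothetical.

OPCODE_OP = 0b0110011  # R-type register-register opcode


def is_mdu_instruction(instr):
    """True when instr is an R-type OP instruction with funct7[0] set."""
    opcode = instr & 0x7F
    funct7_lsb = (instr >> 25) & 1
    return opcode == OPCODE_OP and funct7_lsb == 1


def mdu_op(instr):
    """Return funct3, which selects one of the eight MDU operations."""
    return (instr >> 12) & 0x7


if __name__ == "__main__":
    mul_x3_x1_x2 = 0x022081B3
    add_x3_x1_x2 = 0x002081B3
    print(is_mdu_instruction(mul_x3_x1_x2), is_mdu_instruction(add_x3_x1_x2))
```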
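The 32-cycle cost of moving one operand between a bit-serial core and a parallel unit can be illustrated with a small Python model of a shift register. This is a sketch under the assumption that bits travel least-significant bit first; the function names are hypothetical and do not correspond to SERV signals.

```python
# Sketch: serial-to-parallel conversion, one bit per clock cycle.
# Assumption: bits are transferred LSB first.


def parallel_to_serial(word):
    """Yield the 32 bits of word, least-significant bit first."""
    return [(word >> i) & 1 for i in range(32)]


def serial_to_parallel(bits_lsb_first):
    """Shift bits into a 32-bit register; return (value, cycles used)."""
    reg = 0
    cycles = 0
    for bit in bits_lsb_first:
        # New bit enters at the MSB while older bits move toward the LSB.
        reg = (reg >> 1) | ((bit & 1) << 31)
        cycles += 1
    return reg, cycles


if __name__ == "__main__":
    value, cycles = serial_to_parallel(parallel_to_serial(0xDEADBEEF))
    print(hex(value), cycles)
```

After exactly 32 shifts the first bit received sits in bit 0, so the register holds the complete operand and can be handed to the MDU in parallel.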
Here is the complete life-cycle of an MDU instruction:
SERV can be run with or without the MDU; we only have to set the mdu flag in the FuseSoC build command to include it. We added the MDU as a dependency in the servant core file rather than in the serv core file:
depend : [serv, "mdu? (mdu)"]
For the full set of changes, please review PR #60.
How to clone and run?
Create a directory to keep all the different parts of the project together. We will refer to this directory as $WORKSPACE from now on. All commands will be run from this directory unless otherwise stated.
Install FuseSoC
pip install fusesoc
Add SERV as a separate library into the workspace
fusesoc library add serv https://github.com/olofk/serv
Now add MDU
fusesoc library add mdu https://github.com/zeeshanrafique23/mdu
If Verilator is installed, we can use that as a linter to check the SERV source code
fusesoc run --target=lint serv
If everything worked, the output should look like
INFO: Preparing ::serv:1.1.0
INFO: Setting up project
INFO: Building simulation model
INFO: Running
Now it’s time to run SERV with MDU
fusesoc run --target=verilator_tb --flag=mdu servant
Note: All the FuseSoC commands should be run from $WORKSPACE.
MDU integration with your RISC-V core:
After making the necessary changes to the RTL of your core, you can integrate the MDU with it. Make sure you have implemented the ready/valid interface correctly.
Here is the instantiation template:
mdu_top #(
  .WIDTH(32)
) i_mdu_top (
  .i_clk,
  .i_rst,
  .i_mdu_rs1,
  .i_mdu_rs2,
  .i_mdu_op,
  .i_mdu_valid,
  .o_mdu_ready,
  .o_mdu_rd
);
Testing and verification:
The MDU was tested while integrated with the SERV core. It was verified with the RISC-V compliance tests for rv32im, and the Dining Philosophers program was run on an Arty A7-35T FPGA board to check its behavior on real hardware.
The MDU could be integrated into an SoC in such a way that one MDU is shared among many SERV cores; an arbiter could help in this case.
Other extensions could also be integrated with SERV, such as compressed instructions, which would decrease the size of the SoC as the memories get smaller.
The MDU was successfully integrated with SERV and passed all the RISC-V compliance tests for both RV32I and RV32IM. FuseSoC played a great role in this project: it helped with running the simulations and removed the need for the MDU to live inside the SERV repository.
Finally, I would like to thank my mentors, Stefan Wallentowitz and Olof Kindgren, for guiding me throughout the project and for keeping most of the communication on the SERV Gitter channel so that others could follow the progress too.