RV32I 5-stage in-order pipeline. This document captures the architecture so the RTL and testbenches stay consistent as the project grows. It is written and maintained by Claude (Sonnet 4.6) so that anyone new to this repo can understand the design.

              ┌──────┐  IF/ID  ┌──────┐  ID/EX  ┌──────┐  EX/MEM  ┌──────┐  MEM/WB  ┌──────┐
        clk → │  IF  │────────▶│  ID  │────────▶│  EX  │─────────▶│ MEM  │─────────▶│  WB  │
              └──────┘         └──────┘         └──────┘          └──────┘          └──────┘
                 ▲                                  │  branchTarget / branchTaken
                 └──────────────────────────────────┘  (flush IF/ID on taken branch or jump)
                                  ▲
                   hazard_unit: stall on load-use

five stages, all in rtl/core/ and wired together in rtl/minori5_top.sv.
the IF stage holds the program counter. on each cycle, in priority order:

- if rst: PC ← 0
- if flush: PC ← pcNext (branch/jump target from EX)
- if stall: PC holds (load-use bubble)
- else: PC ← PC + 4

IF outputs pc and pcPlus4. the top reads imem[pc[11:2]] and latches it into the IF/ID register.
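
the PC update rules above can be sketched as a small behavioural model. this is a Python reference sketch for testbench use, not the RTL: the function names next_pc and imem_index are hypothetical, and the 32-bit wrap on PC + 4 is an assumption.

```python
MASK32 = 0xFFFFFFFF


def next_pc(pc: int, rst: bool, flush: bool, stall: bool, pc_next: int) -> int:
    """Return the PC for the next cycle using the priority rst > flush > stall."""
    if rst:
        return 0
    if flush:
        # Branch/jump target computed in EX.
        return pc_next & MASK32
    if stall:
        # Load-use bubble: hold the current PC.
        return pc & MASK32
    return (pc + 4) & MASK32


def imem_index(pc: int) -> int:
    """Word index used by the top level, i.e. imem[pc[11:2]]."""
    return (pc >> 2) & 0x3FF
```

because flush is checked before stall, a taken branch redirects the PC even if a load-use stall is asserted in the same cycle.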
the ID stage is purely combinational. it decodes the 32-bit instruction into control signals:

| signal | meaning |
|---|---|
| rs1Addr / rs2Addr / rdAddr | register file addresses |
| imm | sign-extended immediate (I/S/B/U/J format) |
| aluOp | 4-bit ALU operation (see table below) |
| aluSrcA | ALU operand A mux select (see below) |
| aluSrcB | 0 = rs2, 1 = immediate |
| regWrite | write enable for WB |
| memRead / memWrite | DMEM access |
| memSize | funct3 forwarded directly to MEM (000=byte, 001=half, 010=word) |
| wbSel | writeback mux: 00=ALU, 01=load, 10=PC+4 |
| branch / jump / isJalr | control-flow type |
| funct3 | forwarded to EX for branch condition selection |

aluSrcA mux:

- 2'b00: rs1 (normal)
- 2'b01: PC (JAL/AUIPC use PC as base)
- 2'b10: zero (LUI: 0 + imm)

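
the register-address and immediate fields listed above follow the standard RV32I instruction formats. the sketch below shows how they could be extracted in a Python reference model; the helper names (sign_extend, decode_fields, decode_imm) are hypothetical and the decoder in this repo may be structured differently.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_fields(instr: int) -> dict:
    """Extract the register addresses and funct3 at their fixed bit positions."""
    return {
        "rdAddr": (instr >> 7) & 0x1F,
        "funct3": (instr >> 12) & 0x7,
        "rs1Addr": (instr >> 15) & 0x1F,
        "rs2Addr": (instr >> 20) & 0x1F,
    }


def decode_imm(instr: int) -> int:
    """Return the sign-extended immediate, selected by the 7-bit opcode."""
    opcode = instr & 0x7F
    if opcode in (0x13, 0x03, 0x67):  # I-type: OP-IMM, LOAD, JALR
        return sign_extend(instr >> 20, 12)
    if opcode == 0x23:  # S-type: stores
        return sign_extend(((instr >> 25) << 5) | ((instr >> 7) & 0x1F), 12)
    if opcode == 0x63:  # B-type: branches (bit 0 is always zero)
        imm = (((instr >> 31) & 1) << 12) | (((instr >> 7) & 1) << 11)
        imm |= (((instr >> 25) & 0x3F) << 5) | (((instr >> 8) & 0xF) << 1)
        return sign_extend(imm, 13)
    if opcode in (0x37, 0x17):  # U-type: LUI, AUIPC
        return sign_extend(instr & 0xFFFFF000, 32)
    if opcode == 0x6F:  # J-type: JAL (bit 0 is always zero)
        imm = (((instr >> 31) & 1) << 20) | (((instr >> 12) & 0xFF) << 12)
        imm |= (((instr >> 20) & 1) << 11) | (((instr >> 21) & 0x3FF) << 1)
        return sign_extend(imm, 21)
    return 0  # R-type and unhandled opcodes carry no immediate
```

for example, decode_imm(0xFFF00093) decodes addi x1, x0, -1 and returns -1.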
the EX stage:

- applies forwarding muxes (forwardA / forwardB from hazard_unit) to select the freshest rs1/rs2 values
- feeds the ALU with the selected srcA and srcB
- computes branch target: PC + imm (or rs1 + imm for JALR)
- asserts branchTaken when the branch condition holds

forwarding codes:

- 2'b00: use pipeline register value (no hazard)
- 2'b01: forward from MEM/WB ALU result
- 2'b10: forward from EX/MEM ALU result

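
the branch target and branchTaken computation in EX could be modelled as below. this is a hedged Python sketch: the funct3 encodings come from the RV32I spec, but the function names are hypothetical, and clearing bit 0 of the JALR target follows the ISA spec rather than anything stated in this document.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def branch_target(pc: int, rs1: int, imm: int, is_jalr: bool) -> int:
    """PC + imm for branches and JAL, rs1 + imm for JALR."""
    if is_jalr:
        # Assumption: bit 0 is cleared, as the ISA spec requires for JALR.
        return (rs1 + imm) & MASK32 & ~1
    return (pc + imm) & MASK32


def branch_taken(funct3: int, a: int, b: int) -> bool:
    """Evaluate the branch condition selected by funct3 on operands a and b."""
    a &= MASK32
    b &= MASK32
    if funct3 == 0b000:  # BEQ
        return a == b
    if funct3 == 0b001:  # BNE
        return a != b
    if funct3 == 0b100:  # BLT (signed)
        return to_signed(a) < to_signed(b)
    if funct3 == 0b101:  # BGE (signed)
        return to_signed(a) >= to_signed(b)
    if funct3 == 0b110:  # BLTU
        return a < b
    if funct3 == 0b111:  # BGEU
        return a >= b
    return False  # funct3 010 and 011 are not branch encodings
```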
the MEM stage generates byte-enable signals (memWe[3:0]) for byte/half/word stores. loads sign- or zero-extend the read data according to memSize / funct3. dmem is a simple 4 KB word array in the top.
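
a minimal sketch of the byte-enable and load-extension logic, written as a Python reference model. it assumes little-endian byte lanes (memWe[i] enables byte i of the word), naturally aligned accesses, and that bit 2 of funct3 selects zero-extension (LBU/LHU, per the RV32I encoding). the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def store_byte_enables(mem_size: int, addr: int) -> int:
    """Return the 4-bit memWe mask for a store of the given size."""
    offset = addr & 0x3
    size = mem_size & 0x3
    if size == 0b00:  # SB: one byte lane
        return 0b0001 << offset
    if size == 0b01:  # SH: lower or upper halfword
        return 0b0011 << (offset & 0x2)
    return 0b1111  # SW: all four lanes


def load_extend(mem_size: int, addr: int, word: int) -> int:
    """Pick the addressed byte/half from a 32-bit word and extend it to 32 bits."""
    offset = addr & 0x3
    size = mem_size & 0x3
    zero_extend = bool(mem_size & 0b100)
    if size == 0b00:
        value, bits = (word >> (8 * offset)) & 0xFF, 8
    elif size == 0b01:
        value, bits = (word >> (8 * (offset & 0x2))) & 0xFFFF, 16
    else:
        return word & MASK32
    if not zero_extend and value & (1 << (bits - 1)):
        value -= 1 << bits  # sign-extend negative values
    return value & MASK32
```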
the WB stage selects the writeback value via wbSel:

- 2'b00: ALU result
- 2'b01: load data
- 2'b10: PC+4 (JAL / JALR return address)

ALU operation encoding (aluOp):

| op[3:0] | operation |
|---|---|
| 0000 | AND |
| 0001 | OR |
| 0010 | ADD |
| 0011 | XOR |
| 0100 | SLL (shift left logical) |
| 0101 | SLTU (set less than unsigned) |
| 0110 | SUB + signed compare (BEQ/BNE/BLT/BGE) |
| 0111 | SUB + unsigned compare (BLTU/BGEU) |
| 1000 | SRL (shift right logical) |
| 1010 | SLT (set less than signed) |
| 1100 | SRA (shift right arithmetic) |

the compare outputs (conBlt, conBgt, zero) are used by EX to compute branchTaken.
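
the ALU result for each op code in the table could be modelled as below. this is a Python reference sketch, not the RTL: the function name alu is hypothetical, shift amounts are assumed to use the low 5 bits of operand B (as RV32I specifies), and only the result is modelled because the exact definitions of conBlt and conBgt are not given here.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op: int, a: int, b: int) -> int:
    """Return the 32-bit ALU result for a 4-bit op code."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F
    if op == 0b0000:
        result = a & b
    elif op == 0b0001:
        result = a | b
    elif op == 0b0010:
        result = a + b
    elif op == 0b0011:
        result = a ^ b
    elif op == 0b0100:
        result = a << shamt
    elif op == 0b0101:
        result = int(a < b)  # SLTU
    elif op in (0b0110, 0b0111):
        result = a - b  # SUB; the compare flags are not modelled here
    elif op == 0b1000:
        result = a >> shamt  # SRL
    elif op == 0b1010:
        result = int(to_signed(a) < to_signed(b))  # SLT
    elif op == 0b1100:
        result = to_signed(a) >> shamt  # SRA: Python >> is arithmetic
    else:
        raise ValueError(f"unassigned ALU op {op:04b}")
    return result & MASK32
```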
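M-extension (multiply/divide) operation encoding:

| op[2:0] | operation |
|---|---|
| 000 | MUL, signed × signed (lower 32 bits) |
| 001 | MULH, signed × signed (upper 32 bits) |
| 010 | MULHSU, signed × unsigned (upper 32 bits) |
| 011 | MULHU, unsigned × unsigned (upper 32 bits) |
| 100 | DIV, signed / signed |
| 101 | DIVU, unsigned / unsigned |
| 110 | REM, signed % signed |
| 111 | REMU, unsigned % unsigned |
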
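
as a reference for the table above, the sketch below models the eight operations in Python using the semantics of the RISC-V M extension, including the spec-defined results for division by zero and signed overflow. the function names are hypothetical, and nothing here describes how the RTL will implement these operations.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def trunc_div(n: int, d: int) -> int:
    """Integer division rounding toward zero (Python's // rounds down)."""
    q = abs(n) // abs(d)
    return -q if (n < 0) != (d < 0) else q


def muldiv(op: int, a: int, b: int) -> int:
    """Return the 32-bit result of the M-extension operation op[2:0]."""
    a &= MASK32
    b &= MASK32
    sa, sb = to_signed(a), to_signed(b)
    if op == 0b000:  # MUL
        result = sa * sb
    elif op == 0b001:  # MULH
        result = (sa * sb) >> 32
    elif op == 0b010:  # MULHSU
        result = (sa * b) >> 32
    elif op == 0b011:  # MULHU
        result = (a * b) >> 32
    elif op == 0b100:  # DIV: x / 0 gives -1
        result = -1 if b == 0 else trunc_div(sa, sb)
    elif op == 0b101:  # DIVU: x / 0 gives all ones
        result = MASK32 if b == 0 else a // b
    elif op == 0b110:  # REM: x % 0 gives x
        result = sa if b == 0 else sa - sb * trunc_div(sa, sb)
    elif op == 0b111:  # REMU: x % 0 gives x
        result = a if b == 0 else a % b
    else:
        raise ValueError("op must be a 3-bit value")
    # Masking also yields the overflow results: -2**31 / -1 wraps to -2**31.
    return result & MASK32
```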
the forwarding logic in hazard_unit checks EX/MEM and MEM/WB destination registers against the current EX source registers. priority: EX/MEM > MEM/WB > none.
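
the priority rule above can be sketched as a function that produces a forwardA / forwardB code for one source register. this is a Python reference model with two assumptions that the text does not state: a stage only forwards when its regWrite is set, and x0 is never forwarded. the function and argument names are hypothetical.

```python
def forward_select(
    rs_addr: int,
    ex_mem_rd: int,
    ex_mem_reg_write: bool,
    mem_wb_rd: int,
    mem_wb_reg_write: bool,
) -> int:
    """Return 0b10 (EX/MEM), 0b01 (MEM/WB) or 0b00 (no forwarding)."""
    # EX/MEM holds the younger result, so it wins when both stages match.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == rs_addr:
        return 0b10
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == rs_addr:
        return 0b01
    return 0b00
```

the same function would be evaluated twice per cycle, once with idExRs1Addr for forwardA and once with the rs2 address for forwardB.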
when the instruction in EX is a load (idExMemRead high) and its destination matches either source of the instruction in ID, the hazard unit asserts stall for one cycle — freezing IF and ID while inserting a bubble into EX.
when (idExBranch && branchTaken) || idExJump, flush is asserted. the top flushes IF/ID (zeroes the register), and IF loads pcNext from EX.
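
the stall and flush conditions described above reduce to two small boolean functions. the sketch below is a Python reference model; the argument names mirror the signal names used in this document where they exist, the ifIdRs1Addr / ifIdRs2Addr names are assumed, and excluding x0 from the load-use check is also an assumption.

```python
def load_use_stall(
    id_ex_mem_read: bool,
    id_ex_rd_addr: int,
    if_id_rs1_addr: int,
    if_id_rs2_addr: int,
) -> bool:
    """True when the load in EX feeds either source of the instruction in ID."""
    if not id_ex_mem_read or id_ex_rd_addr == 0:
        return False
    return id_ex_rd_addr in (if_id_rs1_addr, if_id_rs2_addr)


def flush(id_ex_branch: bool, branch_taken: bool, id_ex_jump: bool) -> bool:
    """(idExBranch && branchTaken) || idExJump."""
    return (id_ex_branch and branch_taken) or id_ex_jump
```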
signals crossing a pipeline boundary are named in camelCase as `<from><To><Signal>`, where the prefix identifies the boundary:

| prefix | boundary |
|---|---|
| ifId | IF → ID latch |
| idEx | ID → EX latch |
| exMem | EX → MEM latch |
| memWb | MEM → WB latch |

example: idExRs1Addr is the rs1 address captured in the ID/EX register.

| file | module | description |
|---|---|---|
| adder.sv | adder | parameterised ripple adder, WIDTH=8 default |
| dff.sv | dff | single-bit D flip-flop with async active-low reset |
| mux2.sv | mux2 | 2-input mux |
| bus_if.sv | bus_if | placeholder memory bus interface |

these are standalone learning exercises and are not yet instantiated in minori5_top.

| region | size | contents |
|---|---|---|
| imem[0:1023] | 4 KB | instruction memory, word-addressed, loaded from program.hex |
| dmem[0:1023] | 4 KB | data memory, byte-enable writes |

both are simple arrays inside minori5_top. a real bus interface (bus_if.sv) will replace this when AXI or Wishbone is added.
| group | instructions | status |
|---|---|---|
| R-type | ADD SUB AND OR XOR SLL SRL SRA SLT SLTU | decoded + executed |
| I-type ALU | ADDI ANDI ORI XORI SLLI SRLI SRAI SLTI SLTIU | decoded + executed |
| Load | LB LH LW LBU LHU | decoded + executed |
| Store | SB SH SW | decoded + executed |
| Branch | BEQ BNE BLT BGE BLTU BGEU | decoded + executed |
| Jump | JAL JALR | decoded + executed |
| Upper | LUI AUIPC | decoded + executed |
| System | ECALL EBREAK | not yet implemented |
| M ext | MUL DIV REM | not yet implemented |
| V ext | — | not yet implemented |

- Implement a simple adder
- Implement pipeline
- Implement necessary modules for I
- Implement necessary modules for M
- Implement necessary modules for A
- Implement necessary modules for F
- Implement necessary modules for D
- Implement necessary modules for Zicsr
- Implement necessary modules for Zifencei
- Implement necessary modules for V
- Implement HW interface to FPGA, more info TBD
- Write demonstration program, TBD