A single-cycle RV32I processor core implemented in Verilog, built as a learning project following "Digital Design and Computer Architecture: RISC-V Edition" by Sarah Harris & David Harris.
Rex-V executes 10 of the core RV32I instructions — enough to run real algorithms — and ships with a custom Python assembler and a self-checking testbench that gives you a clean [SUCCESS] or [FAILED] after every simulation run.
Rex-V/
├── docs/
│   ├── microarchitecture.png  # Block diagram of the full datapath + control unit
│   └── assembly_program_structure.md  # Full guide to writing programs for Rex-V
│
├── input/
│   ├── assembly_programs/
│   │   ├── bubble_sort.asm  # Sorts 5 integers, verifies smallest is at base address
│   │   ├── iterative_factorial.asm  # Computes 5! via repeated addition (no MUL needed)
│   │   ├── sum_of_array.asm  # Sums 5 integers, verifies total is 150
│   │   ├── find_maximum.asm  # Finds the max in an array of 5, verifies result is 15
│   │   └── linear_search.asm  # Searches for a value, verifies it is at the right index
│   └── machine_codes/  # Pre-assembled .hex files, ready to load into the simulator
│       ├── bubble_sort.hex
│       ├── iterative_fact.hex
│       ├── sum_of_array.hex
│       ├── find_maximum.hex
│       └── linear_search.hex
│
├── output/
│   └── dump.vcd  # Waveform analysis of the processor
├── scripts/
│   └── assembler.py  # Custom two-pass assembler (Python 3, no dependencies)
│
├── src/
│   ├── rtl/
│   │   ├── core/
│   │   │   ├── single_cycle_core.v  # Top of the core: wires control unit to datapath
│   │   │   ├── core_datapath.v  # The full datapath (PC, regfile, ALU, extender, muxes)
│   │   │   ├── control_unit.v  # Combines main_decoder + ALU_decoder
│   │   │   ├── main_decoder.v  # Opcode → all control signals
│   │   │   ├── ALU_decoder.v  # funct3/funct7 → ALU operation select
│   │   │   ├── ALU.v  # Arithmetic & Logic Unit
│   │   │   ├── adder32.v  # 32-bit ripple-carry adder (generate loop over adder cells)
│   │   │   ├── adder.v  # 1-bit full adder cell
│   │   │   ├── extender.v  # Sign-extension for I / S / B / J immediate formats
│   │   │   ├── reg_file.v  # 32 × 32-bit register file (x0 hardwired to 0)
│   │   │   ├── PC.v  # Program counter register with async active-low reset
│   │   │   └── PC_target.v  # Adder shared by PC+4 and PC+imm targets
│   │   ├── memory/
│   │   │   ├── instruction_memory.v  # 1 KB ROM (256 × 32-bit words), loaded from .hex file
│   │   │   └── data_memory.v  # 4 KB RAM (1024 × 32-bit words)
│   │   └── single_cycle_top.v  # Top-level: connects core, imem, and dmem
│   └── tb/
│       └── single_cycle_top_tb.v  # Self-checking testbench (MMIO-based pass/fail reporting)
│
└── workspace/
    └── run.vvp  # Compiled simulation binary (generated by iverilog)
Rex-V supports exactly these 10 instructions. The assembler will reject anything else.
| Instruction | Format | Operation |
|---|---|---|
| `add rd, rs1, rs2` | R | `rd = rs1 + rs2` |
| `sub rd, rs1, rs2` | R | `rd = rs1 - rs2` |
| `and rd, rs1, rs2` | R | `rd = rs1 & rs2` |
| `or rd, rs1, rs2` | R | `rd = rs1 \| rs2` |
| `slt rd, rs1, rs2` | R | `rd = (rs1 < rs2) ? 1 : 0` (signed) |
| `addi rd, rs1, imm` | I | `rd = rs1 + imm` (imm range: −2048 to +2047) |
| `lw rd, imm(rs1)` | I | `rd = Mem[rs1 + imm]` |
| `sw rs2, imm(rs1)` | S | `Mem[rs1 + imm] = rs2` |
| `beq rs1, rs2, lbl` | B | `if (rs1 == rs2) PC = PC + offset` |
| `jal rd, lbl` | J | `rd = PC+4 ; PC = PC + offset` |
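To make the Format column concrete, here is a minimal Python sketch of how each format packs its fields into a 32-bit word, following the standard RV32I layouts. The helper names (`encode_r`, `encode_i`, `encode_s`, `encode_b`, `encode_j`) are hypothetical and not taken from `scripts/assembler.py`, which may be organized quite differently.

```python
# Field packing for the five formats Rex-V uses (standard RV32I bit layouts).
# Helper names are illustrative assumptions, not the project's actual API.

def encode_r(funct7, rs2, rs1, funct3, rd, opcode=0x33):
    """R-type: add, sub, and, or, slt."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_i(imm, rs1, funct3, rd, opcode):
    """I-type: addi (opcode 0x13) and lw (opcode 0x03); imm is a 12-bit signed value."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_s(imm, rs2, rs1, funct3, opcode=0x23):
    """S-type: sw; the immediate is split into imm[11:5] and imm[4:0]."""
    imm &= 0xFFF
    return ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) \
        | ((imm & 0x1F) << 7) | opcode

def encode_b(offset, rs2, rs1, funct3, opcode=0x63):
    """B-type: beq; offset is a signed byte offset whose bit 0 is implicit."""
    o = offset & 0x1FFF
    return (((o >> 12) & 1) << 31) | (((o >> 5) & 0x3F) << 25) | (rs2 << 20) \
        | (rs1 << 15) | (funct3 << 12) | (((o >> 1) & 0xF) << 8) \
        | (((o >> 11) & 1) << 7) | opcode

def encode_j(offset, rd, opcode=0x6F):
    """J-type: jal; offset bits are scattered as [20|10:1|11|19:12]."""
    o = offset & 0x1FFFFF
    return (((o >> 20) & 1) << 31) | (((o >> 1) & 0x3FF) << 21) \
        | (((o >> 11) & 1) << 20) | (((o >> 12) & 0xFF) << 12) | (rd << 7) | opcode

# addi x30, x0, 2047
print(f"{encode_i(2047, 0, 0, 30, 0x13):08x}")  # 7ff00f13
```

The scrambled immediate bits in the B and J formats are why branch and jump targets are easiest to handle as labels rather than hand-computed offsets.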
The design follows the classic single-cycle Harvard template from the Harris textbook, split into a control unit and a datapath:
             ┌──────────────────────────────────────────┐
             │               Control Unit               │
             │   ┌──────────────┐   ┌────────────────┐  │
opcode  ────►│   │ Main Decoder │   │  ALU Decoder   │  │
funct3  ────►│   │              │   │                │  │
funct7b5 ───►│   └──────┬───────┘   └───────┬────────┘  │
zero   ◄─────│          │                   │ ALU_ctrl  │
             └──────────┼───────────────────┼───────────┘
                        │ (8 control sigs)  │
                        ▼                   ▼
             ┌────────────────────────────────────────────┐
             │                  Datapath                  │
             │                                            │
             │  ┌──┐   ┌──────┐   ┌─────────┐   ┌─────┐   │
imem   ─────►│  │PC│──►│RegFil│──►│   ALU   │──►│dmem │   │
             │  └──┘   └──────┘   └─────────┘   └─────┘   │
             │                  extender                  │
             └────────────────────────────────────────────┘
Key design choices:
- The ALU is built from a structural 32-bit ripple-carry adder (`adder32` → `adder` cells) rather than Verilog's `+` operator. This makes the arithmetic visible at the gate level.
- SLT is derived from the carry and sign bits of the shared adder, with no separate comparator circuit.
- Register file reads are combinational (asynchronous); writes happen on the rising clock edge.
- The PC register resets asynchronously to `0x00000000`.
- Both memories use word-aligned access (address bits `[N:2]`).
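The adder-based design choices above can be modeled in a few lines of Python. This is a behavioral sketch only: the function names mirror the module names in `src/rtl/core/`, but the bodies are assumptions, and the SLT rule shown (sign of the difference XOR signed overflow) is one common formulation that may not match the exact signals Rex-V taps.

```python
MASK32 = 0xFFFFFFFF

def full_adder(a, b, cin):
    """1-bit full adder cell: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def adder32(a, b, cin=0):
    """32-bit ripple-carry adder: the carry ripples through 32 full-adder cells."""
    result, carry = 0, cin
    for i in range(32):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

def alu_sub(a, b):
    """a - b computed as a + ~b + 1 on the same adder."""
    return adder32(a & MASK32, ~b & MASK32, 1)

def alu_slt(a, b):
    """Signed less-than: sign of (a - b), flipped when the subtraction overflows."""
    a &= MASK32
    b &= MASK32
    diff, _ = alu_sub(a, b)
    sign_a, sign_b, sign_d = a >> 31, b >> 31, diff >> 31
    overflow = (sign_a ^ sign_b) & (sign_a ^ sign_d)
    return sign_d ^ overflow

print(alu_sub(10, 3))   # (7, 1)
print(alu_slt(-1, 1))   # 1
```

Reusing the subtraction result for SLT is what lets a design like this avoid a dedicated comparator.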
The full block diagram with labeled muxes and bus widths is in docs/microarchitecture.png.
Rex-V doesn't have a display or UART. Instead, the testbench uses an MMIO (Memory-Mapped I/O) port at address 4092. When your program writes to that address, the testbench intercepts the write and uses the value as a status code:
| Value written to address 4092 | Meaning |
|---|---|
| `1` | ✅ PASS |
| `2` | ❌ FAIL: general wrong result |
| `3` | ❌ FAIL: item not found |
| any other value | ❌ FAIL: treated as an error code |
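The status protocol is small enough to model directly. The sketch below is a Python stand-in for the testbench's intercept logic, not a transcription of `single_cycle_top_tb.v`. The names `on_store` and `word_index` are hypothetical. The address and status codes come from the table, and the memory depth from the project tree.

```python
MMIO_ADDR = 4092     # status port address
DMEM_WORDS = 1024    # data memory depth (4 KB of 32-bit words)

STATUS = {
    1: "PASS",
    2: "FAIL: general wrong result",
    3: "FAIL: item not found",
}

def word_index(addr):
    """Word-aligned index: drop the two byte-offset bits (address bits [N:2])."""
    return (addr >> 2) % DMEM_WORDS

def on_store(addr, value):
    """Return a status message for a store to the MMIO port, or None for ordinary stores."""
    if addr != MMIO_ADDR:
        return None
    return STATUS.get(value, f"FAIL: error code {value}")

print(word_index(MMIO_ADDR))   # 1023
print(on_store(4092, 1))       # PASS
```

Address 4092 maps to word index 1023, the last word of a 1024-word data memory, which keeps the status port clear of data placed at low addresses.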
Every program follows the same four-part skeleton:
# 1. MMIO SETUP — always copy this verbatim, always first
addi x30, x0, 2047
addi x30, x30, 2045 # x30 = 4092 (2047 + 2045, because addi maxes out at 2047)
# 2. INIT / SEEDING — set up registers and memory
# 3. ALGORITHM — the actual computation
# 4. VERIFICATION — check result, then report to testbench
beq x_result, x_expected, pass
fail:
addi x31, x0, 2
sw x31, 0(x30) # Write error code to address 4092
jal x0, end
pass:
addi x31, x0, 1
sw x31, 0(x30) # Write 1 (PASS) to address 4092
end:
# assembler auto-injects: beq x0, x0, _auto_halt

The `end:` label is intentionally empty. The assembler automatically appends `beq x0, x0, _auto_halt` as the last instruction, and the testbench detects that specific instruction to end the simulation cleanly.
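The MMIO setup step needs two `addi` instructions because 4092 exceeds the 12-bit signed immediate range and `lui` is not available. The same trick works for other constants. The Python sketch below generates such a chain; the helper `addi_chain` is hypothetical and is not part of the assembler.

```python
ADDI_MIN, ADDI_MAX = -2048, 2047   # 12-bit signed immediate range of addi

def addi_chain(rd, value):
    """Emit addi instructions that build `value` in register `rd`, starting from x0."""
    lines, src, remaining = [], "x0", value
    while remaining != 0 or not lines:
        # Take the largest step that still fits in the immediate field.
        step = max(ADDI_MIN, min(ADDI_MAX, remaining))
        lines.append(f"addi {rd}, {src}, {step}")
        remaining -= step
        src = rd   # later steps accumulate into rd itself
    return lines

for line in addi_chain("x30", 4092):
    print(line)
# addi x30, x0, 2047
# addi x30, x30, 2045
```

The chain grows by one instruction per 2047 of magnitude, so it is only practical for small constants such as addresses inside the 4 KB data memory.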
You need Icarus Verilog (iverilog) and Python 3. GTKWave is optional for waveform inspection.
# Install on Ubuntu/Debian
sudo apt install iverilog gtkwave

python scripts/assembler.py input/assembly_programs/sum_of_array.asm input/machine_codes/sum_of_array.hex

The assembler validates register names, instruction syntax, immediate ranges, and label resolution, and will tell you the exact line if anything is wrong.
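Label resolution is the part of a two-pass assembler that is least obvious from the outside. The sketch below shows the general technique in Python, assuming each label sits on its own line and `#` starts a comment. It is not the code in `scripts/assembler.py`, and the function name `resolve_labels` is made up for illustration.

```python
def resolve_labels(lines):
    """Pass 1 records label addresses; pass 2 rewrites label operands as byte offsets."""
    labels, program, pc = {}, [], 0
    for lineno, raw in enumerate(lines, start=1):
        text = raw.split("#", 1)[0].strip()      # drop comments and whitespace
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = pc               # label points at the next instruction
            continue
        program.append((lineno, pc, text))
        pc += 4                                  # every RV32I instruction is 4 bytes

    resolved = []
    for lineno, pc, text in program:
        mnemonic, operands = text.split(None, 1)
        ops = [op.strip() for op in operands.split(",")]
        if mnemonic in ("beq", "jal"):
            target = ops[-1]
            if target not in labels:
                raise ValueError(f"line {lineno}: undefined label '{target}'")
            ops[-1] = str(labels[target] - pc)   # PC-relative offset
        resolved.append(f"{mnemonic} {', '.join(ops)}")
    return resolved

source = [
    "beq x5, x6, pass",
    "fail:",
    "addi x31, x0, 2",
    "jal x0, end",
    "pass:",
    "addi x31, x0, 1",
    "end:",
]
print(resolve_labels(source))
# ['beq x5, x6, 12', 'addi x31, x0, 2', 'jal x0, 8', 'addi x31, x0, 1']
```

Two passes are needed because forward references such as `pass` and `end` are used before their addresses are known.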
iverilog -o workspace/run.vvp \
src/rtl/core/*.v \
src/rtl/memory/*.v \
src/rtl/single_cycle_top.v \
src/tb/single_cycle_top_tb.v

vvp workspace/run.vvp +test_file=input/machine_codes/sum_of_array.hex +vcd_path=output/dump.vcd

Expected output:
[INIT] Loading input/machine_codes/sum_of_array.hex into Instruction Memory...
==================================================
[SUCCESS] REX-V Core passed the software test!
Execution completed cleanly in 51 cycles.
==================================================
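For scripted regression runs, the testbench output can be checked mechanically. The Python sketch below parses a captured log under the assumption that the `[SUCCESS]` / `[FAILED]` tags and the "in N cycles" phrasing appear as in the sample output. The function `parse_sim_log` is hypothetical and not part of the repository.

```python
import re

def parse_sim_log(log):
    """Return (passed, cycles) from testbench output; cycles is None if not reported."""
    passed = "[SUCCESS]" in log and "[FAILED]" not in log
    match = re.search(r"in (\d+) cycles", log)
    cycles = int(match.group(1)) if match else None
    return passed, cycles

sample = """[INIT] Loading input/machine_codes/sum_of_array.hex into Instruction Memory...
==================================================
[SUCCESS] REX-V Core passed the software test!
Execution completed cleanly in 51 cycles.
=================================================="""

print(parse_sim_log(sample))   # (True, 51)
```

The log text could come from running `vvp` through `subprocess.run(..., capture_output=True, text=True)` and passing its `stdout` to the parser.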
gtkwave output/dump.vcd

All five programs are pre-assembled and ready to run. They all produce [SUCCESS].
| Program | What it does | Expected cycles |
|---|---|---|
| `bubble_sort` | Sorts [42, 17, 99, 3, 55] ascending, checks smallest is at base | 135 |
| `iterative_fact` | Computes 5! via repeated addition (no multiply), checks result is 120 | 79 |
| `sum_of_array` | Sums [10, 20, 30, 40, 50], checks total is 150 | 51 |
| `find_maximum` | Finds max of [7, 3, 15, 1, 9], checks result is 15 | 50 |
| `linear_search` | Searches for 42 in [10, 25, 7, 42, 18], checks it is at index 3 | 46 |
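The factorial program is the least obvious of the five, since Rex-V has no multiply instruction. As a rough reference model, the Python sketch below computes a factorial using only addition and counting loops. It illustrates the idea and does not reproduce the loop structure or register usage of `iterative_factorial.asm`.

```python
def multiply_by_addition(a, b):
    """a * b for non-negative b using only addition, as a core without MUL must do."""
    product = 0
    for _ in range(b):
        product += a
    return product

def factorial_by_addition(n):
    """n! built from repeated multiply_by_addition steps."""
    result = 1
    for k in range(2, n + 1):
        result = multiply_by_addition(result, k)
    return result

print(factorial_by_addition(5))   # 120
```

A model like this is handy for predicting the expected value before writing the verification step of a new program.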
See docs/assembly_program_structure.md for the full guide. The short version:
- Use raw register names `x0`–`x31` (no ABI aliases like `t0`, `sp`, `ra`)
- No pseudo-instructions: write `add x1, x0, x2` instead of `mv x1, x2`
- Use `x30` for the MMIO address, `x31` for the status write (by convention)
- Always include the four-part skeleton (MMIO setup → init → algorithm → verification)
- The assembler rejects unsupported instructions with a clear error message
This is a learning-oriented subset — not a full RV32I. Notably absent:
- `lui`, `auipc`: upper immediate instructions (needed to address the full 32-bit space)
- `jalr`: indirect jump
- `bne`, `blt`, `bge`, `bltu`, `bgeu`: only `beq` is wired in the control unit
- Shift instructions (`sll`, `srl`, `sra`, and immediate variants)
- `xor` and the remaining I-type arithmetic (`slti`, `andi`, `ori`, etc.)
- Byte and halfword memory access (`lb`, `lh`, `sb`, `sh`)
- CSR instructions, privileged modes, exceptions, interrupts
- Pipeline stages
- Harris, S. & Harris, D. — Digital Design and Computer Architecture: RISC-V Edition (Morgan Kaufmann, 2021)
- RISC-V ISA Specification