Talha-Taki002/Rex-V


Rex-V — A Single-Cycle RISC-V Core

A single-cycle RV32I processor core implemented in Verilog, built as a learning project following "Digital Design and Computer Architecture: RISC-V Edition" by Sarah Harris & David Harris.

Rex-V executes 10 of the core RV32I instructions — enough to run real algorithms — and ships with a custom Python assembler and a self-checking testbench that gives you a clean [SUCCESS] or [FAILED] after every simulation run.


What's inside

Rex-V/
├── docs/
│   ├── microarchitecture.png          # Block diagram of the full datapath + control unit
│   └── assembly_program_structure.md  # Full guide to writing programs for Rex-V
│
├── input/
│   ├── assembly_programs/
│   │   ├── bubble_sort.asm            # Sorts 5 integers, verifies smallest is at base address
│   │   ├── iterative_factorial.asm    # Computes 5! via repeated addition (no MUL needed)
│   │   ├── sum_of_array.asm           # Sums 5 integers, verifies total is 150
│   │   ├── find_maximum.asm           # Finds the max in an array of 5, verifies result is 15
│   │   └── linear_search.asm          # Searches for a value, verifies it is at the right index
│   └── machine_codes/                 # Pre-assembled .hex files, ready to load into the simulator
│       ├── bubble_sort.hex
│       ├── iterative_fact.hex
│       ├── sum_of_array.hex
│       ├── find_maximum.hex
│       └── linear_search.hex
│
├── output/
│   └── dump.vcd                       # Waveform dump of the processor, viewable in GTKWave
├── scripts/
│   └── assembler.py                   # Custom two-pass assembler (Python 3, no dependencies)
│
├── src/
│   ├── rtl/
│   │   ├── core/
│   │   │   ├── single_cycle_core.v    # Top of the core — wires control unit to datapath
│   │   │   ├── core_datapath.v        # The full datapath (PC, regfile, ALU, extender, muxes)
│   │   │   ├── control_unit.v         # Combines main_decoder + ALU_decoder
│   │   │   ├── main_decoder.v         # Opcode → all control signals
│   │   │   ├── ALU_decoder.v          # funct3/funct7 → ALU operation select
│   │   │   ├── ALU.v                  # Arithmetic & Logic Unit
│   │   │   ├── adder32.v              # 32-bit ripple-carry adder (generate loop over adder cells)
│   │   │   ├── adder.v                # 1-bit full adder cell
│   │   │   ├── extender.v             # Sign-extension for I / S / B / J immediate formats
│   │   │   ├── reg_file.v             # 32 × 32-bit register file (x0 hardwired to 0)
│   │   │   ├── PC.v                   # Program counter register with async active-low reset
│   │   │   └── PC_target.v            # Adder shared by PC+4 and PC+imm targets
│   │   ├── memory/
│   │   │   ├── instruction_memory.v   # 1 KB ROM (256 × 32-bit words), loaded from .hex file
│   │   │   └── data_memory.v          # 4 KB RAM (1024 × 32-bit words)
│   │   └── single_cycle_top.v         # Top-level — connects core, imem, and dmem
│   └── tb/
│       └── single_cycle_top_tb.v      # Self-checking testbench (MMIO-based pass/fail reporting)
│
└── workspace/
    └── run.vvp                        # Compiled simulation binary (generated by iverilog)

The Instruction Set

Rex-V supports exactly these 10 instructions. The assembler will reject anything else.

Instruction         Format  Operation
add rd, rs1, rs2    R       rd = rs1 + rs2
sub rd, rs1, rs2    R       rd = rs1 - rs2
and rd, rs1, rs2    R       rd = rs1 & rs2
or rd, rs1, rs2     R       rd = rs1 | rs2
slt rd, rs1, rs2    R       rd = (rs1 < rs2) ? 1 : 0 (signed)
addi rd, rs1, imm   I       rd = rs1 + imm (imm range: −2048 to +2047)
lw rd, imm(rs1)     I       rd = Mem[rs1 + imm]
sw rs2, imm(rs1)    S       Mem[rs1 + imm] = rs2
beq rs1, rs2, lbl   B       if (rs1 == rs2) PC = PC + offset
jal rd, lbl         J       rd = PC+4 ; PC = PC + offset
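
To make the Format column concrete, here is a minimal Python sketch of how R-type and I-type instructions pack into a 32-bit word. The field layout and the opcode/funct values follow the RISC-V ISA specification; the helper names encode_r and encode_i are illustrative and are not taken from scripts/assembler.py.

```python
# Illustrative encoders for two RV32I formats (helper names are hypothetical).

def encode_r(funct7, rs2, rs1, funct3, rd, opcode=0x33):
    """Pack an R-type word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(imm, rs1, funct3, rd, opcode=0x13):
    """Pack an I-type word: imm[11:0] | rs1 | funct3 | rd | opcode."""
    if not -2048 <= imm <= 2047:
        raise ValueError(f"immediate {imm} does not fit in 12 signed bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


add_word = encode_r(0x00, 3, 2, 0b000, 1)   # add  x1, x2, x3
sub_word = encode_r(0x20, 3, 2, 0b000, 1)   # sub  x1, x2, x3
addi_word = encode_i(2047, 0, 0b000, 30)    # addi x30, x0, 2047

print(f"{add_word:08x} {sub_word:08x} {addi_word:08x}")
# prints: 003100b3 403100b3 7ff00f13
```

Note that add and sub differ only in bit 5 of funct7 (bit 30 of the word), which is consistent with the control unit taking a single funct7b5 input.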

Architecture

The design follows the classic single-cycle Harvard template from the Harris textbook, split into a control unit and a datapath:

             ┌──────────────────────────────────────────┐
             │              Control Unit                 │
             │   ┌──────────────┐   ┌────────────────┐  │
 opcode ────►│   │ Main Decoder │   │  ALU Decoder   │  │
 funct3 ────►│   │              │   │                │  │
funct7b5 ───►│   └──────┬───────┘   └───────┬────────┘  │
  zero ◄─────│          │                   │ ALU_ctrl  │
             └──────────┼───────────────────┼───────────┘
                        │  (8 control sigs) │
                        ▼                   ▼
             ┌────────────────────────────────────────────┐
             │                 Datapath                    │
             │                                            │
             │  ┌──┐   ┌──────┐   ┌─────────┐   ┌─────┐  │
  imem ─────►│  │PC│──►│RegFil│──►│   ALU   │──►│dmem │  │
             │  └──┘   └──────┘   └─────────┘   └─────┘  │
             │          extender                           │
             └────────────────────────────────────────────┘

Key design choices:

  • The ALU is built from a structural 32-bit ripple-carry adder (adder32, a chain of 1-bit adder cells) rather than Verilog's + operator. This makes the arithmetic visible at the gate level.
  • SLT is derived from the carry and sign bits of the shared adder — no separate comparator circuit.
  • Register file reads are combinational (asynchronous); writes happen on the rising clock edge.
  • The PC register resets asynchronously to 0x00000000.
  • Both memories use word-aligned access (address bits [N:2]).
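
The first two design choices can be modelled bit by bit in Python. This is a behavioural sketch, not a translation of ALU.v: the function names are hypothetical, and the SLT rule used here is the common "sign of the difference XOR signed overflow" formulation, which is an assumption about how the shared adder's bits are combined in Rex-V.

```python
MASK = 0xFFFFFFFF  # keep values to 32 bits


def full_adder(a, b, cin):
    """1-bit full adder cell: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def adder32(a, b, cin=0):
    """32-bit ripple-carry adder built by chaining 32 full adder cells."""
    result, carry = 0, cin
    for i in range(32):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


def alu_sub(a, b):
    """a - b computed as a + ~b + 1 on the shared adder."""
    return adder32(a & MASK, ~b & MASK, 1)


def alu_slt(a, b):
    """Signed a < b, derived from the subtraction result (no comparator)."""
    a, b = a & MASK, b & MASK
    diff, _ = alu_sub(a, b)
    a31, b31, d31 = a >> 31, b >> 31, diff >> 31
    # Signed overflow on subtraction: operand signs differ and the result
    # sign differs from a's sign.
    overflow = (a31 ^ b31) & (a31 ^ d31)
    return d31 ^ overflow
```

For example, alu_slt(5, 7) evaluates to 1 and alu_slt(0x7FFFFFFF, 0xFFFFFFFF) evaluates to 0, since the largest positive value is not less than -1 even though the raw difference has its sign bit set.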

The full block diagram with labeled muxes and bus widths is in docs/microarchitecture.png.


How Programs Are Verified

Rex-V doesn't have a display or UART. Instead, the testbench uses an MMIO (Memory-Mapped I/O) port at address 4092. When your program writes to that address, the testbench intercepts the write and uses the value as a status code:

Value written to address 4092   Meaning
1                               ✅ PASS
2                               ❌ FAIL: general wrong result
3                               ❌ FAIL: item not found
any other value                 ❌ FAIL: treated as an error code
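
As a rough model of the status port, the following Python sketch maps a written value to a verdict and shows where address 4092 falls in a word-indexed memory. The function names and the verdict strings are illustrative; the actual testbench messages are defined in single_cycle_top_tb.v.

```python
MMIO_ADDR = 4092    # status port address, from the text above
DMEM_WORDS = 1024   # 4 KB RAM = 1024 words of 32 bits


def word_index(byte_addr):
    """Word-aligned access: drop the two low address bits."""
    return byte_addr >> 2


def decode_status(value):
    """Map a value written to the MMIO port to a verdict (wording is illustrative)."""
    if value == 1:
        return "PASS"
    if value == 2:
        return "FAIL: general wrong result"
    if value == 3:
        return "FAIL: item not found"
    return f"FAIL: error code {value}"


print(word_index(MMIO_ADDR))  # 1023
```

With word indexing, address 4092 corresponds to index 1023, the last word of a 1024-word memory.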

Every program follows the same four-part skeleton:

# 1. MMIO SETUP — always copy this verbatim, always first
addi x30, x0, 2047
addi x30, x30, 2045    # x30 = 4092  (2047 + 2045, because addi maxes out at 2047)

# 2. INIT / SEEDING — set up registers and memory

# 3. ALGORITHM — the actual computation

# 4. VERIFICATION — check result, then report to testbench
    beq x_result, x_expected, pass

fail:
    addi x31, x0, 2
    sw x31, 0(x30)         # Write error code to address 4092
    jal x0, end

pass:
    addi x31, x0, 1
    sw x31, 0(x30)         # Write 1 (PASS) to address 4092

end:
    # assembler auto-injects: beq x0, x0, _auto_halt

The end: label is intentionally empty. The assembler automatically appends beq x0, x0, _auto_halt as the last instruction, and the testbench detects that instruction to end the simulation cleanly.
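
The two-instruction MMIO setup works around the 12-bit signed range of addi. The same trick generalises to any constant; the sketch below is a hypothetical helper (not part of assembler.py) that splits a value into addi-sized chunks and emits the corresponding instruction text.

```python
ADDI_MIN, ADDI_MAX = -2048, 2047  # 12-bit signed immediate range


def split_for_addi(value):
    """Split value into a list of 12-bit signed immediates that sum to it."""
    chunks = []
    remaining = value
    while remaining > ADDI_MAX or remaining < ADDI_MIN:
        step = ADDI_MAX if remaining > 0 else ADDI_MIN
        chunks.append(step)
        remaining -= step
    chunks.append(remaining)
    return chunks


def emit_load_constant(reg, value):
    """Return addi lines that build value in reg, starting from x0."""
    lines = []
    src = "x0"
    for imm in split_for_addi(value):
        lines.append(f"addi {reg}, {src}, {imm}")
        src = reg  # later chunks accumulate into the same register
    return lines


for line in emit_load_constant("x30", 4092):
    print(line)
# prints:
# addi x30, x0, 2047
# addi x30, x30, 2045
```

Without lui, this costs one instruction per chunk of at most 2047, so it is only practical for small constants such as the MMIO address.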


Running a Simulation

You need Icarus Verilog (iverilog) and Python 3. GTKWave is optional for waveform inspection.

# Install on Ubuntu/Debian
sudo apt install iverilog gtkwave

Step 1 — Assemble a program (skip if using a pre-built .hex)

python3 scripts/assembler.py input/assembly_programs/sum_of_array.asm input/machine_codes/sum_of_array.hex

The assembler validates register names, instruction syntax, immediate ranges, and label resolution, and will tell you the exact line if anything is wrong.
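
For readers unfamiliar with two-pass assembly, here is a minimal sketch of label resolution. It is not the code in assembler.py: the function names are hypothetical, and it assumes each label sits alone on its own line, comments start with #, and every instruction occupies 4 bytes.

```python
def collect_labels(lines):
    """Pass 1: assign an address to each instruction and record label addresses."""
    labels, instructions, pc = {}, [], 0
    for raw in lines:
        text = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = pc           # label names the next instruction
            continue
        instructions.append((pc, text))
        pc += 4
    return labels, instructions


def branch_offsets(lines):
    """Pass 2: resolve each beq/jal target to a PC-relative byte offset."""
    labels, instructions = collect_labels(lines)
    offsets = {}
    for pc, text in instructions:
        mnemonic, _, operands = text.partition(" ")
        if mnemonic in ("beq", "jal"):
            target = operands.split(",")[-1].strip()
            if target not in labels:
                raise ValueError(f"undefined label {target!r} at address {pc}")
            offsets[pc] = labels[target] - pc
    return offsets


program = [
    "beq x1, x2, pass",
    "fail:",
    "addi x31, x0, 2",
    "sw x31, 0(x30)",
    "jal x0, end",
    "pass:",
    "addi x31, x0, 1",
    "sw x31, 0(x30)",
    "end:",
]
print(branch_offsets(program))  # {0: 16, 12: 12}
```

The first pass is what allows forward references such as the branch to pass, whose address is not yet known when the beq line is read.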

Step 2 — Compile the Verilog

iverilog -o workspace/run.vvp \
    src/rtl/core/*.v \
    src/rtl/memory/*.v \
    src/rtl/single_cycle_top.v \
    src/tb/single_cycle_top_tb.v

Step 3 — Simulate

vvp workspace/run.vvp +test_file=input/machine_codes/sum_of_array.hex +vcd_path=output/dump.vcd

Expected output:

[INIT] Loading input/machine_codes/sum_of_array.hex into Instruction Memory...

==================================================
  [SUCCESS] REX-V Core passed the software test!
  Execution completed cleanly in 51 cycles.
==================================================
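
If you want to check results from a script, the banner above is easy to parse. The sketch below is a hypothetical helper that relies only on the [SUCCESS] marker and the "completed cleanly in N cycles" wording shown in the expected output; the layout of the [FAILED] banner is not shown here, so the helper does not try to interpret it.

```python
import re


def parse_result(output):
    """Return (passed, cycles) from captured simulator output.

    cycles is None when no cycle count is found.
    """
    passed = "[SUCCESS]" in output
    match = re.search(r"completed cleanly in (\d+) cycles", output)
    cycles = int(match.group(1)) if match else None
    return passed, cycles


sample = """[INIT] Loading input/machine_codes/sum_of_array.hex into Instruction Memory...

==================================================
  [SUCCESS] REX-V Core passed the software test!
  Execution completed cleanly in 51 cycles.
==================================================
"""
print(parse_result(sample))  # (True, 51)
```

The output string could come from running the vvp command with subprocess.run(..., capture_output=True, text=True) and reading its stdout attribute.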

Step 4 — Inspect the waveform (optional)

gtkwave output/dump.vcd

Sample Programs

All five programs are pre-assembled and ready to run. They all produce [SUCCESS].

Program         What it does                                                           Expected cycles
bubble_sort     Sorts [42, 17, 99, 3, 55] ascending, checks smallest is at base        135
iterative_fact  Computes 5! via repeated addition (no multiply), checks result is 120  79
sum_of_array    Sums [10, 20, 30, 40, 50], checks total is 150                         51
find_maximum    Finds max of [7, 3, 15, 1, 9], checks result is 15                     50
linear_search   Searches for 42 in [10, 25, 7, 42, 18], checks it is at index 3        46
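
The expected values in the table can be cross-checked with a small Python reference model. The factorial function below shows one way to multiply using only additions, as a core without a multiply instruction has to; the loop structure is an assumption and may differ from iterative_factorial.asm.

```python
def factorial_by_addition(n):
    """Compute n! using only additions."""
    result = 1
    for factor in range(2, n + 1):
        total = 0
        for _ in range(factor):   # result * factor as repeated addition
            total += result
        result = total
    return result


# Reference values each sample program verifies before writing PASS.
expected = {
    "bubble_sort": sorted([42, 17, 99, 3, 55])[0],   # smallest element at base
    "iterative_fact": factorial_by_addition(5),
    "sum_of_array": sum([10, 20, 30, 40, 50]),
    "find_maximum": max([7, 3, 15, 1, 9]),
    "linear_search": [10, 25, 7, 42, 18].index(42),
}
print(expected)
```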

Writing Your Own Program

See docs/assembly_program_structure.md for the full guide. The short version:

  • Use raw register names x0–x31 (no ABI aliases like t0, sp, ra)
  • No pseudo-instructions — write add x1, x0, x2 instead of mv x1, x2
  • Use x30 for the MMIO address, x31 for the status write (by convention)
  • Always include the four-part skeleton (MMIO setup → init → algorithm → verification)
  • The assembler rejects unsupported instructions with a clear error message
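
The first and last rules can be checked before you even run the assembler. This is a hedged sketch of such a pre-check, with hypothetical names; the real validation and error messages live in scripts/assembler.py and may be stricter or worded differently.

```python
import re

# The ten mnemonics listed in the instruction set table.
SUPPORTED = {"add", "sub", "and", "or", "slt", "addi", "lw", "sw", "beq", "jal"}

# Raw register names x0 through x31, without leading zeros.
REGISTER = re.compile(r"x(?:[0-9]|[12][0-9]|3[01])")


def is_raw_register(name):
    """True for x0..x31, False for ABI aliases or out-of-range numbers."""
    return REGISTER.fullmatch(name) is not None


def check_mnemonic(mnemonic):
    """Raise ValueError for anything outside the supported subset."""
    if mnemonic not in SUPPORTED:
        raise ValueError(f"unsupported instruction: {mnemonic}")


print(is_raw_register("x30"), is_raw_register("t0"))  # True False
```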

What's Not Implemented (Yet)

This is a learning-oriented subset — not a full RV32I. Notably absent:

  • lui, auipc — upper immediate instructions (needed to address the full 32-bit space)
  • jalr — indirect jump
  • bne, blt, bge, bltu, bgeu — only beq is wired in the control unit
  • Shift instructions (sll, srl, sra, and immediate variants)
  • xor and the remaining I-type arithmetic (slti, andi, ori, etc.)
  • Byte and halfword memory access (lb, lh, sb, sh)
  • CSR instructions, privileged modes, exceptions, interrupts
  • Pipeline stages

Reference

  • Harris, S. & Harris, D. Digital Design and Computer Architecture: RISC-V Edition (Morgan Kaufmann, 2021)
  • RISC-V ISA Specification

About

A minimalist RISC-V (RV32I) Single Cycle Core (Ref. David Harris & Sarah L. Harris)
