A hands-on educational simulator that demonstrates how debuggers work internally — for both CPU (x86/Linux) and GPU (AMD wavefront) targets.
This project has no production value. It exists solely to teach debugger internals through working code you can read, modify, and run.
The ptrace command demonstrates real Linux CPU debugging mechanics:
- `fork()` creates a child process
- `PTRACE_TRACEME` opts the child into being traced by its parent
- `execve` loads a real program (`/bin/ls`) into the child
- `PTRACE_SINGLESTEP` executes one CPU instruction at a time using the hardware Trap Flag
- `waitpid()` blocks the parent until the child stops
- `WIFEXITED()` detects when the child has finished

GDB uses this same mechanism to debug CPU programs. On top of it, GDB injects
INT3 (0xCC) to implement breakpoints and reads DWARF for source-level
information; both concepts are also covered in this project.
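
The steps listed above can be sketched in a few lines of C++. This is a minimal illustration, assuming Linux on x86; `count_instructions` is a hypothetical helper name and is not taken from `ptrace_target.cpp`. Signals other than the single-step trap are simply suppressed here, which a real debugger would not do.

```cpp
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `path` under ptrace and count single-steps.
// Returns -1 if the child could not be started under tracing.
long count_instructions(const char* path) {
    pid_t child = fork();
    if (child < 0) return -1;

    if (child == 0) {
        // Child: ask to be traced, then replace this process with the target.
        ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
        char* const argv[] = {const_cast<char*>(path), nullptr};
        char* const envp[] = {nullptr};
        execve(path, argv, envp);
        _exit(127);  // only reached if execve failed
    }

    // Parent: a traced child stops with SIGTRAP once execve completes.
    int status = 0;
    if (waitpid(child, &status, 0) < 0) return -1;
    if (WIFEXITED(status) || WIFSIGNALED(status)) return -1;

    long steps = 0;
    for (;;) {
        // Execute exactly one instruction, then wait for the next stop.
        if (ptrace(PTRACE_SINGLESTEP, child, nullptr, nullptr) < 0) break;
        if (waitpid(child, &status, 0) < 0) break;
        if (WIFEXITED(status) || WIFSIGNALED(status)) break;  // child is gone
        ++steps;
    }
    return steps;
}
```

Calling `count_instructions("/bin/ls")` would step through the whole program, dynamic loader included, which is why the count is far larger than the program's own logic suggests.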
The interactive simulator models an AMD GPU wavefront — 64 lanes executing the same instruction in lockstep (SIMT). It demonstrates:
- DWARF address mapping: instruction addresses are mapped to source file:line, mirroring how a real debugger reads the `.debug_line` section of an ELF binary
- Software breakpoints: a breakpoint address list is checked on each instruction, mirroring how ROCdbgapi injects `s_trap` into GPU memory
- Wavefront halting: when a breakpoint is hit, the wavefront thread blocks on a condition variable, mirroring how the GPU halts and notifies KFD
- Event queue: the wavefront pushes events (`BreakpointHit`, `WavefrontExited`) that the debugger pops, mirroring `amd_dbgapi_next_pending_event()`
- EXEC mask: a 64-bit bitmask where each bit represents one lane (1 = active, 0 = masked). The `diverge` command masks off lanes 32-63 to simulate branch divergence
- Per-lane registers: each lane has independent VGPR values (`v0`, `v1`, `pc`), mirroring how VGPRs hold different values per thread in a real wavefront
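
The EXEC mask and per-lane registers described in the list can be illustrated with plain bit operations. The sketch below is an assumption about how such state might be laid out; `WavefrontState`, `lane_active`, and `step_add_lane_id` are hypothetical names, and only `exec_mask`, `v0`, `v1`, and `pc` come from this README.

```cpp
#include <array>
#include <cstdint>

// Hypothetical per-lane state; field names follow the README (v0, v1, pc).
struct Lane {
    std::uint32_t v0 = 0;
    std::uint32_t v1 = 0;
    std::uint64_t pc = 0;
};

struct WavefrontState {
    static constexpr int kLanes = 64;

    std::uint64_t exec_mask = ~std::uint64_t{0};  // bit i set => lane i active
    std::array<Lane, kLanes> lanes{};

    bool lane_active(int lane) const {
        return ((exec_mask >> lane) & 1u) != 0;
    }

    // diverge: clear bits 32-63 so only lanes 0-31 stay active.
    void diverge() { exec_mask &= 0x00000000FFFFFFFFull; }

    // converge: set all 64 bits again.
    void converge() { exec_mask = ~std::uint64_t{0}; }

    // One illustrative "instruction" (v0 += lane id), applied in lockstep
    // to every lane whose EXEC bit is set; masked lanes keep their values.
    void step_add_lane_id() {
        for (int i = 0; i < kLanes; ++i) {
            if (lane_active(i)) lanes[i].v0 += static_cast<std::uint32_t>(i);
        }
    }
};
```

After `diverge()`, the mask reads `0x00000000ffffffff`, and a subsequent `step_add_lane_id()` changes `v0` only in lanes 0-31.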

| This simulator | Real ROCgdb stack |
|---|---|
| main.cpp (CLI loop) | ROCgdb command-line interface |
| Debugger class | ROCgdb + ROCdbgapi library |
| EventQueue | amd_dbgapi_next_pending_event() |
| Wavefront thread | GPU hardware + KFD kernel driver |
| DwarfMap | DWARF .debug_line section in ELF |
| ptrace_target.cpp | GDB's CPU debugging via ptrace() |

debugger_simulator/
├── CMakeLists.txt
├── README.md
├── src/
│ ├── event_queue.hpp — thread-safe queue (condition_variable + mutex)
│ ├── dwarf_map.hpp/.cpp — address → file:line lookup (binary search)
│ ├── target.hpp/.cpp — 64-lane wavefront + execution thread
│ ├── debugger.hpp/.cpp — GDB-style commands, owns the wavefront
│ ├── ptrace_target.hpp/.cpp — real ptrace instruction counter
│ └── main.cpp — interactive command loop
└── tests/
└── test_dwarf_map.cpp — unit tests for DwarfMap (Google Test style)
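
The thread-safe queue listed for `event_queue.hpp` could look roughly like the sketch below. It is a minimal illustration of the mutex plus condition_variable pattern, not the project's actual header: the `Event` payload and the `push`, `wait_pop`, and `try_pop` method names are assumptions, while the event kinds are the two named in this README.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>

// Event kinds named in the README; the pc payload is an assumption.
enum class EventKind { BreakpointHit, WavefrontExited };

struct Event {
    EventKind kind;
    std::uint64_t pc;  // address at which the event was raised
};

class EventQueue {
public:
    // Producer side (the wavefront thread): enqueue and wake one waiter.
    void push(Event e) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            events_.push(e);
        }
        cv_.notify_one();
    }

    // Consumer side (the debugger): block until an event is available.
    Event wait_pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !events_.empty(); });
        Event e = events_.front();
        events_.pop();
        return e;
    }

    // Non-blocking variant: returns false when the queue is empty.
    bool try_pop(Event& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (events_.empty()) return false;
        out = events_.front();
        events_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<Event> events_;
};
```

Waiting with a predicate guards against spurious wakeups, and notifying after the lock is released avoids waking the consumer only to have it block on the mutex.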
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug
make
./debugger_simulator

| Command | Shortcut | Description |
|---|---|---|
| `run` | | launch the 64-lane wavefront |
| `break <line>` | `b <line>` | set a breakpoint at kernel.hip:`<line>` |
| `delete <n>` | `d <n>` | delete breakpoint number n |
| `continue` | `c` | resume after a breakpoint hit |
| `where` | `w` | show current file:line from DWARF |
| `regs` | | print all 64 lanes: v0, v1, pc, active/masked |
| `lane <n>` | `l <n>` | print one specific lane (0-63) |
| `info` | `i` | list all set breakpoints |
| `diverge` | | mask off lanes 32-63 (simulate divergence) |
| `converge` | | restore all 64 lanes to active |
| `ptrace` | | run /bin/ls under ptrace, count instructions |
| `help` | | show this list |
| `quit` | `q` | exit |

(gds) break 3
breakpoint 1 at 0x1010 (kernel.hip:3)
(gds) break 7
breakpoint 2 at 0x1030 (kernel.hip:7)
(gds) run
[wavefront launched — 64 lanes, 11 instructions]
[breakpoint hit at 0x1010] kernel.hip:3
(gds) where
pc = 0x1010 → kernel.hip:3
(gds) lane 0
lane 0 [active] v0=0x0 v1=0x0 pc=0x1010
(gds) diverge
divergence applied: lanes 32-63 masked off
(gds) continue
[wavefront resumed]
[breakpoint hit at 0x1030] kernel.hip:7
(gds) regs
EXEC mask : 0x00000000ffffffff ← upper 32 lanes masked
...
(gds) continue
[wavefront resumed]
[wavefront finished — all instructions executed]
(gds) ptrace
/bin/ls executed 142503 instructions
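
The `where` output in the session above relies on an address-to-line lookup. The sketch below shows one way to do that with `std::upper_bound`, the call named for `DwarfMap` in this README. The class layout, the `LineEntry` struct, and the `lookup` signature are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One row of a simplified line table (hypothetical layout).
struct LineEntry {
    std::uint64_t address;  // first address covered by this row
    std::string file;
    int line;
};

class DwarfMap {
public:
    explicit DwarfMap(std::vector<LineEntry> rows) : rows_(std::move(rows)) {
        // Binary search requires the rows to be ordered by address.
        std::sort(rows_.begin(), rows_.end(),
                  [](const LineEntry& a, const LineEntry& b) {
                      return a.address < b.address;
                  });
    }

    // Returns the row with the greatest start address <= pc,
    // or nullptr when pc lies before the first row.
    const LineEntry* lookup(std::uint64_t pc) const {
        auto next = std::upper_bound(
            rows_.begin(), rows_.end(), pc,
            [](std::uint64_t value, const LineEntry& row) {
                return value < row.address;
            });
        if (next == rows_.begin()) return nullptr;
        return &*(next - 1);
    }

private:
    std::vector<LineEntry> rows_;
};
```

`std::upper_bound` finds the first row that starts after `pc`, so the row just before it is the one that covers `pc`. With rows for 0x1010 (kernel.hip:3) and 0x1030 (kernel.hip:7), as in the session above, a lookup of 0x1014 still resolves to line 3.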
./test_dwarf # from the build directory
# or
ctest

| Concept | Where in Code |
|---|---|
| DWARF address→line mapping | DwarfMap, binary search with std::upper_bound |
| Software breakpoints (s_trap / INT3) | run_wavefront — address list check |
| Wavefront halting on breakpoint | halt_mutex + halt_cv in Wavefront |
| Event-driven debugger loop | EventQueue, wait_for_stop() |
| EXEC mask / lane masking | exec_mask, cmd_diverge(), cmd_converge() |
| Per-lane VGPR values | Lane::v0, Lane::v1 — different per lane |
| RAII thread management | Debugger::~Debugger() joins the thread |
| Thread-safe queue | EventQueue — mutex + condition_variable |
| ptrace CPU debugging | ptrace_target.cpp — fork/TRACEME/SINGLESTEP |
| CMake project structure | CMakeLists.txt — multiple targets, CTest |
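
The "Wavefront halting on breakpoint" row above pairs a mutex with a condition variable. The sketch below shows one possible shape of that handshake between the wavefront thread and the debugger. `HaltingWavefront` and its methods are hypothetical names, and the 8-byte instruction stride is an assumption inferred from the sample addresses in this README; only the `halt_mutex` and `halt_cv` naming comes from the table.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <set>
#include <thread>

class HaltingWavefront {
public:
    // Instructions are assumed to sit at first_pc, first_pc + 8, ... < end_pc.
    HaltingWavefront(std::uint64_t first_pc, std::uint64_t end_pc)
        : pc_(first_pc), end_pc_(end_pc) {}

    // RAII: release a halted wavefront, let it run to completion, then join.
    ~HaltingWavefront() {
        {
            std::lock_guard<std::mutex> lock(halt_mutex_);
            detaching_ = true;
            halted_ = false;
        }
        halt_cv_.notify_all();
        if (worker_.joinable()) worker_.join();
    }

    void add_breakpoint(std::uint64_t address) {
        std::lock_guard<std::mutex> lock(halt_mutex_);
        breakpoints_.insert(address);
    }

    void launch() { worker_ = std::thread([this] { run(); }); }

    // Debugger side: block until the wavefront halts or finishes.
    // Returns true if it halted at a breakpoint.
    bool wait_for_stop(std::uint64_t& stopped_pc) {
        std::unique_lock<std::mutex> lock(halt_mutex_);
        halt_cv_.wait(lock, [this] { return halted_ || finished_; });
        stopped_pc = pc_;
        return halted_;
    }

    void resume() {
        {
            std::lock_guard<std::mutex> lock(halt_mutex_);
            halted_ = false;
        }
        halt_cv_.notify_all();
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(halt_mutex_);
            if (pc_ >= end_pc_) { finished_ = true; break; }
            if (!detaching_ && breakpoints_.count(pc_) != 0) {
                halted_ = true;
                halt_cv_.notify_all();  // tell the debugger we stopped
                halt_cv_.wait(lock, [this] { return !halted_; });
            }
            pc_ += 8;  // "execute" one instruction
        }
        halt_cv_.notify_all();  // report completion
    }

    std::mutex halt_mutex_;
    std::condition_variable halt_cv_;
    std::set<std::uint64_t> breakpoints_;
    std::uint64_t pc_;
    std::uint64_t end_pc_;
    bool halted_ = false;
    bool finished_ = false;
    bool detaching_ = false;
    std::thread worker_;
};
```

A single condition variable serves both directions here: the wavefront waits for `halted_` to clear, and the debugger waits for `halted_` or `finished_` to become true. The predicates keep the two kinds of waiter from waking on the wrong signal.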