Skip to content

Refactor packet_handler RTL for synthesis quality and handshake correctness#6

Open
kiro-agent[bot] wants to merge 9 commits into
jules-experimentingfrom
feat/rtl-synthesis-cleanup
Open

Refactor packet_handler RTL for synthesis quality and handshake correctness#6
kiro-agent[bot] wants to merge 9 commits into
jules-experimentingfrom
feat/rtl-synthesis-cleanup

Conversation

@kiro-agent
Copy link
Copy Markdown

@kiro-agent kiro-agent Bot commented May 20, 2026

This pull request was created by @kiro-agent on behalf of @RafailDD 👻

Comment with /kiro fix to address specific feedback or /kiro all to address everything.
Learn about Kiro autonomous agent


Summary

Refactor packet_handler.v for synthesis quality and handshake correctness, based on the jules-experimenting branch tip.

Synthesizability improvements

Change Why it matters
Move fifo_mem and packetTracker writes into their own reset-free always blocks Standard RAM-inference template — lets synthesis map both memories to distributed/block RAM instead of resettable flip-flops.
Replace the for (i=0; i<32; i=i+1) packetTracker[i] <= 0; reset loop with a 32-bit tracker_valid register Saves ~1024 reset-bearing flip-flops; only 32 FFs (tracker_valid) are reset; the data array is RAM-inference friendly.
Bound the packetTracker index to 5 bits (in_streamId[4:0] - 1) Previously a 16-bit subtraction indexed a 32-deep array — out-of-range behaviour was undefined.
fifo_full = fifo_count[FIFO_AW] instead of fifo_count == 9'd256 One bit lookup vs. 9-bit compare.
(* fsm_encoding = "one_hot" *) attribute on state Preserves the intended one-hot encoding through Vivado/Synplify; other tools ignore unknown attributes.
next_state = state; default at top of comb block Defensive against latch inference and non-onehot states; default branch still recovers to IDLE.
Removed o_data <= o_data; self-assignment Idiomatic clock-enable form.
Parameterised FIFO_DEPTH / NUM_STREAMS / STREAM_IDX_W Easier to retarget and read.

Functional fixes

Both fixes were verified by simulating the original and refactored RTL against packet_handler_tb.v with Icarus Verilog 12.0. The o_packetLost pulse is bit-identical at t = 235 ns, and o_valid timing matches; only o_data differs (correctly).

  1. AXI-stream handshake compliance. o_data is now loaded on the DATA -> DONE transition, so it is valid in the same cycle that o_valid rises. Previously o_data was only assigned in DONE when i_ready == 1, meaning a receiver sampling on (o_valid && i_ready) would capture the previous transaction's payload (or zeros after reset) on the first cycle of the handshake.

    • Original sim, first transaction: validOut=1, dataOut=000…0  :x:
    • New sim, first transaction: validOut=1, dataOut=…5a22b2d45a22b49e42e3c71e43995b7e  :white_check_mark:
  2. o_packetLost recovers after a gap. The tracker now stores the actual received seqNumber instead of unconditionally incrementing by 1. Combined with the new per-stream tracker_valid bit, the first packet of any stream is unconditionally accepted, and after a discontinuity the next sequential packet stops the pulse. Previously, once a packet was lost, every subsequent packet for that stream was incorrectly flagged as lost forever.

Verification

  • iverilog -g2005-sv -Wall packet_handler.v packet_handler_tb.v — clean (no warnings).
  • vvp of original vs. refactored design against the existing testbench — $monitor outputs match except for the o_data and msgLength/streamId differences described above (the second is internal-only and tagged lint_off UNUSEDSIGNAL).
  • o_packetLost pulses fire at the same time on the wrong-seqNumber packet.

Known follow-ups (not in this PR)

  • The FIFO uses a combinational read (fifo_data = fifo_mem[fifo_rd_ptr]), which favours distributed RAM inference. To target block RAM (BRAM) for the 256x33-bit FIFO, the FSM would need to be reworked for one-cycle read latency. Out of scope for this synthesis-quality pass.
  • The output o_data is 296 bits, but {shiftReg[263:0], i_data} only carries 9 full 32-bit words = 288 bits of payload through the shift register. Packets with the maximum msgLength = 45 (37 bytes payload) would lose the last byte. Existing limitation, not regressed here.

Target

This PR targets jules-experimenting (used as the basis), as requested.

google-labs-jules Bot and others added 9 commits April 9, 2026 06:55
- Replaces simplistic Verilog test bench with robust Python test bench.
- Adds `test_packet_handler.py` containing a comprehensive suite: basic, multiple streams, dropped packet detection, min length packets, and randomized testing.
- Includes a cocotb `Makefile` to run tests with Icarus Verilog.

Co-authored-by: RafailDD <123392637+RafailDD@users.noreply.github.com>

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
- Replaces simplistic Verilog test bench with robust Python test bench.
- Adds `test_packet_handler.py` containing a comprehensive suite: basic, multiple streams, dropped packet detection, min length packets, and randomized testing.
- Includes a cocotb `Makefile` to run tests with Icarus Verilog.

Co-authored-by: RafailDD <123392637+RafailDD@users.noreply.github.com>

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
- Replaces simplistic Verilog test bench with robust Python test bench.
- Adds `test_packet_handler.py` containing a comprehensive suite: basic, multiple streams, dropped packet detection, min length packets, and randomized testing.
- Includes a cocotb `Makefile` to run tests with Icarus Verilog.
- Fixes cocotb deprecation warnings (MODULE -> COCOTB_TEST_MODULES, units -> unit).

Co-authored-by: RafailDD <123392637+RafailDD@users.noreply.github.com>

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
…67122689535779

Add extensive cocotb test bench for packet handler
Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
Adds a 256-depth synchronous FIFO at the input to buffer incoming data
and allow continuous ingestion, decoupled from the FSM processing speed.
This prevents stalling the transmitter when the internal FSM is processing
or waiting for receiver readiness. The packet tracking logic to assert
o_packetLost has been adapted to evaluate stream sequences as data is pushed
into the FIFO.

Fixes an issue where o_data was partially initialized to 32 bits instead of 296.

Co-authored-by: RafailDD <123392637+RafailDD@users.noreply.github.com>

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
…375545090

Add Input FIFO to Packet Handler
Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
Creates a SystemVerilog class-based testbench that structurally mimics UVM without relying on the official IEEE 1800.2 UVM libraries, ensuring compatibility with open-source tools like Verilator.

Includes:
- Pseudo-randomization constraints via `packet_item` class methods.
- Generators, drivers, and scoreboards using SV classes.
- A functional coverage proxy inside the monitor that samples packet drops, minimum-length packets, and maximum-length packets.
- A dedicated `Makefile.sv` for standalone Verilator compilation and execution.

Co-authored-by: RafailDD <123392637+RafailDD@users.noreply.github.com>

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
…780058197

Add UVM-style pure SystemVerilog testbench for FOSS compatibility
Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
…ctness

Synthesizability:
* Move FIFO and packetTracker memory writes into reset-free always blocks
  so synthesis can map them to distributed/block RAM instead of resettable
  flip-flops.
* Replace the for-loop reset of packetTracker[31:0] (1024 FFs) with a
  per-stream tracker_valid bit array (32 FFs) that suppresses spurious
  o_packetLost on the first packet of each stream.
* Bound the packetTracker index to STREAM_IDX_W=5 bits; previously the
  16-bit (in_streamId-1) was used, allowing out-of-range access.
* Compute fifo_full from fifo_count[FIFO_AW] (single bit) instead of an
  equality compare to 9'd256.
* Tag the FSM with (* fsm_encoding = "one_hot" *) to preserve one-hot
  encoding through synthesis (Vivado-style hint, ignored elsewhere).
* Add next_state = state default at the top of the combinational block;
  default branch in the case still recovers to IDLE.
* Replace o_data <= o_data self-assignment with implicit hold.

Functional fixes (verified bit-identical o_packetLost timing on the
existing testbench):
* o_data is now loaded on the DATA -> DONE transition so it is valid in
  the same cycle that o_valid rises. Previously the receiver would sample
  stale data on the first handshake cycle (o_data was only assigned on
  i_ready==1 inside DONE).
* Sequence-number tracker now stores the actual received seqNumber, so
  after a discontinuity the next valid packet recovers cleanly. The
  previous behaviour incremented the tracker by 1 unconditionally and
  reported every subsequent packet as lost once a discontinuity occurred.

Verification: simulated original and refactored designs against
packet_handler_tb.v with iverilog. o_packetLost pulses identically at
t=235ns. o_valid timing is identical. o_data is correct (was 0 in the
original) during the first transaction's DONE wait.

Co-authored-by: Rafail Dalakouras <123392637+RafailDD@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants