
Add streaming GTCRN speech enhancement crate (ONNX via tract)#2

Open
czoli1976 wants to merge 3 commits into main from claude/gtrcn-audio-codec-rust-Z9HdW


@czoli1976

Owner

New gtcrn workspace crate runs the streaming GTCRN model frame-by-frame
through tract-onnx (sonos/tract main): optimized plan, a single reused
SimpleState, zero-copy TValue cache cycling, and a sqrt-Hann STFT/ISTFT
front end matching the reference. Output matches the reference enh.wav at
0.9966 correlation. Includes a CLI (gtcrn-enhance) with RTF reporting and a
per-node profiler example.

https://claude.ai/code/session_015Swk2UYJHW24eCpZVuTkdG
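For readers unfamiliar with the sqrt-Hann front end mentioned above, the following is a minimal sketch of why that window suits STFT/ISTFT round trips. It is not code from this PR. The frame length of 512 and hop of 256 used in the usage note are assumptions, and `sqrt_hann` and `overlap_add_gain` are hypothetical helper names. When the same sqrt-Hann window is applied at analysis and synthesis, each sample is weighted by the squared window, and at 50% overlap those weights sum to 1.

```rust
use std::f64::consts::PI;

/// Periodic sqrt-Hann window of length `n`:
/// w[i] = sqrt(0.5 - 0.5 * cos(2*pi*i / n)).
pub fn sqrt_hann(n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| (0.5 - 0.5 * (2.0 * PI * i as f64 / n as f64).cos()).sqrt())
        .collect()
}

/// Sum of squared window values that land on each position within one hop
/// when frames overlap at spacing `hop`. A value of 1.0 everywhere means
/// analysis + synthesis windowing reconstructs the signal without gain ripple.
pub fn overlap_add_gain(window: &[f64], hop: usize) -> Vec<f64> {
    let mut gain = vec![0.0; hop];
    for (i, w) in window.iter().enumerate() {
        gain[i % hop] += w * w;
    }
    gain
}
```

Calling `overlap_add_gain(&sqrt_hann(512), 256)` under these assumed sizes yields a gain of 1 (up to floating-point rounding) at every position, which is the property a matching ISTFT relies on.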

claude added 3 commits May 27, 2026 15:49
New `gtcrn` workspace crate runs the streaming GTCRN model frame-by-frame
through tract-onnx (sonos/tract main): optimized plan, a single reused
SimpleState, zero-copy TValue cache cycling, and a sqrt-Hann STFT/ISTFT
front end matching the reference. Output matches the reference enh.wav at
0.9966 correlation. Includes a CLI (gtcrn-enhance) with RTF reporting and a
per-node profiler example.

https://claude.ai/code/session_015Swk2UYJHW24eCpZVuTkdG

The streaming cache updates export as ScatterNd nodes whose indices are
compile-time constants. tract's generic ScatterNd walks every scattered
element through dynamic ndarray views, dominating per-frame cost (~73%).

Add a graph-rewrite pass (rewrite::replace_const_scatternd) that precomputes
flat (dst, src) block offsets once and lowers each such ScatterNd to a custom
FastScatterConst op doing a few copy_from_slice calls. All 18 nodes convert;
output is bit-identical, and per-frame time drops 13.7ms -> 4.2ms (RTF ~0.85
-> ~0.24 on this container). The profiler example applies the pass too.

https://claude.ai/code/session_015Swk2UYJHW24eCpZVuTkdG

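The idea behind the constant-index rewrite can be illustrated with a standalone sketch. This is not the PR's `FastScatterConst` op and does not use tract's API. `BlockCopy`, `plan_scatter`, and `apply_scatter` are hypothetical names, and the sketch assumes row-major tensors, index tuples of equal length that address the leading axes, and in-place updates. Because the indices are known ahead of time, the flat (dst, src) offsets are computed once, and each frame only performs a few `copy_from_slice` calls.

```rust
/// One contiguous copy: `len` elements from `updates[src..]` into `data[dst..]`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockCopy {
    pub dst: usize,
    pub src: usize,
    pub len: usize,
}

/// Row-major strides for `shape`.
fn strides(shape: &[usize]) -> Vec<usize> {
    let mut s = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        s[i] = s[i + 1] * shape[i + 1];
    }
    s
}

/// Precompute the block copies once, from constant indices.
/// Each entry of `indices` is a tuple of length k addressing the leading
/// k axes of the data tensor; the remaining axes form one contiguous block.
pub fn plan_scatter(data_shape: &[usize], indices: &[Vec<usize>]) -> Vec<BlockCopy> {
    let st = strides(data_shape);
    indices
        .iter()
        .enumerate()
        .map(|(n, idx)| {
            let len: usize = data_shape[idx.len()..].iter().product();
            let dst: usize = idx.iter().zip(&st).map(|(i, s)| i * s).sum();
            // Updates are laid out as one block per index tuple, in order.
            BlockCopy { dst, src: n * len, len }
        })
        .collect()
}

/// Per-frame work: a handful of slice copies, with no index arithmetic.
pub fn apply_scatter(plan: &[BlockCopy], data: &mut [f32], updates: &[f32]) {
    for b in plan {
        data[b.dst..b.dst + b.len].copy_from_slice(&updates[b.src..b.src + b.len]);
    }
}
```

In this sketch the plan would be built once at model-load time and reused for every frame, so the per-frame cost is bounded by the number of blocks rather than the number of scattered elements.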
Mirrors the onnxruntime measurement methodology (pure per-frame state.run
loop, no STFT, warmup + best-of-5) so tract vs onnxruntime RTF can be compared
apples-to-apples. Confirms ~0.23 RTF after the ScatterNd rewrite.

https://claude.ai/code/session_015Swk2UYJHW24eCpZVuTkdG
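The measurement methodology described above (warmup, then best-of-N timing of the inference loop, then a real-time factor) might look roughly like the sketch below. It is not the benchmark from this PR. `best_of` and `rtf` are hypothetical helpers, and the closure stands in for whatever runs one full pass over the frames. RTF is taken here as processing time divided by the duration of the audio processed, so values below 1 mean faster than real time.

```rust
use std::time::{Duration, Instant};

/// Run `pass` `warmup` times untimed, then `runs` times timed,
/// and return the fastest timed run. Returns `Duration::MAX` if `runs` is 0.
pub fn best_of<F: FnMut()>(warmup: usize, runs: usize, mut pass: F) -> Duration {
    for _ in 0..warmup {
        pass();
    }
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        pass();
        best = best.min(start.elapsed());
    }
    best
}

/// Real-time factor: processing time divided by audio duration in seconds.
pub fn rtf(elapsed: Duration, audio_secs: f64) -> f64 {
    elapsed.as_secs_f64() / audio_secs
}
```

A caller would wrap the per-frame inference loop in the closure, for example `best_of(1, 5, || run_all_frames())` with a hypothetical `run_all_frames`, and pass the result to `rtf` together with the clip length. Taking the minimum rather than the mean reduces the influence of scheduler noise on a shared machine.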
