transformers: KIVI-style KV-cache quantization — packed u8 storage, ~4× memory vs f32 #2329

Merged
kali merged 3 commits into sonos:main from czoli1976:feature/kv-quant on Jun 18, 2026

transformers: NNEF ser/de for QuantizedKvSdpa (registered)

39b5c38