[Feature Request]: Collect prefix cache hit rate metrics during benchmark #970

@Potterluo

🚀 The feature, motivation and pitch

vLLM exposes prefix cache hit-rate data through its metrics interface. Could llmperf collect these hit-rate metrics during a benchmark run and report them when the run completes?
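To make the request concrete, here is a minimal sketch of how the collection could work: scrape the server's Prometheus-style `/metrics` text before and after the run, then compute the hit rate from the counter deltas. The counter names `vllm:prefix_cache_queries_total` and `vllm:prefix_cache_hits_total`, the `engine` label, and the helper names are assumptions on my side and may differ between vLLM versions, so please check the actual `/metrics` output of the deployment.

```python
import re
import urllib.request
from collections import defaultdict

# Assumed counter names; verify against the /metrics output of your vLLM version.
QUERIES = "vllm:prefix_cache_queries_total"
HITS = "vllm:prefix_cache_hits_total"

# Matches one sample line of the Prometheus text format: name{labels} value
_SAMPLE = re.compile(r'^(?P<name>[^\s{]+)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def fetch_metrics(base_url):
    """Download the raw metrics text (hypothetical helper, e.g. base_url='http://host:8000')."""
    with urllib.request.urlopen(f"{base_url}/metrics") as resp:
        return resp.read().decode("utf-8")


def parse_counters(text, names=(QUERIES, HITS)):
    """Return {engine: {metric_name: value}} for the requested metric names."""
    out = defaultdict(dict)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = _SAMPLE.match(line)
        if not m or m.group("name") not in names:
            continue
        labels = dict(_LABEL.findall(m.group("labels") or ""))
        engine = labels.get("engine", "0")  # assumed label; falls back to "0"
        out[engine][m.group("name")] = float(m.group("value"))
    return dict(out)


def hit_rates(before, after):
    """Per-engine hit rate over the benchmark window, from counter deltas."""
    rates = {}
    for engine, counters in after.items():
        base = before.get(engine, {})
        queried = counters.get(QUERIES, 0.0) - base.get(QUERIES, 0.0)
        hits = counters.get(HITS, 0.0) - base.get(HITS, 0.0)
        rates[engine] = hits / queried if queried > 0 else 0.0
    return rates
```

Usage would be roughly `before = parse_counters(fetch_metrics(url))`, run the benchmark, then `after = parse_counters(fetch_metrics(url))` and `hit_rates(before, after)`. Using deltas rather than absolute counter values keeps traffic from earlier runs out of the result.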

Reference implementation: https://github.com/rayn-zzz/aisbench_auto_tools_prefix/tree/main

Reference logs:

2026-05-19 12:18:31,301 - INFO - [Completed] Prefix dataset testing completed
2026-05-19 12:18:37,385 - INFO - ----------------------prefix cache metrics: engine 0----------------------
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] Prefix cache queried tokens: 44460
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] Prefix cache hit tokens: 14592
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] Prefix cache hit rate (hit tokens / queried tokens): 32.82%
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] External queried tokens: 29868
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] External hit tokens: 21888
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 0] External hit rate (hit tokens / queried tokens): 73.28%
2026-05-19 12:18:37,385 - INFO - ----------------------prefix cache metrics: engine 1----------------------
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] Prefix cache queried tokens: 14820
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] Prefix cache hit tokens: 0
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] Prefix cache hit rate (hit tokens / queried tokens): 0.00%
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] External queried tokens: 14820
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] External hit tokens: 7296
2026-05-19 12:18:37,385 - INFO - [prefix cache metrics: engine 1] External hit rate (hit tokens / queried tokens): 49.23%
2026-05-19 12:18:37,419 - INFO - Successfully appended aisbench.log content to aisbench_all.log
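For reference, the percentages in these logs follow directly from the logged token counts (hit tokens / queried tokens). The short sketch below recomputes them from the numbers above; the dictionary layout and the `pct` helper are only illustrative.

```python
# (hit tokens, queried tokens) copied from the reference logs above.
counts = {
    0: {"prefix": (14592, 44460), "external": (21888, 29868)},
    1: {"prefix": (0, 14820), "external": (7296, 14820)},
}


def pct(hit, queried):
    """Format a hit rate as a percentage with two decimals."""
    return f"{100 * hit / queried:.2f}%" if queried else "n/a"


for engine, kinds in counts.items():
    for kind, (hit, queried) in kinds.items():
        print(f"engine {engine} {kind} hit rate: {pct(hit, queried)}")
```

In these logs the external queried tokens equal the prefix cache queried tokens minus the prefix cache hit tokens (44460 - 14592 = 29868 for engine 0), which is consistent with the external lookup only being asked for tokens that missed the local prefix cache. The logs themselves do not state this, so treat it as an observation rather than a documented guarantee.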

Alternatives

No response

Additional context

No response
