fix(pricing): resolve MiniMax-M3 (provider prefix + pre-registration) #1186
PR governance: This PR follows the template and is marked ready for human review.
The proxy's cost dashboard reports $0.00 for every MiniMax call because:
1. resolve_litellm_model() has no 'minimax/' provider prefix
2. The existing prefix check is case-sensitive; MiniMax uses mixed-case
model names like 'MiniMax-M3'.
Fix both, and pre-register 'MiniMax-M3' in litellm.model_cost as
'minimax/MiniMax-M3' at module load (safety net for cold cache).
Closes: cost dashboard showing $0 savings despite real compression.
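The two resolver fixes described above can be sketched roughly as follows. This is a minimal illustration, not the code in `litellm_pricing.py`: `resolve_model`, `PROVIDER_PREFIXES`, and `known_keys` are hypothetical names, and only the `"minimax-"` to `"minimax/"` mapping is taken from this PR (the other two entries are assumed for illustration).

```python
# Hypothetical prefix table. Only "minimax-" -> "minimax/" comes from this PR;
# the other entries are assumptions added for illustration.
PROVIDER_PREFIXES = {
    "claude-": "anthropic/",
    "gpt-": "openai/",
    "minimax-": "minimax/",
}


def resolve_model(model: str, known_keys: set[str]) -> str:
    """Return a provider-prefixed key if the pricing table has one, else the input."""
    if model in known_keys:
        return model
    model_lower = model.lower()  # compare prefixes case-insensitively
    for pattern, provider in PROVIDER_PREFIXES.items():
        if model_lower.startswith(pattern):
            candidate = provider + model  # keep the caller's original casing
            if candidate in known_keys:
                return candidate
    return model  # unresolved: the caller gets the bare name back


known = {"minimax/MiniMax-M3", "gpt-4o"}
print(resolve_model("MiniMax-M3", known))     # minimax/MiniMax-M3
print(resolve_model("gpt-4o", known))         # gpt-4o
print(resolve_model("unknown-model", known))  # unknown-model
```

Lowercasing only the comparison string, and not the returned key, matters here because the pricing entry itself is stored with mixed case.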
JerrettDavis
left a comment
The pricing fix itself is the right direction: the mixed-case MiniMax resolver path is covered, and the bare-name pre-registration gives estimate_cost("MiniMax-M3", ...) a direct lookup path.
The branch cannot merge while lint is red, though. The latest lint job has ruff check . passing, then fails at formatting:
Would reformat: tests/test_pricing_litellm.py
Please run ruff format tests/test_pricing_litellm.py and push the formatted file so CI can go green.
JerrettDavis
left a comment
The formatting blocker is resolved and the MiniMax pricing change looks good. The resolver now matches MiniMax-M3 case-insensitively through the minimax/ prefix, and the bare-name pre-registration makes estimate_cost("MiniMax-M3", ...) work even when callers do not resolve a prefixed LiteLLM key first. The tests cover both the resolver and the pre-registration safety net without clobbering a customized bare entry.
Lint/build/e2e are green. The current red test shards I inspected on similar heads are the shared HuggingFace cache issue in memory tests, not this pricing path.
Description
Fixes the cost dashboard reporting `$0.00` for every call when the upstream model is `MiniMax-M3` (Anthropic-compatible endpoint served from the `MiniMax` provider).

Two root causes in `headroom/pricing/litellm_pricing.py`:

1. `resolve_litellm_model()` had no `minimax/` provider prefix. LiteLLM's community pricing database stores MiniMax-M3 only under `minimax/MiniMax-M3`. The resolver never tried that prefix, so callers in `proxy/cost.py`, `proxy/savings_tracker.py`, and `perf/analyzer.py` silently fell back to the unresolved name.
2. The prefix check was case-sensitive. MiniMax uses mixed-case model names (`MiniMax-M3`), but every existing prefix pattern (`claude-`, `gpt-`, `o1-`, …) was lowercase, so even after adding `"minimax-"` the bare `MiniMax-M3` wouldn't match.

This PR fixes both and adds a `_register_minimax_pricing()` helper that pre-populates `litellm.model_cost["MiniMax-M3"]` from `minimax/MiniMax-M3` at module load. This is a safety net so that `estimate_cost()` (which doesn't know the `minimax/` prefix internally) succeeds even on a cold resolver cache or if LiteLLM drops the prefixed entry in a future release.

Net change: +97 / −1 lines across 2 files (one production file + one test file).
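The pre-registration step might look something like the sketch below. It is an illustration under stated assumptions: `register_bare_alias` is a hypothetical generalisation of `_register_minimax_pricing()`, and a plain dict stands in for `litellm.model_cost` so the example runs without LiteLLM installed.

```python
def register_bare_alias(model_cost: dict, prefixed: str, bare: str) -> bool:
    """Copy pricing from `prefixed` to `bare` unless `bare` already exists.

    Returns True when a copy was made, False when the call was a no-op.
    """
    if prefixed not in model_cost or bare in model_cost:
        # Nothing to copy from, or an existing (possibly customised) entry to keep.
        return False
    model_cost[bare] = dict(model_cost[prefixed])  # shallow copy, not a shared reference
    return True


# Stand-in for litellm.model_cost; the rates are the per-token form of
# $0.60/M input and $2.40/M output quoted in this PR.
costs = {
    "minimax/MiniMax-M3": {
        "input_cost_per_token": 0.60 / 1_000_000,
        "output_cost_per_token": 2.40 / 1_000_000,
    }
}
print(register_bare_alias(costs, "minimax/MiniMax-M3", "MiniMax-M3"))  # True
print(register_bare_alias(costs, "minimax/MiniMax-M3", "MiniMax-M3"))  # False
```

Returning early when the bare key already exists is what keeps a user-customised entry from being overwritten, and it also makes the call safe to repeat.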
Closes #
Type of Change
Changes Made
Production (`headroom/pricing/litellm_pricing.py`):
- Add `"minimax-": "minimax/"` to the provider-prefix table in `_resolve_litellm_model_uncached()` so the resolver knows about the MiniMax provider.
- Compute `model_lower = model.lower()` and match prefixes against it instead of `model`, so the mixed-case bare name `MiniMax-M3` resolves correctly. The existing prefixes (`claude-`, `gpt-`, `o1-`, `o3-`, `o4-`, `gemini-`) are already lowercase patterns matched against canonical lowercase names (`claude-sonnet-4-5-…`, `gpt-4o`, `gemini-2.0-flash`), so there is no regression.
- Add `_register_minimax_pricing()`: if `minimax/MiniMax-M3` is in `litellm.model_cost` and `MiniMax-M3` is not, copy the pricing dict under the bare key. No-op on older LiteLLM (entry missing) or when the user has already customised `MiniMax-M3`.
- Call `_register_minimax_pricing()` once at module import.

Tests (`tests/test_pricing_litellm.py`):
- `test_litellm_minimax_mixed_case_with_provider_prefix`: verifies `resolve_litellm_model("MiniMax-M3")` returns `"minimax/MiniMax-M3"` via the case-insensitive prefix match.
- `test_litellm_minimax_preregistration_safety_net`: verifies the pre-registration populates the bare `MiniMax-M3` key, that `estimate_cost()` returns the correct dollar figure (`0.84` for 1M in + 100k out), and that a user-customised bare entry is never clobbered.

Testing
- `pytest`
- `ruff check .`
- `mypy headroom`
Test Output
Manual reproducer (matches the PR writeup):
Real Behavior Proof
- Environment: `headroom-ai` installed editable via `uv` from this branch, `litellm` pulled from PyPI on first run.
- Steps: `git checkout fix/minimax-pricing && uv sync --all-extras --dev`, then run (1) `uv run python -c "from headroom.pricing.litellm_pricing import resolve_litellm_model, estimate_cost; import litellm; print(resolve_litellm_model('MiniMax-M3'), 'MiniMax-M3' in litellm.model_cost, estimate_cost('MiniMax-M3', 1_000_000, 100_000))"`, then (2) `uv run pytest tests/test_pricing_litellm.py -v`, then (3) `uv run ruff check headroom/pricing/litellm_pricing.py tests/test_pricing_litellm.py`, then (4) `uv run mypy headroom/pricing/litellm_pricing.py`.
- Results: (1) `resolve_litellm_model('MiniMax-M3')` returns `minimax/MiniMax-M3` (was `MiniMax-M3`, unresolved); `'MiniMax-M3' in litellm.model_cost` is `True` (proves `_register_minimax_pricing()` ran); `estimate_cost('MiniMax-M3', 1_000_000, 100_000)` returns `0.84` (matches $0.60/M in × 1M + $2.40/M out × 0.1M). (2) All 7 tests in `tests/test_pricing_litellm.py` pass (5 pre-existing + 2 new MiniMax-specific). (3) `ruff` reports `All checks passed!`. (4) `mypy` reports `Success: no issues found in 1 source file`.
- Not tested: a live `MiniMax-M3` endpoint, because no Anthropic-compatible key is configured in this environment. The reproducer exercises the exact code path the proxy's cost accumulator uses, but I did not point the proxy at a real upstream.
Review Readiness
Checklist
`litellm_pricing.py` directly; the only public API affected (`estimate_cost`) now returns correct values for a previously unsupported model.
Additional Notes
- `estimate_cost()` calls `get_model_pricing()` directly, and `get_model_pricing()` has its own hardcoded prefix list `["openai/", "anthropic/", "google/", "mistral/", "deepseek/"]` that does not include `minimax/`. So the prefix resolver alone is not enough for `estimate_cost("MiniMax-M3")` to return a non-`None` number; the pre-registration step is what makes the bare name resolve. The prefix resolver change matters for the proxy's cost/savings/perf code paths that call `resolve_litellm_model()` and then look up the prefixed string themselves.
- The existing prefixes are matched against names that are already lowercase, so lowercasing before `startswith()` is a no-op for them. Only the new `"minimax-"` entry uses a mixed-case input.
- The pricing registered by `_register_minimax_pricing()` mirrors upstream LiteLLM (input $0.60/M, output $2.40/M, cache read $0.12/M as of 2026-06). Re-check after LiteLLM updates; the function already short-circuits when the user has customised the entry.
- Local test runs covered only `tests/test_pricing_litellm.py`. Wider CI will catch anything I missed.
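To make the first note concrete, here is a small sketch of a lookup that only tries a fixed prefix list, which is why a pricing entry stored solely under `minimax/MiniMax-M3` is missed. `get_pricing` and `estimate` are hypothetical stand-ins for `get_model_pricing()` and `estimate_cost()`, returning `None` on a miss is an assumption, and the dicts stand in for `litellm.model_cost`. The prefix list and the rates are the ones quoted in this PR.

```python
# Prefix list quoted in the note above; note that "minimax/" is absent.
LOOKUP_PREFIXES = ["openai/", "anthropic/", "google/", "mistral/", "deepseek/"]


def get_pricing(model, model_cost):
    """Look up pricing by exact name, then by each known provider prefix."""
    if model in model_cost:
        return model_cost[model]
    for prefix in LOOKUP_PREFIXES:
        entry = model_cost.get(prefix + model)
        if entry is not None:
            return entry
    return None


def estimate(model, input_tokens, output_tokens, model_cost):
    """Return the dollar cost, or None when no pricing entry is found."""
    pricing = get_pricing(model, model_cost)
    if pricing is None:
        return None
    return (
        input_tokens * pricing["input_cost_per_token"]
        + output_tokens * pricing["output_cost_per_token"]
    )


entry = {
    "input_cost_per_token": 0.60 / 1_000_000,   # $0.60 per million input tokens
    "output_cost_per_token": 2.40 / 1_000_000,  # $2.40 per million output tokens
}

only_prefixed = {"minimax/MiniMax-M3": entry}
print(estimate("MiniMax-M3", 1_000_000, 100_000, only_prefixed))  # None

with_bare = {**only_prefixed, "MiniMax-M3": entry}  # what pre-registration adds
print(round(estimate("MiniMax-M3", 1_000_000, 100_000, with_bare), 2))  # 0.84
```

The second result reproduces the arithmetic in the PR: $0.60 for 1M input tokens plus $0.24 for 100k output tokens gives $0.84.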