Skip to content

fix(providers): pin an explicit timeout on Mistral embedding requests#864

Open
cbcoutinho wants to merge 1 commit into
masterfrom
fix/mistral-embed-connect-timeout
Open

fix(providers): pin an explicit timeout on Mistral embedding requests#864
cbcoutinho wants to merge 1 commit into
masterfrom
fix/mistral-embed-connect-timeout

Conversation

@cbcoutinho

Copy link
Copy Markdown
Owner

Summary

MistralProvider made embedding calls without setting any timeout, leaving the bound entirely to the mistralai SDK's internal default. This change passes an explicit timeout_ms (60s) on every embeddings.create_async call so the timeout is intentional and tunable.

This surfaced while investigating why a tenant's hourly vector-sync wasn't picking up newly uploaded files. The scanner/processor was healthy and Qdrant calls were bounded (qdrant_client.py passes timeout=30, which httpx applies to all phases incl. connect), but the Mistral embedding call — the one external dependency in the single-worker processor path — had no application-level timeout configured. Since the processor runs single-worker (VECTOR_SYNC_PROCESSOR_WORKERS=1), any stall there serializes and backs up the in-memory ingest queue.

Why not configure a client-level connect timeout?

The obvious fix — inject httpx.AsyncClient(timeout=httpx.Timeout(60.0, connect=5.0)) into the SDK — does not work for embeddings:

  • The SDK's generated embeddings.create_async hard-defaults timeout_ms (60_000) and feeds that scalar into httpx.build_request(timeout=...), which replaces any client-configured httpx.Timeout. An injected client is silently ignored.
  • timeout_ms is a single scalar, so a separate connect timeout can't be expressed via this SDK at all.

This is the residual of upstream mistralai/client-python#449 (the original timeout=None-overrides-the-client hang, fixed in v2.3.0; the server-side hang is tracked in #474). A NOTE in mistral.py records the rationale so the next person doesn't re-attempt the injected-client route.

We deliberately pin mistralai==2.4.5 (2.4.6 was a supply-chain compromise — #523), so making the bound explicit here keeps it independent of the SDK's internal default.

Changes

  • providers/mistral.py: add _EMBED_TIMEOUT_MS constant (with explanatory NOTE) and pass it on both embeddings.create_async calls.
  • tests/unit/providers/test_mistral.py: assert timeout_ms is forwarded.

Testing

  • ruff check / ruff format --check / ty check — clean
  • pytest tests/unit/providers/ -q — 24 passed

This PR was generated with the help of AI, and reviewed by a Human

MistralProvider relied on the mistralai SDK's internal default timeout for
embedding calls. Pass an explicit `timeout_ms` on every `embeddings.create_async`
instead, so the bound is intentional and tunable rather than dependent on the
SDK's hard-coded per-method default.

A client-level `httpx.Timeout(..., connect=...)` is NOT a viable alternative
here: the SDK's generated embeddings methods hard-default `timeout_ms` and pass
that scalar straight into `httpx.build_request(timeout=...)`, which replaces any
client-configured timeout — so an injected httpx client is ignored and a
separate connect timeout cannot be expressed (single scalar only). This is the
residual of upstream mistralai/client-python#449 (the original `timeout=None`
override hang, fixed in v2.3.0; server-side hang tracked in #474). A NOTE in the
code records this. We deliberately pin 2.4.5 (2.4.6 was a supply-chain
compromise, #523).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jun 5, 2026

Copy link
Copy Markdown

Claude finished @cbcoutinho's task in 1m 17s —— View job


Code Review

  • Gather context and understand the request
  • Read changed files (mistral.py, test_mistral.py)
  • Review for correctness, style, test coverage, and security
  • Post final review

Overall this is a clean, well-motivated fix. The rationale is clearly documented, both call sites are covered, and the approach is correct given the SDK's constraints. A few observations below.


providers/mistral.py

_EMBED_TIMEOUT_MS placement and value (mistral.py:47)

The constant correctly shadows the SDK's opaque internal default with an explicit, auditable value. 60 s is reasonable for an embedding call and matches the SDK's own default, so this is a no-op in terms of actual behavior today — the value of the change is making the bound intentional and making future tuning obvious. That's exactly the right move.

NOTE comment (mistral.py:30-46)

This is a textbook example of a comment that earns its place: it explains why client-level timeout injection silently fails, cites the upstream issue numbers, and records the supply-chain pin rationale. Nothing to cut here.

Coverage of both call sites (mistral.py:122, mistral.py:174)

Both embeddings.create_async calls now receive timeout_ms.


tests/unit/providers/test_mistral.py

Single-embed assertion (test_mistral.py:56-60)

assert_awaited_once_with(..., timeout_ms=_EMBED_TIMEOUT_MS) is the right call — it pins the full kwargs contract, not just return values.

Minor gap — batch path not asserted

test_mistral_embedding_batch_single_call only checks await_count == 1; it doesn't verify timeout_ms was forwarded. Similarly, test_mistral_embedding_batch_chunking uses **_kwargs in the side-effect, which silently absorbs timeout_ms without asserting it. If timeout_ms were accidentally dropped from _embed_batch_request, both tests would still pass.

A straightforward fix for test_mistral_embedding_batch_single_call:

mock_mistral_client.embeddings.create_async.assert_awaited_once_with(
    model="mistral-embed",
    inputs=["a", "b", "c"],
    timeout_ms=_EMBED_TIMEOUT_MS,
)

And for the chunking test, checking at least one call's kwargs:

for call in mock_mistral_client.embeddings.create_async.await_args_list:
    assert call.kwargs["timeout_ms"] == _EMBED_TIMEOUT_MS

Fix this →


Minor observations (nits, not blockers)

  • _EMBED_TIMEOUT_MS is a private constant but imported directly in tests. That's fine for unit testing — no need to expose a public alias.
  • The value is not yet overridable via environment variable (e.g. MISTRAL_EMBED_TIMEOUT_MS). The PR description notes it's "tunable," but tuning currently requires a code change. This is reasonable for now given the scope — worth noting if there's appetite for operator-level configurability later.

Summary: The fix is correct and well-documented. The only actionable gap is adding timeout_ms assertions to the batch-path tests so that both code paths are fully covered. Everything else is solid.

@sonarqubecloud

sonarqubecloud Bot commented Jun 5, 2026

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant