fix(providers): pin an explicit timeout on Mistral embedding requests #864
cbcoutinho wants to merge 1 commit into
MistralProvider relied on the mistralai SDK's internal default timeout for embedding calls. Pass an explicit `timeout_ms` on every `embeddings.create_async` instead, so the bound is intentional and tunable rather than dependent on the SDK's hard-coded per-method default.

A client-level `httpx.Timeout(..., connect=...)` is NOT a viable alternative here: the SDK's generated embeddings methods hard-default `timeout_ms` and pass that scalar straight into `httpx.build_request(timeout=...)`, which replaces any client-configured timeout. An injected httpx client is therefore ignored, and a separate connect timeout cannot be expressed (single scalar only).

This is the residual of upstream mistralai/client-python#449 (the original `timeout=None` override hang, fixed in v2.3.0; server-side hang tracked in #474). A NOTE in the code records this. We deliberately pin 2.4.5 (2.4.6 was a supply-chain compromise, #523).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
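The override behaviour described above can be modelled in a few lines of plain Python. This is a simplified model, not httpx or SDK code: `PhaseTimeouts`, `USE_CLIENT_DEFAULT` and `resolve_timeout` are hypothetical names, and the only behaviour assumed is what the commit message states (a per-request scalar replaces the client-level configuration and covers every phase with the same value).

```python
from dataclasses import dataclass

# Sentinel meaning "the caller passed no per-request timeout".
USE_CLIENT_DEFAULT = object()


@dataclass(frozen=True)
class PhaseTimeouts:
    """Hypothetical per-phase timeouts, in seconds."""

    connect: float
    read: float

    @classmethod
    def from_scalar(cls, seconds):
        # A single scalar can only set every phase to the same value.
        return cls(connect=seconds, read=seconds)


def resolve_timeout(client_timeout, request_timeout=USE_CLIENT_DEFAULT):
    """Return the timeouts that actually apply to one request."""
    if request_timeout is USE_CLIENT_DEFAULT:
        return client_timeout
    # A per-request value replaces the client configuration wholesale.
    return PhaseTimeouts.from_scalar(request_timeout)


if __name__ == "__main__":
    client_cfg = PhaseTimeouts(connect=5.0, read=60.0)
    timeout_ms = 60_000  # scalar that is always sent with the request
    print(resolve_timeout(client_cfg, timeout_ms / 1000))
```

In this model the 5-second connect bound never survives once a request-level scalar is supplied, which is why the commit tunes the scalar itself rather than the client.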
Claude finished @cbcoutinho's task in 1m 17s
Code Review
Overall this is a clean, well-motivated fix. The rationale is clearly documented, both call sites are covered, and the approach is correct given the SDK's constraints. A few observations below.

Summary
`MistralProvider` made embedding calls without setting any timeout, leaving the bound entirely to the `mistralai` SDK's internal default. This change passes an explicit `timeout_ms` (60s) on every `embeddings.create_async` call so the timeout is intentional and tunable.

This surfaced while investigating why a tenant's hourly vector-sync wasn't picking up newly uploaded files. The scanner/processor was healthy and Qdrant calls were bounded (`qdrant_client.py` passes `timeout=30`, which httpx applies to all phases, including connect), but the Mistral embedding call, the one external dependency in the single-worker processor path, had no application-level timeout configured. Since the processor runs single-worker (`VECTOR_SYNC_PROCESSOR_WORKERS=1`), any stall there serializes and backs up the in-memory ingest queue.

Why not configure a client-level connect timeout?
The obvious fix, injecting `httpx.AsyncClient(timeout=httpx.Timeout(60.0, connect=5.0))` into the SDK, does not work for embeddings:

- `embeddings.create_async` hard-defaults `timeout_ms` (60_000) and feeds that scalar into `httpx.build_request(timeout=...)`, which replaces any client-configured `httpx.Timeout`. An injected client is silently ignored.
- `timeout_ms` is a single scalar, so a separate `connect` timeout can't be expressed via this SDK at all.

This is the residual of upstream mistralai/client-python#449 (the original `timeout=None`-overrides-the-client hang, fixed in v2.3.0; the server-side hang is tracked in #474). A `NOTE` in `mistral.py` records the rationale so the next person doesn't re-attempt the injected-client route.

We deliberately pin `mistralai==2.4.5` (2.4.6 was a supply-chain compromise, #523), so making the bound explicit here keeps it independent of the SDK's internal default.

Changes
- `providers/mistral.py`: add `_EMBED_TIMEOUT_MS` constant (with explanatory NOTE) and pass it on both `embeddings.create_async` calls.
- `tests/unit/providers/test_mistral.py`: assert `timeout_ms` is forwarded.

Testing
- `ruff check` / `ruff format --check` / `ty check`: clean
- `pytest tests/unit/providers/ -q`: 24 passed

This PR was generated with the help of AI, and reviewed by a Human
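For readers without the diff at hand, the pattern this PR describes can be sketched roughly as follows. This is a minimal illustration, not the code from `providers/mistral.py`: the `embed` helper, the `inputs` keyword, the `mistral-embed` model name and the stub client are assumptions made for the example; only `_EMBED_TIMEOUT_MS`, `timeout_ms` and `embeddings.create_async` are taken from the description.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# Explicit bound for embedding requests, in milliseconds (60s per the summary).
_EMBED_TIMEOUT_MS = 60_000


async def embed(client, texts, model="mistral-embed"):
    """Hypothetical helper: request embeddings with an explicit timeout."""
    # Pass the bound on every call instead of relying on the SDK default.
    return await client.embeddings.create_async(
        model=model,
        inputs=texts,
        timeout_ms=_EMBED_TIMEOUT_MS,
    )


def make_stub_client():
    """Stand-in for the SDK client so the sketch runs without network access."""
    client = MagicMock()
    client.embeddings.create_async = AsyncMock(return_value={"data": []})
    return client


if __name__ == "__main__":
    stub = make_stub_client()
    asyncio.run(embed(stub, ["hello"]))
    # Same intent as the unit test in this PR: the timeout must be forwarded.
    forwarded = stub.embeddings.create_async.await_args.kwargs["timeout_ms"]
    assert forwarded == _EMBED_TIMEOUT_MS
```

Keeping the value in one module-level constant means both call sites share the same bound, and a later change to it touches a single line.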