Currently, the LiteLLM proxy configuration does not include the encrypted_content_affinity pre-call check.
For Responses API models (like gpt-5.1-codex and gpt-5.3-codex), if clients chain calls by passing previous_response_id or encrypted reasoning items, these follow-up requests must route to the same Azure deployment that generated the encryption key.
If we ever add multiple deployments per model_name (e.g., load balancing gpt-5.1-codex across both germanywestcentral and swedencentral under the exact same alias), a chained request can land on a different deployment than the one that produced the earlier response. Those requests will fail with invalid_encrypted_content errors unless affinity routing is enabled.
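For illustration, the scale-out scenario described above might look roughly like the following in the proxy config. This is only a sketch: the deployment names, api_base hosts, and environment variable names are placeholders, not values from our setup, and the exact litellm_params we would use may differ.

```yaml
# Hypothetical future layout: two regional deployments share one model_name alias.
# All deployment names, hosts, and env var names below are placeholders.
model_list:
  - model_name: gpt-5.1-codex
    litellm_params:
      model: azure/gpt-5.1-codex          # placeholder Azure deployment name
      api_base: https://example-germanywestcentral.openai.azure.com
      api_key: os.environ/AZURE_API_KEY_GWC
  - model_name: gpt-5.1-codex
    litellm_params:
      model: azure/gpt-5.1-codex          # placeholder Azure deployment name
      api_base: https://example-swedencentral.openai.azure.com
      api_key: os.environ/AZURE_API_KEY_SWC
```

With a layout like this, the router is free to pick either entry for each call, which is where a follow-up request could be sent to a region that cannot decrypt the earlier reasoning items.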
To fix this when we expand to load balancing, enable the check under router_settings in the proxy config:
router_settings:
  optional_pre_call_checks:
    - encrypted_content_affinity
Note: This is not strictly necessary right now because each model_name in openai.tf maps to a single regional deployment. We are tracking this purely for future scale-out scenarios.