Skip to content

Add encrypted_content_affinity to config.yaml for future Responses API load balancing #2

@semidark

Description

@semidark

Currently, the LiteLLM proxy configuration does not include the encrypted_content_affinity pre-call check.

For Responses API models (like gpt-5.1-codex and gpt-5.3-codex), if clients chain calls by passing previous_response_id or encrypted reasoning items, these follow-up requests must route to the same Azure deployment that generated the encryption key.

If we ever add multiple deployments per model_name (e.g., load balancing gpt-5.1-codex across both germanywestcentral and swedencentral under the exact same alias), requests will fail with invalid_encrypted_content errors unless affinity routing is enabled.

To fix this when we expand to load balancing:

router_settings:
  optional_pre_call_checks:
    - encrypted_content_affinity

Note: This is not strictly necessary right now because each model_name in openai.tf maps to a single regional deployment. We are tracking this purely for future scale-out scenarios.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions