Harden Python sandbox transient retries #154

Merged
mstolarzblaxelai merged 1 commit into main from codex/python-sandbox-transient-retries on Jun 3, 2026

@mstolarzblaxelai (Contributor) commented Jun 3, 2026

Summary:

  • Adds shared transient-reset retry handling for idempotent sandbox read/list and upload PUT paths.
  • Keeps non-idempotent process execution and cleanup operations single-attempt.
  • Adds local loopback transport-fault coverage for real httpx connection drops.

Note

Adds a shared transient_retry module with exponential-backoff retry logic for transport-level failures (connection resets, GOAWAY, timeouts). Wraps all idempotent sandbox read/list operations and upload PUT paths in both the async and sync clients. Non-idempotent operations (process start/stop/kill) are intentionally left single-attempt. Adds two new settings (BL_FS_PART_RETRIES, BL_SANDBOX_READ_RETRIES) to configure the retry budget for each operation class.

Written by Mendral for commit af560da.
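
For readers who have not opened the diff, the behavior described in the note can be sketched roughly as below. This is a minimal illustration, not the PR's code: `with_transient_retry`, `retry_budget`, `TRANSIENT_ERRORS`, and the delay values are hypothetical names and numbers, and built-in exception types stand in for the httpx transport errors the real module presumably inspects.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: built-in exceptions stand in for the transport-level
# failures (resets, timeouts) that the PR's module checks for.
TRANSIENT_ERRORS = (ConnectionResetError, ConnectionAbortedError, TimeoutError)


def retry_budget(env_name: str, default: int) -> int:
    """Read a retry budget from the environment; use the default on bad input."""
    try:
        return max(0, int(os.environ.get(env_name, default)))
    except ValueError:
        return default


def with_transient_retry(
    op: Callable[[], T],
    retries: int,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run an idempotent operation, retrying transient failures with backoff."""
    attempt = 0
    while True:
        try:
            return op()
        except TRANSIENT_ERRORS:
            if attempt >= retries:
                raise  # budget exhausted: surface the original error
            # Exponential backoff: base, 2x base, 4x base, capped at max_delay.
            sleep(min(max_delay, base_delay * (2 ** attempt)))
            attempt += 1
```

A read path might then call `with_transient_retry(fetch, retries=retry_budget("BL_SANDBOX_READ_RETRIES", 3))`, where `fetch` and the default of 3 are assumptions made for illustration. Non-idempotent calls such as process start would bypass the wrapper and run once.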

@mendral-app (Bot, Contributor) left a comment

LGTM

The retry logic is correct — is_transient_reset_error properly gates retries on transport-level errors only, the _has_http_response_status guard prevents retrying real HTTP error responses, and closures correctly re-create BytesIO objects on each retry so multipart uploads don't replay an exhausted stream. CI failures (JSONDecodeError on HTML 403 in create_sandbox.py and missing sandbox-openai model) are environment-level issues in code untouched by this PR — safe to re-run.
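
The two properties called out in this review can be illustrated with a small sketch. It is written under stated assumptions and is not the PR's implementation: `StatusError`, `has_status`, `is_transient`, and `upload_part` are hypothetical stand-ins for the roles played by `_has_http_response_status`, `is_transient_reset_error`, and the upload closures, and built-in exceptions replace httpx's transport errors.

```python
import io
from typing import Callable


class StatusError(Exception):
    """Hypothetical error that carries a real HTTP response status."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def has_status(exc: BaseException) -> bool:
    """True when the server actually answered, so this is not a transport fault."""
    return getattr(exc, "status_code", None) is not None


def is_transient(exc: BaseException) -> bool:
    """Only transport-level failures without an HTTP response qualify."""
    if has_status(exc):
        return False
    return isinstance(exc, (ConnectionError, TimeoutError))


def upload_part(
    data: bytes,
    send: Callable[[io.BytesIO], int],
    retries: int = 2,
) -> int:
    """Send one part, rebuilding the stream on every attempt."""
    for attempt in range(retries + 1):
        # A fresh BytesIO per attempt: a stream consumed by a failed
        # attempt would otherwise be replayed empty.
        body = io.BytesIO(data)
        try:
            return send(body)
        except Exception as exc:
            if attempt == retries or not is_transient(exc):
                raise
    raise AssertionError("unreachable when retries >= 0")
```

If the `BytesIO` were created once outside the loop, a second attempt would read zero bytes because the first attempt already advanced the stream position to the end.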

@mstolarzblaxelai mstolarzblaxelai marked this pull request as ready for review June 3, 2026 23:26
@mstolarzblaxelai mstolarzblaxelai merged commit 9da4668 into main Jun 3, 2026
31 of 32 checks passed
@mstolarzblaxelai mstolarzblaxelai deleted the codex/python-sandbox-transient-retries branch June 3, 2026 23:57