fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials#174

Open
ravibits wants to merge 3 commits into main from fix/bug/sk-819-provider-error-handling
Conversation

@ravibits ravibits commented Jun 23, 2026

Contributor

Summary

Fixes SK-819 — blind retries and missing provider error differentiation for tool execution errors.

  • No retry for provider errors: grpc_exec now checks error_code == "TOOL_ERROR" from gRPC ErrorInfo details before any retry decision. Provider errors raise immediately — no more 3x retry amplification on provider 429s.
  • New exception hierarchy: ScalekitToolException base (extends ScalekitServerException) with typed subclasses via multiple inheritance for full backward + forward compat:
    • ScalekitToolRateLimitException(ScalekitToolException, ScalekitTooManyRequestsException) — provider 429
    • ScalekitToolUnauthorizedException(ScalekitToolException, ScalekitUnauthorizedException) — provider 401
    • ScalekitToolForbiddenException(ScalekitToolException, ScalekitForbiddenException) — provider 403
    • ScalekitToolException directly — any other provider error
  • Scalekit 429 surfaces immediately: RESOURCE_EXHAUSTED without TOOL_ERROR is no longer retried; the caller owns the retry strategy.
  • M2M refresh only for Scalekit 401: UNAUTHENTICATED without TOOL_ERROR still refreshes token and retries. Provider 401 raises ScalekitToolUnauthorizedException immediately.

Backward compatibility

Existing except ScalekitTooManyRequestsException blocks still catch ScalekitToolRateLimitException via inheritance. No customer code changes required unless they want the new granular exception types.
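
The compatibility claim follows from Python's method resolution order. A minimal sketch using stand-in classes that only mirror the names in this PR (the real classes in scalekit/common/exceptions.py carry more state; the empty bodies and the `classify` helper here are assumptions for illustration):

```python
# Stand-in classes that mirror the names in this PR. The real SDK classes
# carry gRPC status details and are not reproduced here.
class ScalekitServerException(Exception):
    pass


class ScalekitTooManyRequestsException(ScalekitServerException):
    pass


class ScalekitToolException(ScalekitServerException):
    pass


class ScalekitToolRateLimitException(ScalekitToolException, ScalekitTooManyRequestsException):
    pass


def classify(exc: Exception) -> str:
    """Hypothetical customer handler written before the tool exceptions existed."""
    try:
        raise exc
    except ScalekitTooManyRequestsException:
        return "rate-limited"
    except ScalekitServerException:
        return "server-error"


# The legacy handler still catches the new provider rate-limit type.
legacy_result = classify(ScalekitToolRateLimitException("provider 429"))
# Other provider errors fall through to the generic server handler.
generic_result = classify(ScalekitToolException("provider 500"))
```

Callers who want to tell provider rate limits apart from Scalekit rate limits can add an `except ScalekitToolRateLimitException` clause ahead of the existing one.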

Behavior change to note in release notes: Provider errors that were previously retried up to 3 times now surface immediately. Scalekit 429s also surface immediately instead of being retried.
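
Because rate limits now surface on the first failure, callers who relied on the implicit retries may want their own policy. One possible shape is an exponential backoff wrapper. In this sketch `call_with_backoff`, its parameters, and the stand-in exception class are hypothetical and not part of the SDK:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ScalekitTooManyRequestsException(Exception):
    """Stand-in for the SDK exception of the same name."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry fn on rate limits with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ScalekitTooManyRequestsException:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the rate limit
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable when max_attempts >= 1")


# Usage: a fake call that is rate-limited twice, then succeeds.
calls = {"count": 0}
delays: list[float] = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ScalekitTooManyRequestsException("429")
    return "ok"


# Record the delays instead of sleeping so the example runs instantly.
result = call_with_backoff(flaky, sleep=delays.append)
```

Since the provider subclass inherits from the same rate-limit base in this PR, a wrapper like this would also retry provider 429s unless it excludes them explicitly.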

Test plan

  • 21 new tests in tests/test_sk819_retry_behavior.py — all passing
  • Phase 1: baseline behavior documented and verified before implementation
  • Phase 2: new behavior tests written first (failing), then implementation made them pass
  • Full existing test suite passes

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added upsert functionality for connected accounts with automatic synchronization
    • Introduced new tool-specific exception classes to improve handling of provider rate-limits, authorization failures, and access restrictions
  • Bug Fixes

    • Corrected retry behavior to prevent unnecessary retries on provider rate-limit and authorization errors
    • Enhanced error handling for resource exhaustion to avoid incorrect retry attempts

…s for provider errors

- Add ScalekitToolException base class with tool_error_code, tool_error_message, execution_id properties
- Add ScalekitToolRateLimitException (multiple-inherits ScalekitToolException + ScalekitTooManyRequestsException)
- Add ScalekitToolUnauthorizedException (multiple-inherits ScalekitToolException + ScalekitUnauthorizedException)
- Add ScalekitToolForbiddenException (multiple-inherits ScalekitToolException + ScalekitForbiddenException)
- Update promote() to detect error_code == TOOL_ERROR and raise the appropriate tool exception
- Fix grpc_exec: TOOL_ERROR errors raise immediately (no retry, no M2M refresh)
- Fix grpc_exec: plain RESOURCE_EXHAUSTED (Scalekit 429) now surfaces immediately (no retry)
- Add _extract_error_code() helper to read error_code from gRPC trailing metadata without full exception construction
- Add 21-test suite covering Phase 1 baseline and Phase 2 new behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Review Change Stack

Walkthrough

The SDK adds a tool-error exception hierarchy (ScalekitToolException and three subclasses) with a new _extract_error_code helper that reads gRPC trailing metadata. CoreClient.grpc_exec gains early-exit branches for TOOL_ERROR and RESOURCE_EXHAUSTED. ActionClient.get_or_create_connected_account gains an upsert path for authorization_details, aliased as upsert_connected_account. Version bumped to 2.13.0.

Changes

Tool Error Exception Hierarchy and gRPC Routing

| Layer / File(s) | Summary |
| --- | --- |
| ScalekitToolException hierarchy and promote routing (scalekit/common/exceptions.py) | Adds _extract_error_code to parse ErrorInfo trailing metadata. Updates ScalekitServerException.promote to route TOOL_ERROR codes to new subclasses. Defines ScalekitToolException with tool_error_code, tool_error_message, and execution_id properties, plus ScalekitToolRateLimitException, ScalekitToolUnauthorizedException, and ScalekitToolForbiddenException as mixed-in subclasses. |
| CoreClient.grpc_exec early-exit branches (scalekit/core.py) | Imports ScalekitTooManyRequestsException. Adds a pre-check that immediately raises via promote when TOOL_ERROR is detected, bypassing retry/refresh. Adds an explicit RESOURCE_EXHAUSTED branch that also raises immediately. |
| SK-819 tests and version bump (tests/test_sk819_retry_behavior.py, scalekit/_version.py) | New test module with fake grpc.RpcError builders and a CoreClient bypass helper. Phase 1 asserts M2M refresh for Scalekit UNAUTHENTICATED and immediate raises for provider errors. Phase 2 validates exception hierarchy, metadata parsing, and no-retry behavior. Version bumped to 2.13.0. |

Connected Account Upsert Behavior

| Layer / File(s) | Summary |
| --- | --- |
| Upsert path and alias in ActionClient (scalekit/actions/actions.py) | In the existing-account branch of get_or_create_connected_account, calls update_connected_account when authorization_details is present and returns a CreateConnectedAccountResponse wrapping the updated result. Adds upsert_connected_account as a public alias. |
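
The upsert branch described above can be sketched with an in-memory stand-in. `FakeActionClient` and its dictionary store are hypothetical; only the method names come from this PR, and the signatures are simplified:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeActionClient:
    """Hypothetical in-memory stand-in for ActionClient; not the SDK class."""
    accounts: dict = field(default_factory=dict)

    def update_connected_account(self, identifier: str, authorization_details: dict) -> dict:
        self.accounts[identifier]["authorization_details"] = authorization_details
        return self.accounts[identifier]

    def get_or_create_connected_account(
        self, identifier: str, authorization_details: Optional[dict] = None
    ) -> dict:
        existing = self.accounts.get(identifier)
        if existing is None:
            # Create path: store whatever credentials were supplied.
            self.accounts[identifier] = {
                "identifier": identifier,
                "authorization_details": authorization_details,
            }
            return self.accounts[identifier]
        if authorization_details:
            # Upsert path: apply the new credentials to the existing account.
            return self.update_connected_account(identifier, authorization_details)
        # Plain get path: return the account untouched.
        return existing

    # Alias as described in the PR: same callable under the preferred name.
    upsert_connected_account = get_or_create_connected_account


client = FakeActionClient()
client.upsert_connected_account("user-1", {"token": "old"})
updated = client.upsert_connected_account("user-1", {"token": "new"})
unchanged = client.get_or_create_connected_account("user-1")
```

Before the fix, the second call would have returned the account with the old credentials, which is the silent drop the commit message describes.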

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant CoreClient
    participant ScalekitServerException
    participant ToolExceptions

    Caller->>CoreClient: grpc_exec(fn, *args)
    CoreClient->>CoreClient: fn(*args) raises grpc.RpcError
    CoreClient->>ScalekitServerException: _extract_error_code(exp)
    alt error_code == "TOOL_ERROR"
        ScalekitServerException-->>CoreClient: "TOOL_ERROR"
        CoreClient->>ScalekitServerException: promote(exp)
        ScalekitServerException->>ToolExceptions: ScalekitToolRateLimitException / ScalekitToolUnauthorizedException / ScalekitToolForbiddenException
        ToolExceptions-->>Caller: raises immediately (no retry)
    else StatusCode.RESOURCE_EXHAUSTED
        CoreClient->>ScalekitServerException: promote(exp)
        ScalekitServerException-->>Caller: raises ScalekitTooManyRequestsException immediately
    else StatusCode.UNAUTHENTICATED (no TOOL_ERROR)
        CoreClient->>CoreClient: __authenticate_client() refresh
        CoreClient->>CoreClient: grpc_exec(fn, *args, retry-1)
        CoreClient-->>Caller: result
    end
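
The three branches in the diagram can be sketched without gRPC. Here `FakeRpcError`, the string status codes, `grpc_exec_sketch`, and the `refresh` callback are hypothetical stand-ins; the real grpc_exec inspects grpc.RpcError and raises the result of promote() rather than re-raising the raw error. The default budget of 3 is taken from the PR description, and statuses outside the diagram are simply re-raised here:

```python
from typing import Callable, Optional


class FakeRpcError(Exception):
    """Hypothetical stand-in for grpc.RpcError plus its ErrorInfo detail."""

    def __init__(self, status: str, error_code: Optional[str] = None):
        super().__init__(status)
        self.status = status          # e.g. "UNAUTHENTICATED"
        self.error_code = error_code  # "TOOL_ERROR" marks a provider error


def grpc_exec_sketch(fn: Callable[[], str], refresh: Callable[[], None], retry: int = 3) -> str:
    try:
        return fn()
    except FakeRpcError as exp:
        if exp.error_code == "TOOL_ERROR":
            raise  # provider error: no retry, no token refresh
        if exp.status == "RESOURCE_EXHAUSTED":
            raise  # Scalekit rate limit: the caller owns the retry strategy
        if exp.status == "UNAUTHENTICATED" and retry > 0:
            refresh()  # Scalekit 401: refresh the M2M token, then retry
            return grpc_exec_sketch(fn, refresh, retry - 1)
        raise


# Provider 401: exactly one call, no refresh.
attempts: list[str] = []
refreshes: list[str] = []


def provider_401() -> str:
    attempts.append("call")
    raise FakeRpcError("UNAUTHENTICATED", error_code="TOOL_ERROR")


try:
    grpc_exec_sketch(provider_401, lambda: refreshes.append("refresh"))
except FakeRpcError:
    pass

# Scalekit 401: fails until the token is refreshed, then succeeds.
state = {"token_fresh": False}


def scalekit_401_once() -> str:
    if not state["token_fresh"]:
        raise FakeRpcError("UNAUTHENTICATED")
    return "ok"


def refresh_token() -> None:
    state["token_fresh"] = True


outcome = grpc_exec_sketch(scalekit_401_once, refresh_token)
```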

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • scalekit-inc/scalekit-sdk-python#165: Extends connected-account request/response models with authorization_details.trusted_idp, directly supporting the update_connected_account call added in this PR's upsert path.

Suggested reviewers

  • AkshayParihar33
  • Avinash-Kamath

Poem

🐇 Hop, hop — the errors now have names,
No more retries for provider-blamed flames!
TOOL_ERROR lands with grace and style,
Rate limits stop after just one trial.
Upsert accounts with a single bound —
v2.13.0, the best SDK around! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly references the two main issues (SK-819 provider error differentiation/retry fix and SK-821 credential upsert) and concisely summarizes the key changes: error handling, retry behavior, and credentials upsert functionality. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

…when account exists

When an account already exists, the "get" path previously returned it
as-is and silently dropped any authorization_details provided by the caller.

Fix: if authorization_details is provided and the account already exists,
call update_connected_account to apply the credentials regardless of the
account's current status (PENDING_AUTH, ACTIVE, EXPIRED, DISCONNECTED).

Also adds upsert_connected_account as an alias on ActionClient —
identical signature, preferred name going forward.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ravibits ravibits changed the title fix(SK-819): add ScalekitToolException hierarchy and fix blind retries for provider errors fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials Jun 23, 2026
New ScalekitTool* exception hierarchy, upsert_connected_account alias,
and changed grpc_exec retry behavior for provider errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ravibits ravibits marked this pull request as ready for review June 23, 2026 13:27
@ravibits ravibits requested a review from Avinash-Kamath as a code owner June 23, 2026 13:27

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
scalekit/common/exceptions.py (1)

106-121: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low value

Avoid silently swallowing parse failures in _extract_error_code.

The blind except Exception: pass means any failure while reading the trailing metadata makes this return None, which causes grpc_exec to fall through to its retry/refresh path — i.e. the precise behavior SK-819 is trying to eliminate. At minimum log the swallowed exception so a parsing regression is debuggable. The local from grpc_status import rpc_status is also redundant since rpc_status is already used at module scope (Line 95).

♻️ Suggested change
     @staticmethod
     def _extract_error_code(error: grpc.RpcError) -> str | None:
         """ Extract error_code from gRPC trailing metadata without constructing a full exception """
-        from grpc_status import rpc_status
         try:
             status = rpc_status.from_call(error)
             if status is None:
                 return None
             for detail in status.details:
                 info = ErrorInfo()
                 detail.Unpack(info)
                 if info.error_code:
                     return info.error_code
-        except Exception:
-            pass
+        except Exception:
+            logger.debug("Failed to extract error_code from gRPC trailing metadata", exc_info=True)
         return None

As per static analysis hints (Ruff S110 try-except-pass and BLE001 blind except).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/common/exceptions.py` around lines 106 - 121, Replace the bare
`except Exception: pass` block in the `_extract_error_code` method with proper
exception handling that logs the swallowed exception to aid debugging of parsing
regressions. Additionally, remove the local `from grpc_status import rpc_status`
import statement inside the method since `rpc_status` is already imported at the
module scope and can be reused directly.

Source: Linters/SAST tools

scalekit/core.py (1)

166-170: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Early-exit routing is correct; consider raise ... from exp for cleaner tracebacks.

The TOOL_ERROR pre-check and explicit RESOURCE_EXHAUSTED branch correctly bypass retry/refresh. Minor: re-raising inside the except block without from chains an implicit __context__; Ruff flags this on Lines 169 and 181.

♻️ Suggested change
             if error_code == "TOOL_ERROR":
-                raise ScalekitServerException.promote(exp)
+                raise ScalekitServerException.promote(exp) from exp
...
             elif exp.code() == grpc.StatusCode.RESOURCE_EXHAUSTED:
                 # Surface Scalekit rate-limits immediately — retrying triples the damage
-                raise ScalekitServerException.promote(exp)
+                raise ScalekitServerException.promote(exp) from exp

As per static analysis hints (Ruff B904).

Also applies to: 179-181
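
The difference Ruff B904 points at is visible on the raised exception itself: `from exp` sets `__cause__`, while a plain raise inside an `except` block only sets `__context__`. A small illustration with hypothetical names (`promote` and `PromotedError` here are placeholders, not the SDK's):

```python
class PromotedError(Exception):
    """Hypothetical stand-in for the exception returned by promote()."""


def promote(exp: Exception) -> PromotedError:
    return PromotedError(str(exp))


def reraise(explicit: bool) -> None:
    try:
        raise ValueError("rpc failed")
    except ValueError as exp:
        if explicit:
            raise promote(exp) from exp  # explicit chaining: sets __cause__
        raise promote(exp)               # implicit chaining: only __context__


def capture(explicit: bool) -> PromotedError:
    try:
        reraise(explicit)
    except PromotedError as err:
        return err
    raise AssertionError("reraise always raises")


implicit_err = capture(explicit=False)
explicit_err = capture(explicit=True)
```

With explicit chaining the traceback reads "The above exception was the direct cause of the following exception", rather than "During handling of the above exception, another exception occurred".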

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/core.py` around lines 166 - 170, When re-raising exceptions in the
except block, use explicit exception chaining with the `from` keyword to provide
clearer tracebacks. In the TOOL_ERROR handling block where
ScalekitServerException.promote(exp) is raised, change the raise statement to
use `raise ScalekitServerException.promote(exp) from exp` instead of the current
implicit chaining. Apply the same fix to the similar exception re-raise pattern
at lines 179-181 as indicated by the review comment. This makes the exception
chain explicit and follows Ruff B904 guidelines.

Source: Linters/SAST tools

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cc52783d-a7b7-46ce-8d64-cdbd423393d9

📥 Commits

Reviewing files that changed from the base of the PR and between 6945518 and 97803cf.

📒 Files selected for processing (5)
  • scalekit/_version.py
  • scalekit/actions/actions.py
  • scalekit/common/exceptions.py
  • scalekit/core.py
  • tests/test_sk819_retry_behavior.py

Comment on lines +794 to +809

# True upsert: if credentials were supplied, apply them regardless of
# the account's current status (PENDING_AUTH, ACTIVE, EXPIRED, DISCONNECTED).
if authorization_details:
    update_response = self.update_connected_account(
        connection_name=connection_name,
        identifier=identifier,
        authorization_details=authorization_details,
        organization_id=organization_id,
        user_id=user_id,
        api_config=api_config
    )
    return CreateConnectedAccountResponse(connected_account=update_response.connected_account)

return CreateConnectedAccountResponse(connected_account=existing_response.connected_account)

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Update the api_config docstring — it is no longer "only used when creating".

The new upsert branch forwards api_config to update_connected_account (Line 804), so api_config now also applies to existing accounts. The parameter doc at Line 775 (only used when creating) is now misleading and could cause callers to unknowingly mutate an existing account's API config.

📝 Suggested docstring update (Line 775)
-        :param api_config: Optional API configuration for the connected account (optional, only used when creating)
+        :param api_config: Optional API configuration for the connected account. Applied when
+            creating a new account, and also when updating an existing account if
+            ``authorization_details`` are supplied. (optional)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/actions/actions.py` around lines 794 - 809, The docstring for the
api_config parameter (around line 775) currently states it is "only used when
creating", but the new upsert logic now forwards api_config to
update_connected_account when authorization_details are supplied, meaning it
also applies to existing accounts. Update the api_config parameter docstring to
clarify that it is used for both creating new connected accounts and updating
existing ones through the upsert mechanism, so callers understand that providing
api_config could modify an existing account's configuration.

🩺 Stability & Availability | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect the three response models' constructors and connected_account fields.
fd -t f -e py | xargs rg -nP -C3 '\bclass\s+(CreateConnectedAccountResponse|GetConnectedAccountAuthResponse|UpdateConnectedAccountResponse)\b'
echo '--- connected_account references in these classes ---'
fd -t f -e py | xargs rg -nP -C2 '\bconnected_account\b'

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 50390


🏁 Script executed:

sed -n '770,810p' scalekit/actions/actions.py

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 1989


🏁 Script executed:

# Check if update_connected_account accepts api_config parameter
rg -A 15 'def update_connected_account' scalekit/actions/actions.py | head -25

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 727


Update docstring to reflect api_config usage in upsert path.

The docstring at line 775 states that api_config is "only used when creating," but the upsert branch (lines 801–806) now passes api_config to update_connected_account when authorization_details are supplied. Update the docstring to clarify that api_config is used when creating or updating.

The response object contracts are sound: CreateConnectedAccountResponse, GetConnectedAccountAuthResponse, and UpdateConnectedAccountResponse all expose a .connected_account attribute of type ConnectedAccount, and all constructors accept it as a keyword argument.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/actions/actions.py` around lines 794 - 809, The docstring for the
`api_config` parameter (around line 775) currently states it is "only used when
creating," but the upsert path now passes `api_config` to the
`update_connected_account` method when `authorization_details` are supplied.
Update the docstring to clarify that `api_config` is used both when creating and
when updating connected accounts in the upsert scenario.
