fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials by ravibits · Pull Request #174 · scalekit-inc/scalekit-sdk-python

ravibits · 2026-06-23T12:19:24Z

Summary

Fixes SK-819 — blind retries and missing provider error differentiation for tool execution errors.

No retry for provider errors: grpc_exec now checks error_code == "TOOL_ERROR" from gRPC ErrorInfo details before any retry decision. Provider errors raise immediately — no more 3x retry amplification on provider 429s.
New exception hierarchy: ScalekitToolException base (extends ScalekitServerException) with typed subclasses via multiple inheritance for full backward + forward compat:
- ScalekitToolRateLimitException(ScalekitToolException, ScalekitTooManyRequestsException) — provider 429
- ScalekitToolUnauthorizedException(ScalekitToolException, ScalekitUnauthorizedException) — provider 401
- ScalekitToolForbiddenException(ScalekitToolException, ScalekitForbiddenException) — provider 403
- ScalekitToolException directly — any other provider error
Scalekit 429 surfaces immediately: RESOURCE_EXHAUSTED without TOOL_ERROR no longer retried — caller owns retry strategy.
M2M refresh only for Scalekit 401: UNAUTHENTICATED without TOOL_ERROR still refreshes token and retries. Provider 401 raises ScalekitToolUnauthorizedException immediately.

Backward compatibility

Existing except ScalekitTooManyRequestsException blocks still catch ScalekitToolRateLimitException via inheritance. No customer code changes required unless they want the new granular exception types.

Behavior change to note in release notes: Provider errors previously retried up to 3 times now surface immediately. Scalekit 429s also surface immediately instead of retrying.

Test plan

21 new tests in tests/test_sk819_retry_behavior.py — all passing
Phase 1: baseline behavior documented and verified before implementation
Phase 2: new behavior tests written first (failing), then implementation made them pass
Full existing test suite passes

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added upsert functionality for connected accounts with automatic synchronization
- Introduced new tool-specific exception classes to improve handling of provider rate-limits, authorization failures, and access restrictions
Bug Fixes
- Corrected retry behavior to prevent unnecessary retries on provider rate-limit and authorization errors
- Enhanced error handling for resource exhaustion to avoid incorrect retry attempts

…s for provider errors - Add ScalekitToolException base class with tool_error_code, tool_error_message, execution_id properties - Add ScalekitToolRateLimitException (multiple-inherits ScalekitToolException + ScalekitTooManyRequestsException) - Add ScalekitToolUnauthorizedException (multiple-inherits ScalekitToolException + ScalekitUnauthorizedException) - Add ScalekitToolForbiddenException (multiple-inherits ScalekitToolException + ScalekitForbiddenException) - Update promote() to detect error_code == TOOL_ERROR and raise the appropriate tool exception - Fix grpc_exec: TOOL_ERROR errors raise immediately (no retry, no M2M refresh) - Fix grpc_exec: plain RESOURCE_EXHAUSTED (Scalekit 429) now surfaces immediately (no retry) - Add _extract_error_code() helper to read error_code from gRPC trailing metadata without full exception construction - Add 21-test suite covering Phase 1 baseline and Phase 2 new behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-23T12:19:34Z

Walkthrough

The SDK adds a tool-error exception hierarchy (ScalekitToolException and three subclasses) with a new _extract_error_code helper that reads gRPC trailing metadata. CoreClient.grpc_exec gains early-exit branches for TOOL_ERROR and RESOURCE_EXHAUSTED. ActionClient.get_or_create_connected_account gains an upsert path for authorization_details, aliased as upsert_connected_account. Version bumped to 2.13.0.

Changes

Tool Error Exception Hierarchy and gRPC Routing

Layer / File(s)	Summary
ScalekitToolException hierarchy and promote routing `scalekit/common/exceptions.py`	Adds `_extract_error_code` to parse `ErrorInfo` trailing metadata. Updates `ScalekitServerException.promote` to route `TOOL_ERROR` codes to new subclasses. Defines `ScalekitToolException` with `tool_error_code`, `tool_error_message`, and `execution_id` properties, plus `ScalekitToolRateLimitException`, `ScalekitToolUnauthorizedException`, and `ScalekitToolForbiddenException` as mixed-in subclasses.
CoreClient.grpc_exec early-exit branches `scalekit/core.py`	Imports `ScalekitTooManyRequestsException`. Adds a pre-check that immediately raises via `promote` when `TOOL_ERROR` is detected, bypassing retry/refresh. Adds an explicit `RESOURCE_EXHAUSTED` branch that also raises immediately.
SK-819 tests and version bump `tests/test_sk819_retry_behavior.py`, `scalekit/_version.py`	New test module with fake `grpc.RpcError` builders and a `CoreClient` bypass helper. Phase 1 asserts M2M refresh for Scalekit `UNAUTHENTICATED` and immediate raises for provider errors. Phase 2 validates exception hierarchy, metadata parsing, and no-retry behavior. Version bumped to `2.13.0`.

Connected Account Upsert Behavior

Layer / File(s)	Summary
Upsert path and alias in ActionClient `scalekit/actions/actions.py`	In the existing-account branch of `get_or_create_connected_account`, calls `update_connected_account` when `authorization_details` is present and returns a `CreateConnectedAccountResponse` wrapping the updated result. Adds `upsert_connected_account` as a public alias.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant CoreClient
    participant ScalekitServerException
    participant ToolExceptions

    Caller->>CoreClient: grpc_exec(fn, *args)
    CoreClient->>CoreClient: fn(*args) raises grpc.RpcError
    CoreClient->>ScalekitServerException: _extract_error_code(exp)
    alt error_code == "TOOL_ERROR"
        ScalekitServerException-->>CoreClient: "TOOL_ERROR"
        CoreClient->>ScalekitServerException: promote(exp)
        ScalekitServerException->>ToolExceptions: ScalekitToolRateLimitException / ScalekitToolUnauthorizedException / ScalekitToolForbiddenException
        ToolExceptions-->>Caller: raises immediately (no retry)
    else StatusCode.RESOURCE_EXHAUSTED
        CoreClient->>ScalekitServerException: promote(exp)
        ScalekitServerException-->>Caller: raises ScalekitTooManyRequestsException immediately
    else StatusCode.UNAUTHENTICATED (no TOOL_ERROR)
        CoreClient->>CoreClient: __authenticate_client() refresh
        CoreClient->>CoreClient: grpc_exec(fn, *args, retry-1)
        CoreClient-->>Caller: result
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

scalekit-inc/scalekit-sdk-python#165: Extends connected-account request/response models with authorization_details.trusted_idp, directly supporting the update_connected_account call added in this PR's upsert path.

Suggested reviewers

AkshayParihar33
Avinash-Kamath

Poem

🐇 Hop, hop — the errors now have names,
No more retries for provider-blamed flames!
TOOL_ERROR lands with grace and style,
Rate limits stop after just one trial.
Upsert accounts with a single bound —
v2.13.0, the best SDK around! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.78% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly references the two main issues (SK-819 provider error differentiation/retry fix and SK-821 credential upsert) and concisely summarizes the key changes: error handling, retry behavior, and credentials upsert functionality.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/bug/sk-819-provider-error-handling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

…when account exists When an account already exists, the "get" path previously returned it as-is and silently dropped any authorization_details provided by the caller. Fix: if authorization_details is provided and the account already exists, call update_connected_account to apply the credentials regardless of the account's current status (PENDING_AUTH, ACTIVE, EXPIRED, DISCONNECTED). Also adds upsert_connected_account as an alias on ActionClient — identical signature, preferred name going forward. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

New ScalekitTool* exception hierarchy, upsert_connected_account alias, and changed grpc_exec retry behavior for provider errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

scalekit/common/exceptions.py (1)
106-121: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low value

Avoid silently swallowing parse failures in _extract_error_code.

The blind except Exception: pass means any failure while reading the trailing metadata makes this return None, which causes grpc_exec to fall through to its retry/refresh path — i.e. the precise behavior SK-819 is trying to eliminate. At minimum log the swallowed exception so a parsing regression is debuggable. The local from grpc_status import rpc_status is also redundant since rpc_status is already used at module scope (Line 95).
♻️ Suggested change
     `@staticmethod`
     def _extract_error_code(error: grpc.RpcError) -> str | None:
         """ Extract error_code from gRPC trailing metadata without constructing a full exception """
-        from grpc_status import rpc_status
         try:
             status = rpc_status.from_call(error)
             if status is None:
                 return None
             for detail in status.details:
                 info = ErrorInfo()
                 detail.Unpack(info)
                 if info.error_code:
                     return info.error_code
-        except Exception:
-            pass
+        except Exception:
+            logger.debug("Failed to extract error_code from gRPC trailing metadata", exc_info=True)
         return None
As per static analysis hints (Ruff S110 try-except-pass and BLE001 blind except).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/common/exceptions.py` around lines 106 - 121, Replace the bare
`except Exception: pass` block in the `_extract_error_code` method with proper
exception handling that logs the swallowed exception to aid debugging of parsing
regressions. Additionally, remove the local `from grpc_status import rpc_status`
import statement inside the method since `rpc_status` is already imported at the
module scope and can be reused directly.
Source: Linters/SAST tools
scalekit/core.py (1)
166-170: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Early-exit routing is correct; consider raise ... from exp for cleaner tracebacks.

The TOOL_ERROR pre-check and explicit RESOURCE_EXHAUSTED branch correctly bypass retry/refresh. Minor: re-raising inside the except block without from chains an implicit __context__; Ruff flags this on Lines 169 and 181.
♻️ Suggested change
             if error_code == "TOOL_ERROR":
-                raise ScalekitServerException.promote(exp)
+                raise ScalekitServerException.promote(exp) from exp
...
             elif exp.code() == grpc.StatusCode.RESOURCE_EXHAUSTED:
                 # Surface Scalekit rate-limits immediately — retrying triples the damage
-                raise ScalekitServerException.promote(exp)
+                raise ScalekitServerException.promote(exp) from exp
As per static analysis hints (Ruff B904).

Also applies to: 179-181
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scalekit/core.py` around lines 166 - 170, When re-raising exceptions in the
except block, use explicit exception chaining with the `from` keyword to provide
clearer tracebacks. In the TOOL_ERROR handling block where
ScalekitServerException.promote(exp) is raised, change the raise statement to
use `raise ScalekitServerException.promote(exp) from exp` instead of the current
implicit chaining. Apply the same fix to the similar exception re-raise pattern
at lines 179-181 as indicated by the review comment. This makes the exception
chain explicit and follows Ruff B904 guidelines.
Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scalekit/actions/actions.py`:
- Around line 794-809: The docstring for the api_config parameter (around line
775) currently states it is "only used when creating", but the new upsert logic
now forwards api_config to update_connected_account when authorization_details
are supplied, meaning it also applies to existing accounts. Update the
api_config parameter docstring to clarify that it is used for both creating new
connected accounts and updating existing ones through the upsert mechanism, so
callers understand that providing api_config could modify an existing account's
configuration.
- Around line 794-809: The docstring for the `api_config` parameter (around line
775) currently states it is "only used when creating," but the upsert path now
passes `api_config` to the `update_connected_account` method when
`authorization_details` are supplied. Update the docstring to clarify that
`api_config` is used both when creating and when updating connected accounts in
the upsert scenario.

---

Nitpick comments:
In `@scalekit/common/exceptions.py`:
- Around line 106-121: Replace the bare `except Exception: pass` block in the
`_extract_error_code` method with proper exception handling that logs the
swallowed exception to aid debugging of parsing regressions. Additionally,
remove the local `from grpc_status import rpc_status` import statement inside
the method since `rpc_status` is already imported at the module scope and can be
reused directly.

In `@scalekit/core.py`:
- Around line 166-170: When re-raising exceptions in the except block, use
explicit exception chaining with the `from` keyword to provide clearer
tracebacks. In the TOOL_ERROR handling block where
ScalekitServerException.promote(exp) is raised, change the raise statement to
use `raise ScalekitServerException.promote(exp) from exp` instead of the current
implicit chaining. Apply the same fix to the similar exception re-raise pattern
at lines 179-181 as indicated by the review comment. This makes the exception
chain explicit and follows Ruff B904 guidelines.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cc52783d-a7b7-46ce-8d64-cdbd423393d9

📥 Commits

Reviewing files that changed from the base of the PR and between 6945518 and 97803cf.

📒 Files selected for processing (5)

scalekit/_version.py
scalekit/actions/actions.py
scalekit/common/exceptions.py
scalekit/core.py
tests/test_sk819_retry_behavior.py

coderabbitai · 2026-06-23T13:35:15Z

+
+            # True upsert: if credentials were supplied, apply them regardless of
+            # the account's current status (PENDING_AUTH, ACTIVE, EXPIRED, DISCONNECTED).
+            if authorization_details:
+                update_response = self.update_connected_account(
+                    connection_name=connection_name,
+                    identifier=identifier,
+                    authorization_details=authorization_details,
+                    organization_id=organization_id,
+                    user_id=user_id,
+                    api_config=api_config
+                )
+                return CreateConnectedAccountResponse(connected_account=update_response.connected_account)
+
            return CreateConnectedAccountResponse(connected_account=existing_response.connected_account)
-            
+


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Update the api_config docstring — it is no longer "only used when creating".

The new upsert branch forwards api_config to update_connected_account (Line 804), so api_config now also applies to existing accounts. The parameter doc at Line 775 (only used when creating) is now misleading and could cause callers to unknowingly mutate an existing account's API config.

📝 Suggested docstring update (Line 775)

- :param api_config: Optional API configuration for the connected account (optional, only used when creating) + :param api_config: Optional API configuration for the connected account. Applied when + creating a new account, and also when updating an existing account if + ``authorization_details`` are supplied. (optional)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scalekit/actions/actions.py` around lines 794 - 809, The docstring for the api_config parameter (around line 775) currently states it is "only used when creating", but the new upsert logic now forwards api_config to update_connected_account when authorization_details are supplied, meaning it also applies to existing accounts. Update the api_config parameter docstring to clarify that it is used for both creating new connected accounts and updating existing ones through the upsert mechanism, so callers understand that providing api_config could modify an existing account's configuration.

🩺 Stability & Availability | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Inspect the three response models' constructors and connected_account fields. fd -t f -e py | xargs rg -nP -C3 '\bclass\s+(CreateConnectedAccountResponse|GetConnectedAccountAuthResponse|UpdateConnectedAccountResponse)\b' echo '--- connected_account references in these classes ---' fd -t f -e py | xargs rg -nP -C2 '\bconnected_account\b'

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 50390

🏁 Script executed:

sed -n '770,810p' scalekit/actions/actions.py

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 1989

🏁 Script executed:

# Check if update_connected_account accepts api_config parameter rg -A 15 'def update_connected_account' scalekit/actions/actions.py | head -25

Repository: scalekit-inc/scalekit-sdk-python

Length of output: 727

Update docstring to reflect api_config usage in upsert path.

The docstring at line 775 states that api_config is "only used when creating," but the upsert branch (lines 801–806) now passes api_config to update_connected_account when authorization_details are supplied. Update the docstring to clarify that api_config is used when creating or updating.

The response object contracts are sound: CreateConnectedAccountResponse, GetConnectedAccountAuthResponse, and UpdateConnectedAccountResponse all expose a .connected_account attribute of type ConnectedAccount, and all constructors accept it as a keyword argument.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scalekit/actions/actions.py` around lines 794 - 809, The docstring for the `api_config` parameter (around line 775) currently states it is "only used when creating," but the upsert path now passes `api_config` to the `update_connected_account` method when `authorization_details` are supplied. Update the docstring to clarify that `api_config` is used both when creating and when updating connected accounts in the upsert scenario.

ravibits changed the title ~~fix(SK-819): add ScalekitToolException hierarchy and fix blind retries for provider errors~~ fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials Jun 23, 2026

chore: bump version to 2.13.0

97803cf

New ScalekitTool* exception hierarchy, upsert_connected_account alias, and changed grpc_exec retry behavior for provider errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ravibits marked this pull request as ready for review June 23, 2026 13:27

ravibits requested a review from Avinash-Kamath as a code owner June 23, 2026 13:27

coderabbitai Bot requested changes Jun 23, 2026

View reviewed changes

Avinash-Kamath approved these changes Jun 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials#174

fix(SK-819, SK-821): provider error differentiation, blind retry fix, and upsert credentials#174
ravibits wants to merge 3 commits into
mainfrom
fix/bug/sk-819-provider-error-handling

ravibits commented Jun 23, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ravibits commented Jun 23, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Backward compatibility

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ravibits commented Jun 23, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading