feat(approvals): human-in-the-loop tool-call confirmation (#124 Tier 1+2) #517

Open. clemenshelm wants to merge 9 commits into main from feat/approval-confirm

@clemenshelm

Contributor

What

Implements Tier 1+2 of #124 — human-in-the-loop tool-call confirmation. An admin marks tools as "require confirmation"; the agent then pauses and asks the acting user before running them. The user approves their own action in seconds — no admin bottleneck.

This follows the reframe in #124: the validated, market-standard need is confirmation (the acting user confirms, so the agent can't act autonomously), not an admin approval queue. Four-eyes (a different approver) stays documented in the issue as a demand-pulled extension and is not built here.

How

A two-layer, server-enforced design:

  • Gate — a new internal hook-only plugin pinchy-approvals registers a before_tool_call hook that asks Pinchy's gate-check endpoint for every tool. The endpoint is the security boundary; it fails closed if unreachable.
  • Durable ticket — Pinchy's DB owns the lifecycle. A grant is bound to (agentId, requesting user / senderId, args digest, originating session) and consumed exactly once (FOR UPDATE SKIP LOCKED). Changed args ⇒ new confirmation. Never shared across users of a shared agent. Pending confirmations expire (15 min) and fail closed.
  • Resolve — the acting user approves/denies their own request (self-confirm enforced server-side); the agent re-issues and the gate consumes the now-approved ticket.
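
The ticket lifecycle described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the PR keeps tickets in PostgreSQL and relies on FOR UPDATE SKIP LOCKED for the concurrent case, which a single-threaded sketch does not need. `TicketStore`, its method signatures, and the status names are hypothetical; only the binding fields, the 15-minute expiry, self-confirm, and consume-once come from the description.

```typescript
// Hypothetical in-memory model of the confirmation ticket lifecycle.
// The real implementation lives in Pinchy's database; names here are assumptions.
type Status = "pending" | "approved" | "denied" | "consumed" | "expired";

interface Binding {
  agentId: string;
  requesterId: string;
  argsDigest: string;
  sessionKey: string;
}

interface Ticket extends Binding {
  id: number;
  status: Status;
  expiresAt: number; // epoch milliseconds
}

type GateResult =
  | { decision: "allow" }
  | { decision: "block"; ticketId: number; created: boolean };

const TTL_MS = 15 * 60 * 1000; // pending confirmations expire after 15 minutes

class TicketStore {
  private tickets: Ticket[] = [];
  private nextId = 1;

  // A ticket only matches the exact same agent, user, args digest, and session.
  private matches(t: Ticket, b: Binding): boolean {
    return t.agentId === b.agentId && t.requesterId === b.requesterId &&
      t.argsDigest === b.argsDigest && t.sessionKey === b.sessionKey;
  }

  private expireStale(now: number): void {
    for (const t of this.tickets) {
      if (t.status === "pending" && t.expiresAt <= now) t.status = "expired";
    }
  }

  // Consume an approved ticket exactly once, otherwise block on a pending one.
  decideGate(b: Binding, now: number): GateResult {
    this.expireStale(now);
    const approved = this.tickets.find((t) => t.status === "approved" && this.matches(t, b));
    if (approved) {
      approved.status = "consumed";
      return { decision: "allow" };
    }
    const pending = this.tickets.find((t) => t.status === "pending" && this.matches(t, b));
    if (pending) return { decision: "block", ticketId: pending.id, created: false };
    const ticket: Ticket = { ...b, id: this.nextId++, status: "pending", expiresAt: now + TTL_MS };
    this.tickets.push(ticket);
    return { decision: "block", ticketId: ticket.id, created: true };
  }

  // Self-confirm: only the requesting user may resolve a still-pending ticket.
  resolveDecision(ticketId: number, userId: string, approve: boolean, now: number): boolean {
    this.expireStale(now);
    const t = this.tickets.find((x) => x.id === ticketId);
    if (!t || t.status !== "pending" || t.requesterId !== userId) return false;
    t.status = approve ? "approved" : "denied";
    return true;
  }
}
```

In this model a second call with the same binding is blocked again after the first one consumed the grant, and a call with a different args digest always opens a fresh pending ticket.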

Surfaces

  • Admin config: a "Require confirmation before" section in the agent permissions tab, with per-tool checkboxes and a "Use recommended" button (auto-selects the agent's powerful tools).
  • Approve/deny: an in-app ApprovalsInbox that lists the user's pending confirmations.
  • Audit: full approval.requested → granted/denied → consumed/expired lifecycle. PII-free — only the args digest is logged; the human-readable summary stays on the operational row.
  • Docs: a "Require Tool Confirmation" guide.
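
To illustrate the PII-free audit rule, here is a minimal sketch of mapping an operational row to an audit event. The row and event shapes, the field names, and the `toAuditEvent` helper are assumptions made for illustration; the description only states that the args digest is logged and the human-readable summary stays on the operational row.

```typescript
// Hypothetical shapes; the PR's actual schema and audit payload may differ.
type ApprovalEventType =
  | "approval.requested"
  | "approval.granted"
  | "approval.denied"
  | "approval.consumed"
  | "approval.expired";

interface ApprovalRow {
  id: string;
  agentId: string;
  requesterId: string;
  argsDigest: string;
  summary: string; // human-readable, may contain PII; never copied to the audit log
}

interface AuditEvent {
  type: ApprovalEventType;
  approvalId: string;
  agentId: string;
  actorId: string;
  argsDigest: string;
}

// Build the audit payload by picking fields explicitly, so new row columns
// (such as the summary) cannot leak into the audit log by accident.
function toAuditEvent(type: ApprovalEventType, row: ApprovalRow, actorId: string): AuditEvent {
  return {
    type,
    approvalId: row.id,
    agentId: row.agentId,
    actorId,
    argsDigest: row.argsDigest,
  };
}
```

Picking fields one by one rather than spreading the row is the design choice that matters here: an allowlist fails safe when the row gains new columns.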

Tested

  • Unit: 5992 passing (digest, gate decision, policy, plugin gate logic, two RTL components).
  • Integration (real PostgreSQL): 133 passing — consume-once / fail-closed / concurrent-consume on real SQL; gate-check, list, and self-confirm decision routes with their audit lifecycle.
  • E2E: a gated tool is blocked → approval.requested → user grants → approval.granted (runs in the integration E2E CI job; satisfies the internal-plugin coverage contract).
  • pnpm build, tsc --noEmit, eslint (0 errors), and the docs build all pass. All Plugin-Integration-Contract drift guards updated.

Deferred (documented, not built)

Refs #124.

Clemens Helm added 9 commits June 16, 2026 16:39
Durable, server-enforced confirmation record for #124 Tier 1+2.
Bound to (agentId, requesterId, argsDigest, sessionKey), consumed once.
tier enum reserves 'escalate' for the deferred four-eyes tier.
- computeArgsDigest: stable canonical sha256 binding one exact tool call
- decideGate: consume-once (FOR UPDATE SKIP LOCKED) or create pending; fail-closed on expiry
- resolveDecision/expireStale; self-confirm authorization
- defaultConfirmTools: powerful-tool auto-default from tool-registry
- timestamptz columns; AgentPluginConfig.pinchy-approvals.confirmTools
Real-DB integration tested.
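
As a rough illustration of a stable canonical sha256 digest, the sketch below serializes the arguments with sorted object keys before hashing, so key order does not change the result. It is not the PR's `computeArgsDigest`: the function name `sketchArgsDigest`, the inclusion of the tool name in the hashed payload, and the canonicalization rules are all assumptions.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize JSON with object keys in sorted order so equal values always
// produce the same string, regardless of insertion order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const keys = Object.keys(value).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k])).join(",") + "}";
}

// Hypothetical helper: hash the tool name together with its canonical args.
function sketchArgsDigest(toolName: string, args: Json): string {
  const payload = canonicalize({ tool: toolName, args });
  return createHash("sha256").update(payload).digest("hex");
}
```

Array order is kept as-is because it is usually meaningful in tool arguments; only object keys are sorted.
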
- POST /api/internal/approvals/gate-check (gateway-token): the server-side
  security boundary; consume-or-block + audits approval.requested/consumed
- GET /api/approvals: requester's pending confirmations for the chat card
- POST /api/approvals/[id]/decision: self-confirm approve/deny + audit
- approval.* audit family (PII-free: argsDigest only; summary stays operational)
- decideGate gains a created flag to keep approval.requested idempotent
Real-DB integration tested (132 integration, 5962 unit, tsc clean).
…Config

Lets the agents PATCH route persist the per-agent confirm-tool policy.
- New internal hook-only plugin: before_tool_call gate calls gate-check for
  every tool and fails closed if the approval service is unreachable
- gate-check now short-circuits ungated tools server-side (always-fresh policy,
  no plugin cache); plugin needs zero policy state
- Wiring: KNOWN/INTERNAL_PLUGINS, manifest map, regenerateOpenClawConfig
  emission (always enabled), Dockerfile.pinchy, entrypoint.sh expected list
- Drift guards updated (schema-sync classification, plugins.allow order)
- E2E: gated tool blocked → approval.requested → user grants → approval.granted
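
The fail-closed behaviour of the gate can be sketched as a small client that treats every failure as a block. This is a sketch under assumptions, not the plugin's code: the `checkGate` name, the bearer-token header, the request body, the `{ decision: "allow" }` response shape, and the 5-second timeout are all hypothetical. Only the endpoint path and the fail-closed rule come from the commit message.

```typescript
interface GateVerdict {
  allow: boolean;
  reason: string;
}

interface ToolCall {
  agentId: string;
  senderId: string;
  sessionKey: string;
  toolName: string;
  args: unknown;
}

// Minimal fetch-like type so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Ask the gate-check endpoint whether a tool call may run.
// Any error, timeout, bad status, or unexpected body results in a block.
async function checkGate(
  fetchImpl: FetchLike,
  baseUrl: string,
  gatewayToken: string,
  call: ToolCall,
): Promise<GateVerdict> {
  try {
    const res = await fetchImpl(`${baseUrl}/api/internal/approvals/gate-check`, {
      method: "POST",
      headers: { "content-type": "application/json", authorization: `Bearer ${gatewayToken}` },
      body: JSON.stringify(call),
      signal: AbortSignal.timeout(5000), // assumed timeout
    });
    if (!res.ok) return { allow: false, reason: "gate-check returned an error status" };
    const body = await res.json();
    const allowed =
      typeof body === "object" && body !== null &&
      (body as { decision?: unknown }).decision === "allow";
    return allowed
      ? { allow: true, reason: "allowed by gate-check" }
      : { allow: false, reason: "confirmation required" };
  } catch {
    // Network failure, timeout, or invalid JSON: fail closed.
    return { allow: false, reason: "approval service unreachable" };
  }
}
```

The only path that returns `allow: true` is an explicit allow decision from the server, which keeps the plugin free of policy state.
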
New 'Require confirmation before' section in the agent permissions tab
(admin-only): per-tool checkboxes + a 'Use recommended' button that selects
the agent's powerful tools. Persists to pluginConfig['pinchy-approvals'].
A polling ApprovalsInbox (mounted in the app shell) lists the acting user's
pending tool-call confirmations with Approve/Deny. After approving, the user
asks the agent to proceed and the gate consumes the ticket.

Deferred (follow-up): native requireApproval inline-in-chat card + auto
proceed-injection (needs an openclaw-node extension).
Covers enabling per-agent confirmation, the approve/deny flow, the audit
lifecycle, and the server-side enforcement guarantee. Pinchy voice.