Smith

Smith is an etcd-backed, Kubernetes-native autonomous orchestration platform.

Purpose

Smith coordinates autonomous execution loops as a state machine stored in etcd. It is designed to:

  • accept operator ingress requests (direct, GitHub issues, PRD tasks),
  • convert each request into a deterministic loop lifecycle (unresolved -> overwriting -> synced|flatline|cancelled),
  • enforce safe concurrency with per-loop locks and revision-checked state transitions,
  • run loop workers as Kubernetes Jobs and preserve execution evidence (journal, handoff, override, audit).
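
The lifecycle and revision-checked transitions above can be sketched as a small compare-and-swap state machine. This is a minimal illustration, not Smith's implementation: the `Store` type is a hypothetical in-memory stand-in for etcd, and the transition table assumes the terminal states are reachable only from `overwriting`, which the lifecycle arrow notation suggests but does not confirm.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// LoopState names follow the lifecycle listed in the README.
type LoopState string

const (
	Unresolved  LoopState = "unresolved"
	Overwriting LoopState = "overwriting"
	Synced      LoopState = "synced"
	Flatline    LoopState = "flatline"
	Cancelled   LoopState = "cancelled"
)

// allowed is an assumed transition table derived from the arrow notation.
var allowed = map[LoopState][]LoopState{
	Unresolved:  {Overwriting},
	Overwriting: {Synced, Flatline, Cancelled},
}

var (
	ErrStaleRevision     = errors.New("stale revision")
	ErrIllegalTransition = errors.New("illegal transition")
)

type record struct {
	state    LoopState
	revision int64
}

// Store is a hypothetical in-memory stand-in for etcd.
type Store struct {
	mu    sync.Mutex
	loops map[string]record
}

func NewStore() *Store { return &Store{loops: make(map[string]record)} }

// Create registers a loop in the unresolved state at revision 1.
func (s *Store) Create(id string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.loops[id] = record{state: Unresolved, revision: 1}
	return 1
}

// Transition applies next only if the caller saw the latest revision
// and the move is legal; on success it returns the new revision.
func (s *Store) Transition(id string, expectedRev int64, next LoopState) (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.loops[id]
	if !ok {
		return 0, fmt.Errorf("loop %q not found", id)
	}
	if cur.revision != expectedRev {
		return cur.revision, ErrStaleRevision
	}
	for _, to := range allowed[cur.state] {
		if to == next {
			s.loops[id] = record{state: next, revision: cur.revision + 1}
			return cur.revision + 1, nil
		}
	}
	return cur.revision, ErrIllegalTransition
}

func main() {
	s := NewStore()
	rev := s.Create("loop-1")
	rev, err := s.Transition("loop-1", rev, Overwriting)
	fmt.Println(rev, err)
	// A writer still holding revision 1 is rejected instead of clobbering state.
	_, err = s.Transition("loop-1", 1, Synced)
	fmt.Println(err)
}
```

The revision check is what makes concurrent writers safe here: a writer that read an older revision fails and must re-read before retrying.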

This repository focuses on the MVP control plane, deployment assets, and verification/test harnesses.

Philosophy

Smith intentionally does not personify agents.

  • Agents are modeled as homogeneous and omnicapable execution units, not distinct personalities.
  • Anthropomorphizing agents is treated as an implementation constraint that reduces operational flexibility and performance.
  • The target model is uniform replication: many equivalent workers, same contract, same capabilities, horizontally scalable.
  • The system design favors role-neutral orchestration primitives (state, locks, jobs, handoffs) over persona-specific behavior.
  • The platform direction is informed by Ralph, marcus/sidecar, marcus/td, and related projects, but is engineered to scale beyond a single developer machine.

Smith also moves beyond a single-machine file-system model by using etcd + Kubernetes as the control substrate, so execution can scale across distributed compute while preserving deterministic state and traceability.

Architecture Summary

Smith is split into control-plane and data-plane components.

Control Plane

  • smith-api (cmd/smith-api): HTTP API for loop create/list/get, GitHub + PRD ingress, operator override actions, provider auth lifecycle, and cost reporting.
  • smith-core (cmd/smith-core): watches unresolved loop state in etcd, acquires per-loop locks, transitions loop state, and schedules replica Jobs in Kubernetes.
  • smithctl (cmd/smithctl): kubectl-style operator CLI for loop and prd resources with context/config support and scriptable JSON output.
  • smith (cmd/smith): PRD launcher CLI (smith --prd) for interactive PRD generation before build loops.
  • smith-console (console/ + Helm deployment): operator UI/runtime assets.
  • etcd: authoritative source of truth for anomalies, loop lifecycle state, locks, journal events, handoffs, overrides, and audit records.
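
To make the per-loop locking performed by smith-core concrete, here is a minimal sketch of a lock registry keyed by loop ID. It is an assumption-laden illustration: `LoopLocks` and the holder names are hypothetical, the registry lives in process memory rather than in etcd, and a distributed version would also need some form of expiry so a crashed holder cannot keep a lock forever.

```go
package main

import (
	"fmt"
	"sync"
)

// LoopLocks is a hypothetical in-memory registry with one lock per loop ID.
type LoopLocks struct {
	mu    sync.Mutex
	owner map[string]string // loop ID -> current holder
}

func NewLoopLocks() *LoopLocks {
	return &LoopLocks{owner: make(map[string]string)}
}

// TryAcquire takes the lock for loopID without blocking.
// It reports false if another holder already has it.
func (l *LoopLocks) TryAcquire(loopID, holder string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, taken := l.owner[loopID]; taken {
		return false
	}
	l.owner[loopID] = holder
	return true
}

// Release frees the lock only if holder currently owns it.
func (l *LoopLocks) Release(loopID, holder string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	cur, ok := l.owner[loopID]
	if !ok || cur != holder {
		return false
	}
	delete(l.owner, loopID)
	return true
}

func main() {
	locks := NewLoopLocks()
	fmt.Println(locks.TryAcquire("loop-1", "core-a")) // first holder wins
	fmt.Println(locks.TryAcquire("loop-1", "core-b")) // same loop is refused
	fmt.Println(locks.TryAcquire("loop-2", "core-b")) // other loops are independent
	fmt.Println(locks.Release("loop-1", "core-a"))
	fmt.Println(locks.TryAcquire("loop-1", "core-b")) // free again after release
}
```

Locking per loop rather than globally lets unrelated loops progress in parallel while keeping each loop's transitions serialized.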

Data Plane

  • smith-replica (cmd/smith-replica): Kubernetes Job worker that executes loop work, appends journal entries, writes handoff output, and finalizes loop state.

Deployment and Ops Assets

  • Helm chart: helm/smith
  • Dockerfiles: docker/
  • Core implementation: internal/source/
  • Supporting docs: docs/
  • Make-first local workflow: run make help to list the available targets (doctor/bootstrap/cluster/deploy/test/teardown)

Key API Endpoints

Implemented today:

  • POST /v1/loops creates loops directly, either one at a time or in batch.
  • POST /v1/loops supports environment profiles (preset, mise, container_image, dockerfile) with server-side validation/defaulting.
  • POST /v1/ingress/github/issues ingests one or more GitHub issues into loop specs.
  • POST /v1/ingress/prd ingests markdown/json PRD inputs into loop specs.
  • GET /v1/loops/{id} and GET /v1/loops/{id}/journal for state and traceability.
  • GET /v1/loops/{id}/runtime to resolve namespace/pod/container attachability for console terminal control.
  • POST /v1/loops/{id}/control/attach, /command, and /detach for authenticated operator interactive terminal control.
  • POST /v1/control/override for operator state overrides with reason/audit trail.
  • POST /v1/auth/codex/connect/start|complete, GET /v1/auth/codex/status, and POST /v1/auth/codex/disconnect for provider auth lifecycle.
  • GET /v1/reporting/cost?loop_id={id} for loop token/cost aggregation from journal metadata.
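
As an illustration of calling the read-only endpoints above, the following sketch builds requests for the journal and cost routes. Only the paths and the loop_id query parameter come from the list; the base URL `http://localhost:8080`, the loop ID, and the helper names are assumptions, and authentication and response schemas are not covered because they are not documented here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"path"
)

// newJournalRequest builds GET /v1/loops/{id}/journal against baseURL.
func newJournalRequest(baseURL, loopID string) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, err
	}
	u.Path = path.Join("/", u.Path, "v1", "loops", loopID, "journal")
	return http.NewRequest(http.MethodGet, u.String(), nil)
}

// newCostRequest builds GET /v1/reporting/cost?loop_id={id} against baseURL.
func newCostRequest(baseURL, loopID string) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, err
	}
	u.Path = path.Join("/", u.Path, "v1", "reporting", "cost")
	q := u.Query()
	q.Set("loop_id", loopID) // query-escaped by Encode
	u.RawQuery = q.Encode()
	return http.NewRequest(http.MethodGet, u.String(), nil)
}

func main() {
	// Assumed address of a locally reachable smith-api; adjust as needed.
	const base = "http://localhost:8080"

	journal, err := newJournalRequest(base, "loop-1")
	if err != nil {
		panic(err)
	}
	cost, err := newCostRequest(base, "loop-1")
	if err != nil {
		panic(err)
	}
	// Print the URLs instead of sending, so the sketch runs without a server.
	fmt.Println(journal.Method, journal.URL)
	fmt.Println(cost.Method, cost.URL)
}
```

Passing either request to `http.DefaultClient.Do` would send it; the sketch stops short of that so it stays runnable offline.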

Aspirational (planned, not implemented yet):

  • GET /v1/loops/{id}/handoffs, GET /v1/loops/{id}/overrides, and GET /v1/loops/{id}/trace for end-to-end execution evidence.
  • GET /v1/audit?loop_id={id} for immutable operator/auth action audit records.

Terminal control API contracts, required auth/RBAC permissions, and troubleshooting are documented in:

Local Git Hooks

Install repo-managed hooks:

make hooks-install

Hook behavior:

  • pre-commit: quick checks (go test ./cmd/...)
  • pre-push: full gate (make build + make test)

Temporarily bypass hooks if needed:

SKIP_GIT_HOOKS=1 git commit -m "..."
SKIP_GIT_HOOKS=1 git push

Frontend Playwright Tests

Install frontend test dependencies:

npm install

Run Playwright tests for the console UI:

npm run test:frontend
# or
make test-frontend

Artifacts are written under output/playwright/ (HTML report + failure artifacts).

Run tests against a deployed, port-forwarded console UI:

kubectl -n smith-system port-forward svc/smith-smith-console 3000:3000
npm run test:frontend:live

Technology Stack and Thanks

See the dedicated documentation page for:

  • the current technology stack (Kubernetes, Helm, vCluster, etcd, Go, Docker, and related tooling),
  • acknowledgments and inspiration credits, including marcus/td, marcus/sidecar, and Ralph.

Reference: docs/technology-stack-and-thanks.md

PRD-First Loop Workflow

Generate a PRD JSON (interactive agent session):

smith --prd "Build issue-driven loop execution with terminal attach support" --out .agents/tasks/prd.json

If a PRD already exists at .agents/tasks/prd.json, replica issue/prompt workflows skip PRD generation and move straight to iterative build.
