
feat: repo materialization worker (disabled for now) #5555

Open
mchalapuk wants to merge 17 commits into main from feat/git-materialization-worker

Conversation

@mchalapuk

@mchalapuk mchalapuk commented Jun 17, 2026

Contributor

Adds the git-backed canvas materialization engine (pkg/canvas/materialize) and the RepositoryMaterializer worker that consumes repository-branch-updated messages and projects git branch tips into workflow_versions.

Details

  • Wires the worker into the server behind START_REPOSITORY_MATERIALIZER and bumps supergit to 0.1.2 (branch/merge support).
  • Ships disabled: the flag is "no" in the dev compose and unset in prod/release configs, so it can be enabled later without code changes.
  • No schema changes — all materialization state lives on the existing workflow_versions table.

Adds the git-backed canvas materialization engine (pkg/canvas/materialize)
and the RepositoryMaterializer worker that consumes repository-branch-updated
messages and projects git branch tips into the database. Wires the worker into
the server behind START_REPOSITORY_MATERIALIZER and bumps supergit to 0.1.2 in
the dev compose for branch/merge support.

The worker is disabled for now: START_REPOSITORY_MATERIALIZER is "no" in the dev
compose and is left unset in the release/prod configs (the server only starts it
when the flag is "yes"), so it can be enabled later without code changes.
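The gating described above can be sketched as follows. This is a minimal illustration, assuming the flag is read from the process environment; the helper name `materializerEnabled` is hypothetical and not taken from the PR. Only the exact value "yes" starts the worker, so an unset variable and "no" both leave it off.

```go
package main

import (
	"fmt"
	"os"
)

// materializerEnabled is a hypothetical helper: it reports whether the
// worker should start. Only the exact value "yes" enables it; an unset
// variable, "no", or any other value keeps the worker disabled.
func materializerEnabled(getenv func(string) string) bool {
	return getenv("START_REPOSITORY_MATERIALIZER") == "yes"
}

func main() {
	if materializerEnabled(os.Getenv) {
		fmt.Println("starting RepositoryMaterializer worker")
		return
	}
	fmt.Println("RepositoryMaterializer worker disabled")
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the check testable without mutating the real environment.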

All materialization state (commit SHA, branch, status, error) is tracked on the
existing workflow_versions table, so no additional table is required. Model
status strings are mapped to the typed MaterializationStatus proto enum at the
message boundary.
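A rough sketch of mapping model status strings to a typed enum at the message boundary. The enum values and the status strings below are assumptions made for illustration; the real MaterializationStatus proto enum and the stored strings may differ.

```go
package main

import "fmt"

// MaterializationStatus stands in for the generated proto enum.
// These values are hypothetical.
type MaterializationStatus int32

const (
	StatusUnspecified MaterializationStatus = iota
	StatusPending
	StatusMaterialized
	StatusFailed
)

// statusFromModel maps a stored status string to the typed enum.
// Unknown strings map to StatusUnspecified instead of failing, so an
// unexpected value in the table never breaks message publishing.
func statusFromModel(s string) MaterializationStatus {
	switch s {
	case "pending":
		return StatusPending
	case "materialized":
		return StatusMaterialized
	case "failed":
		return StatusFailed
	default:
		return StatusUnspecified
	}
}

func main() {
	for _, s := range []string{"pending", "materialized", "failed", "bogus"} {
		fmt.Printf("%q -> %d\n", s, statusFromModel(s))
	}
}
```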

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
@superplanehq-integration


👋 Commands for maintainers:

  • /sp start - Start an ephemeral machine (takes ~30s)
  • /sp stop - Stop a running machine (auto-executed on PR close)

@mchalapuk mchalapuk marked this pull request as draft June 17, 2026 20:07
@mchalapuk mchalapuk self-assigned this Jun 17, 2026
Move git provider reads (ListBranches, Head, LoadRepoSnapshot) out of the
database transactions in the materialization engine so no git RPC is held
across a pooled DB connection. The sync functions now load git state first and
open their own short transaction for DB writes only; MaterializeLive and
MaterializeDraft take a pre-loaded snapshot. The live publisher's node Setup()
work stays inside the transaction by design (atomic with the node rows).
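The ordering this commit describes (git reads first, then a short transaction for DB writes only) can be sketched like this. The `GitProvider` and `Store` interfaces, the `Snapshot` fields, and `syncBranch` are illustrative stand-ins, not the PR's actual types; the fakes only record the order of operations.

```go
package main

import (
	"errors"
	"fmt"
)

// Snapshot is a hypothetical, already-loaded view of a branch tip.
type Snapshot struct {
	Branch string
	SHA    string
}

// GitProvider and Store are illustrative interfaces.
type GitProvider interface {
	LoadRepoSnapshot(branch string) (Snapshot, error)
}

type Store interface {
	InTx(fn func() error) error // runs fn inside one short transaction
	SaveVersion(s Snapshot) error
}

// syncBranch performs the git RPC before opening the transaction, so no
// pooled DB connection is held while waiting on git.
func syncBranch(git GitProvider, store Store, branch string) error {
	snap, err := git.LoadRepoSnapshot(branch) // phase 1: git read, no tx open
	if err != nil {
		return fmt.Errorf("load snapshot: %w", err)
	}
	return store.InTx(func() error { // phase 2: DB writes only
		return store.SaveVersion(snap)
	})
}

// recorder and the fakes below capture the order of operations.
type recorder struct{ events []string }

type fakeGit struct {
	r    *recorder
	fail bool
}

func (g fakeGit) LoadRepoSnapshot(branch string) (Snapshot, error) {
	g.r.events = append(g.r.events, "git-read")
	if g.fail {
		return Snapshot{}, errors.New("git unavailable")
	}
	return Snapshot{Branch: branch, SHA: "abc123"}, nil
}

type fakeStore struct{ r *recorder }

func (s fakeStore) InTx(fn func() error) error {
	s.r.events = append(s.r.events, "tx-begin")
	if err := fn(); err != nil {
		s.r.events = append(s.r.events, "tx-rollback")
		return err
	}
	s.r.events = append(s.r.events, "tx-commit")
	return nil
}

func (s fakeStore) SaveVersion(Snapshot) error {
	s.r.events = append(s.r.events, "db-write")
	return nil
}

func main() {
	r := &recorder{}
	err := syncBranch(fakeGit{r: r}, fakeStore{r: r}, "main")
	fmt.Println(r.events, err)
}
```

If the git read fails, no transaction is opened at all, which is the property the commit is after.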

Also wire fully async draft-branch deletion: the worker now reconciles the DB
projection when it receives a DELETED notification.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>

@mchalapuk mchalapuk left a comment

Contributor Author


needs work

Comment thread pkg/canvas/gitrepo/backfill.go Outdated
Comment thread pkg/canvas/materialize/draft.go Outdated
Comment thread pkg/canvas/materialize/live.go Outdated
Comment thread pkg/canvas/materialize/live_materializer.go
Comment thread pkg/canvas/materialize/materialize.go Outdated
Comment thread pkg/canvas/materialize/seed_yaml.go Outdated
Comment thread pkg/canvas/materialize/seed_yaml.go Outdated
Comment thread pkg/canvas/materialize/seed_yaml.go Outdated
Comment thread pkg/canvas/materialize/sync_draft_branch.go Outdated
Comment thread pkg/canvas/materialize/request.go Outdated
mchalapuk and others added 6 commits June 18, 2026 02:07
Move the canvas spec file names and draft branch naming/lookup helpers
(IsDraftBranch, DefaultDraftBranchName, OwnerFromDraftBranchName,
UniqueDraftBranchName, GitBranchExists) out of the materialize engine into
a new leaf package pkg/canvas/gitref. The engine and the git-repo
seeding/backfill code can both depend on it without a cycle. Display-name
helpers stay in the engine since only it uses them.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Move seed_repository.go, seed_yaml.go and backfill.go out of the
materialization engine into a new pkg/canvas/gitrepo package, which owns
building the git representation of a canvas (YAML encoders, initial repo
seeding, and pre-git-first backfill). Rename the YAML builders to the
*ToBytes convention (CanvasYAMLToBytes, ConsoleYAMLToBytes,
EmptyConsoleYAMLToBytes, ConsoleYAMLFromVersionToBytes) so callers read
clearly as "encode to bytes".

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Move RequestBranchMaterialization and the test-only in-process
materializer toggle (SetInProcessMaterializer/inProcessMaterializer) out
of the materialize engine and into pkg/workers, next to the consumer that
runs the work. The engine keeps only the BranchMaterializer core (renamed
file to branch_materializer.go). The producer publishes the pending
status enum directly, so no status mapping needs to be exported.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
…-only

Delete the unused Materializer/MaterializeFromGit/Mode indirection; the
BranchMaterializer is the single entry point. MaterializeDraft now only
handles draft branches (it is only ever called for drafts): drop the
dead non-draft commit-SHA lookup branch and the always-true draft guards,
and reject non-draft branches defensively.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
…n head

SyncLiveFromGit now always reads the current main HEAD and, when given an
explicit notification SHA that no longer matches it, skips as a no-op
instead of projecting a superseded commit. The newer commit that is now
main's HEAD carries its own notification and materializes current state.
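The stale-HEAD rule above reduces to a small decision function. This is a sketch under one assumption: an empty notification SHA means the caller did not pin a commit. The name `shouldMaterializeLive` is hypothetical.

```go
package main

import "fmt"

// shouldMaterializeLive decides whether a notification for main should be
// projected. notifiedSHA is assumed to be empty when no explicit commit
// was given in the notification.
func shouldMaterializeLive(currentHead, notifiedSHA string) bool {
	if notifiedSHA == "" {
		return true // no explicit SHA: project whatever HEAD is now
	}
	// A SHA that no longer matches HEAD has been superseded; skip it and
	// let the newer commit's own notification do the work.
	return notifiedSHA == currentHead
}

func main() {
	fmt.Println(shouldMaterializeLive("b2", "b2")) // notification matches HEAD
	fmt.Println(shouldMaterializeLive("b2", "a1")) // superseded commit
	fmt.Println(shouldMaterializeLive("b2", ""))   // no explicit SHA
}
```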

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Trim the engine's public surface to BranchMaterializer (the only type the
worker needs). Unexport the sync orchestrators, live/draft writers,
snapshot loader and type, reconcile helpers, and their options structs.
The snapshot-loader test becomes an internal package test so it can still
exercise the loader directly.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Add unit tests for the git-first materialization engine: live/draft branch
writers, stale-HEAD live skip, draft-deletion reconciliation, and the
canvas repository backfill (main + draft spec files).

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
@mchalapuk mchalapuk force-pushed the feat/git-materialization-worker branch from 3de8310 to 70f171a Compare June 18, 2026 01:25
mchalapuk and others added 2 commits June 18, 2026 04:00
Switch the draft-deletion reconciler and the repository backfill to read
git_branch (the canonical branch field) instead of branch_name. Writers
still populate branch_name because the draft CHECK constraint and the model
lookup/upsert helpers depend on it until the column-drop migration lands in
a separate PR. Emulate that migration's git_branch backfill in the backfill
test fixture.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Replace the owner-encoded draft branch scheme (DefaultDraftBranchName /
OwnerFromDraftBranchName / UniqueDraftBranchName) with a single
NewDraftBranchName helper that returns drafts/<random-uuid>. Draft branches
no longer encode ownership; the materializer now records the pusher as the
draft owner. This also matches how the model already names draft branches and
fixes OwnerFromDraftBranchName misreading the random branch UUID as a user ID.
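A minimal sketch of the drafts/<random-uuid> naming scheme. The function bodies are assumptions: the real helper most likely uses a UUID library, whereas this version builds a version 4 UUID from crypto/rand to stay dependency-free. The non-empty-suffix check in `IsDraftBranch` is also an assumption.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"strings"
)

const draftBranchPrefix = "drafts/"

// NewDraftBranchName returns drafts/<random-uuid>. The name carries no
// owner information; ownership is recorded elsewhere.
func NewDraftBranchName() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%s%x-%x-%x-%x-%x",
		draftBranchPrefix, b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// IsDraftBranch reports whether a branch name uses the draft prefix and
// has a non-empty suffix.
func IsDraftBranch(branch string) bool {
	return strings.HasPrefix(branch, draftBranchPrefix) &&
		len(branch) > len(draftBranchPrefix)
}

func main() {
	name, err := NewDraftBranchName()
	if err != nil {
		panic(err)
	}
	fmt.Println(name, IsDraftBranch(name), IsDraftBranch("main"))
}
```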

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
@mchalapuk mchalapuk changed the title feat: add repository materialization worker (disabled by default) feat: repo materialization worker (disabled by default) Jun 18, 2026
mchalapuk and others added 2 commits June 18, 2026 21:05
Split the materialization engine into per-branch files
(live_materializer, draft_materializer, draft_deleter) that own their
own transactions and idempotency checks, route error/deletion signals
through canvas_version_updated, gate the worker to actionable requests,
and fold the gitref vocabulary (spec file names, draft branch prefix,
IsDraftBranch) into the models package.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
Drop seed/backfill helpers that have no production callers yet and
inline minimal git fixture setup in materialize tests instead.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
@mchalapuk mchalapuk force-pushed the feat/git-materialization-worker branch from 4104cd7 to c68ecec Compare June 18, 2026 20:26
mchalapuk and others added 3 commits June 18, 2026 22:27
Draft version rows now rely solely on git_branch when materializing
drafts and persisting materialization errors.

Signed-off-by: Maciej Chałapuk <maciej@chalapuk.pl>
Co-authored-by: Cursor <cursoragent@cursor.com>
@mchalapuk mchalapuk marked this pull request as ready for review June 18, 2026 20:53
@mchalapuk mchalapuk requested review from lucaspin and shiroyasha June 18, 2026 20:54
@mchalapuk mchalapuk changed the title feat: repo materialization worker (disabled by default) feat: repo materialization worker (disabled for now) Jun 18, 2026
@superplane-gh-integration-9000


PR Risk Review

Risk: 42/100 (medium)
Review approved: Yes
Check passed: Yes

Summary

New repository materialization worker feature that is explicitly disabled (START_REPOSITORY_MATERIALIZER=no) with proper advisory locking, idempotency guards, comprehensive tests, and a clean separation of concerns.

Concerns

  • The worker is disabled by default, but the in-process materializer path (SetInProcessMaterializer) uses a package-level global guarded by a mutex, which could be error-prone if test parallelism is not carefully managed.
  • The semaphore in the worker is hardcoded to 25 concurrent materializations with no configuration knob; under high load this could either be too aggressive or insufficient.
  • Error handling in sweepDeletedDraftBranchesFromGit is best-effort (logged but swallowed); if branch deletion reconciliation consistently fails it could leave stale projections indefinitely without alerting.
  • The advisory lock key uses an FNV-64a hash, which has a small collision probability across all (canvasID, branch) pairs; unlikely but worth documenting.
  • context.Background() is used inside the consumer handler rather than propagating a cancellable context from the worker lifecycle, which could delay graceful shutdown.

Recommended reviewers: superplanehq/backend-team
