fix(clickhouse): Task processor startup queries ClickHouse migrations on every boot #7863

Merged: khvn26 merged 1 commit into main from fix/task-processor-clickhouse-migration-wait on Jun 24, 2026

khvn26 (Member) commented Jun 24, 2026

Thanks for submitting a PR! Please check the boxes below:

  • I have read the Contributing Guide.
  • I have added information to docs/ if required so people know about the feature.
  • I have filled in the "Changes" section below.
  • I have filled in the "How did you test this code" section below.

Changes

Closes https://flagsmith.sentry.io/issues/7508759326/

The task processor entrypoint gated startup on `waitfordb --waitfor 30 --migrations --database clickhouse`. Constructing a `MigrationExecutor` against the ClickHouse connection reads the applied-migrations state, which forces a query against ClickHouse on every startup. On ClickHouse Cloud this wakes an idle data warehouse, and while ALTERs were in flight the query intermittently failed with `Code: 517` (replica metadata version lag). That failure surfaced as the Sentry error above and made the task-processor startup gate stricter than it needed to be.
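To make the cost concrete, here is a minimal sketch of why a migration check cannot avoid a database round trip. It uses the standard-library `sqlite3` module as a stand-in for the ClickHouse connection, and the helper names (`applied_migrations`, `has_pending_migrations`, `count_statements_for_check`) and the `analytics` app label are hypothetical. This is not Django's `MigrationExecutor` or the project's `waitfordb` implementation; it only illustrates that reading applied-migrations state means issuing a query.

```python
from __future__ import annotations

import sqlite3


def applied_migrations(conn: sqlite3.Connection) -> set[tuple[str, str]]:
    """Read the applied-migrations state. This always costs a query."""
    rows = conn.execute("SELECT app, name FROM django_migrations").fetchall()
    return set(rows)


def has_pending_migrations(
    conn: sqlite3.Connection, known: list[tuple[str, str]]
) -> bool:
    """Compare migrations known on disk with those recorded as applied."""
    return bool(set(known) - applied_migrations(conn))


def count_statements_for_check() -> tuple[bool, list[str]]:
    """Run one migration check and record every SQL statement it issues."""
    # Assumption: an in-memory SQLite database stands in for ClickHouse.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE django_migrations (app TEXT, name TEXT)")
    conn.execute("INSERT INTO django_migrations VALUES ('analytics', '0001_initial')")
    conn.commit()

    statements: list[str] = []
    conn.set_trace_callback(statements.append)  # trace only the check itself
    pending = has_pending_migrations(conn, [("analytics", "0001_initial")])
    conn.set_trace_callback(None)
    conn.close()
    return pending, statements


if __name__ == "__main__":
    pending, statements = count_statements_for_check()
    print(f"pending={pending}, statements issued by the check: {statements}")
```

Even when nothing is pending, the check still issues a statement against the database, which is the behaviour that wakes an idle warehouse on every boot.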

The check is redundant: the entrypoint already runs `migrate --database clickhouse` before `run_task_processor`, so ClickHouse migrations are applied before the task processor starts. This PR removes the ClickHouse migration wait from task-processor startup while keeping the Postgres, analytics, and task_processor migration waits intact.
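The redundancy argument can be sketched as a small ordering model. The function names (`migration_waits`, `startup_steps`), the `default` alias for Postgres, and the assumption that every wait uses the same flags are all hypothetical; the real entrypoint is not reproduced here. The sketch only shows that once `migrate --database clickhouse` has run earlier in the same sequence, a later migration wait on that database adds a query but no new guarantee.

```python
from __future__ import annotations

# Hypothetical database aliases. Only "clickhouse", "analytics" and
# "task_processor" are named in the PR description; "default" is assumed
# here to be the Postgres alias.
WAITED_DATABASES = ["default", "analytics", "task_processor", "clickhouse"]


def migration_waits(
    databases: list[str], migrated_earlier: set[str]
) -> list[list[str]]:
    """Build waitfordb commands, skipping databases already migrated in this run."""
    return [
        ["waitfordb", "--waitfor", "30", "--migrations", "--database", alias]
        for alias in databases
        if alias not in migrated_earlier
    ]


def startup_steps(migrated_earlier: set[str]) -> list[list[str]]:
    """Order the steps: apply migrations, wait on the rest, start the processor."""
    migrate = [["migrate", "--database", alias] for alias in sorted(migrated_earlier)]
    waits = migration_waits(WAITED_DATABASES, migrated_earlier)
    return migrate + waits + [["run_task_processor"]]


if __name__ == "__main__":
    for step in startup_steps({"clickhouse"}):
        print(" ".join(step))
```

In this model the ClickHouse wait disappears because the earlier `migrate` step already guarantees the state the wait would have checked, while the other three waits are untouched.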

This PR also drops the equivalent `waitfordb --database clickhouse` call from the `wait-for-db` Make target used by the test setup.

How did you test this code?

Manually traced the failing path from the Sentry event's `sys.argv` (`waitfordb --migrations --database clickhouse`) through `waitfordb.py` and confirmed the ClickHouse migrations are already applied earlier in the entrypoint via `migrate_clickhouse_db`. Verified the remaining Postgres/analytics/task_processor waits are unchanged.

… on every boot

The task processor entrypoint gated startup on a waitfordb migration check
against the ClickHouse connection, forcing a query that wakes an idle CH Cloud
DWH and intermittently failed with Code: 517 (replica metadata lag). The check
is redundant since migrations are already applied earlier in the entrypoint.

khvn26 requested a review from a team as a code owner on June 24, 2026 12:43
khvn26 requested review from emyller and removed request for a team on June 24, 2026 12:43
vercel Bot commented Jun 24, 2026

The latest updates on your projects.

3 Skipped Deployments
Project Deployment Actions Updated (UTC)
docs Ignored Ignored Preview Jun 24, 2026 12:43pm
flagsmith-frontend-preview Ignored Ignored Preview Jun 24, 2026 12:43pm
flagsmith-frontend-staging Ignored Ignored Preview Jun 24, 2026 12:43pm

github-actions Bot added the api label (Issue related to the REST API) on Jun 24, 2026
github-actions Bot (Contributor) commented Jun 24, 2026

Docker builds report

| Image | Build Status | Security report |
| --- | --- | --- |
| ghcr.io/flagsmith/flagsmith-e2e:pr-7863 | Finished ✅ | Skipped |
| ghcr.io/flagsmith/flagsmith-frontend:pr-7863 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith-api-test:pr-7863 | Finished ✅ | Skipped |
| ghcr.io/flagsmith/flagsmith-api:pr-7863 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith:pr-7863 | Finished ✅ | Results |
| ghcr.io/flagsmith/flagsmith-private-cloud:pr-7863 | Finished ✅ | Results |

github-actions Bot added the fix label on Jun 24, 2026
codecov Bot commented Jun 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.60%. Comparing base (4203446) to head (21d2dad).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7863   +/-   ##
=======================================
  Coverage   98.60%   98.60%           
=======================================
  Files        1473     1473           
  Lines       57688    57688           
=======================================
  Hits        56882    56882           
  Misses        806      806           

github-actions Bot (Contributor) commented Jun 24, 2026

Playwright Test Results (oss - depot-ubuntu-latest-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  35.6 seconds
commit  21d2dad
info  🔄 Run: #17778 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  45.5 seconds
commit  21d2dad
info  🔄 Run: #17778 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

passed  4 passed

Details

stats  4 tests across 4 suites
duration  32.5 seconds
commit  21d2dad
info  🔄 Run: #17778 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

passed  1 passed

Details

stats  1 test across 1 suite
duration  1 minute, 10 seconds
commit  21d2dad
info  🔄 Run: #17778 (attempt 1)

@github-actions

Contributor

Visual Regression

19 screenshots compared.

khvn26 merged commit 5b2f0cc into main on Jun 24, 2026
33 checks passed
khvn26 deleted the fix/task-processor-clickhouse-migration-wait branch on June 24, 2026 12:51