fix(clickhouse): Task processor startup queries ClickHouse migrations on every boot #7863
Merged
… on every boot The task processor entrypoint gated startup on a waitfordb migration check against the ClickHouse connection, forcing a query that wakes an idle CH Cloud DWH and intermittently failed with Code: 517 (replica metadata lag). The check is redundant since migrations are already applied earlier in the entrypoint. beep boop
gagantrivedi
approved these changes
Jun 24, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main    #7863   +/-   ##
=======================================
  Coverage   98.60%   98.60%
=======================================
  Files        1473     1473
  Lines       57688    57688
=======================================
  Hits        56882    56882
  Misses        806      806
Visual Regression: 19 screenshots compared.
Thanks for submitting a PR! Please check the boxes below:
docs/ if required so people know about the feature.

Changes
Closes https://flagsmith.sentry.io/issues/7508759326/
The task processor entrypoint gated startup on `waitfordb --waitfor 30 --migrations --database clickhouse`. Constructing a `MigrationExecutor` against the ClickHouse connection reads the applied-migrations state, which forces a query against ClickHouse on every startup. On CH Cloud this wakes an idle data warehouse, and while ALTERs were in flight it intermittently failed with `Code: 517` (replica metadata version lag), surfacing as the Sentry error above and an overly aggressive task-processor startup gate.

The check is redundant: the entrypoint already runs `migrate --database clickhouse` before `run_task_processor`, so ClickHouse migrations are applied before the task processor starts. This PR removes the ClickHouse migration wait from task-processor startup while keeping the Postgres, analytics, and task_processor migration waits intact.

Also drops the equivalent `waitfordb --database clickhouse` from the `wait-for-db` Make target used by the test setup.

How did you test this code?
Manually traced the failing path from the Sentry event's `sys.argv` (`waitfordb --migrations --database clickhouse`) through `waitfordb.py` and confirmed the ClickHouse migrations are already applied earlier in the entrypoint via `migrate_clickhouse_db`. Verified the remaining Postgres/analytics/task_processor waits are unchanged.
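
For reviewers who want the startup ordering at a glance, here is a rough sketch of the flow described under Changes. It is not the actual entrypoint script. `manage` is a hypothetical stub that only prints the command it would run, `startup` is an illustrative wrapper, and the flags and database aliases on the three remaining waits are assumptions. Only the ClickHouse commands are taken from the description above.

```shell
#!/bin/sh
# Sketch only: `manage` is a hypothetical stand-in that prints the command
# instead of invoking Django's manage.py.
manage() {
    echo "manage.py $*"
}

# Applies ClickHouse migrations (command as quoted in the PR description).
migrate_clickhouse_db() {
    manage migrate --database clickhouse
}

# Task-processor startup after this PR. The flags and database aliases on
# the remaining waits are assumptions made for illustration.
run_task_processor() {
    manage waitfordb --waitfor 30 --migrations
    manage waitfordb --waitfor 30 --migrations --database analytics
    manage waitfordb --waitfor 30 --migrations --database task_processor
    # Removed by this PR (it forced a ClickHouse query on every boot):
    # manage waitfordb --waitfor 30 --migrations --database clickhouse
    manage run_task_processor
}

# Hypothetical wrapper: migrations run first, then the task processor
# starts without touching ClickHouse again.
startup() {
    migrate_clickhouse_db
    run_task_processor
}
```

Calling `startup` prints the commands in the order they would run. The single ClickHouse touchpoint is the `migrate` step, so the startup gate itself no longer issues a query that could wake an idle warehouse.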