feat(Segment Membership): Seed identities on Beta opt-in by khvn26 · Pull Request #7899 · Flagsmith/flagsmith

khvn26 · 2026-06-27T15:41:44Z

Thanks for submitting a PR! Please check the boxes below:

I have read the Contributing Guide.
I have added information to docs/ if required so people know about the feature.
I have filled in the "Changes" section below.
I have filled in the "How did you test this code" section below.

Changes

Closes #7471.

In this PR, we convert backfill_identities_to_clickhouse to a seed backfill expected to run once per org.

seed_organisation_identities(org_id): on-demand, one-shot seed for a single org. Stamps inserted_at at scan start so that a CDC write landing mid-scan wins ReplacingMergeTree dedup.
reconcile_segment_membership_seeds (hourly): fires the seed once per allowed org, tracked by the SegmentMembershipSeed model.
refresh_all_segment_counts (every 6 hours by default): refreshes cached segment counts so they track CDC identity churn between segment edits.
Django admin action: forces a re-seed by clearing SegmentMembershipSeed for a given org.
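To illustrate why stamping inserted_at at scan start matters, here is a minimal pure-Python model of version-based dedup. It is a sketch, not the PR's code: the Row type, the key format, and replacing_merge are hypothetical, and it only assumes that inserted_at acts as the ReplacingMergeTree version column, where the highest version wins per key.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Row:
    identity_key: str  # hypothetical stand-in for the table's sorting key
    traits: str
    inserted_at: datetime  # assumed to be the version column


def replacing_merge(rows: list[Row]) -> dict[str, Row]:
    """Keep, per key, the row with the highest version.

    On a version tie the later-inserted row wins.
    """
    merged: dict[str, Row] = {}
    for row in rows:
        current = merged.get(row.identity_key)
        if current is None or row.inserted_at >= current.inserted_at:
            merged[row.identity_key] = row
    return merged


scan_start = datetime(2026, 6, 27, 12, 0, tzinfo=timezone.utc)

# A CDC write lands 30 seconds into the scan with fresh data.
cdc_write = Row("env1:alice", "plan=pro", scan_start + timedelta(seconds=30))

# The seed read a stale snapshot and inserts it after the CDC write,
# but versions it with the scan-start time.
seed_row = Row("env1:alice", "plan=free", scan_start)
merged = replacing_merge([cdc_write, seed_row])

# Counter-example: versioning the seed row at insert time lets stale data win.
naive_seed_row = Row("env1:alice", "plan=free", scan_start + timedelta(seconds=60))
merged_naive = replacing_merge([cdc_write, naive_seed_row])
```

In this model the CDC row survives the merge when the seed is versioned at scan start, while the naive variant overwrites it with the stale snapshot.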

All tasks that run the expensive ClickHouse queries are debounced.
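As a rough sketch of what debouncing means here: a refresh for a project is only enqueued when none is already pending. The class below is a hypothetical in-memory model; the PR's gate presumably checks the task queue for a pending task rather than a local set.

```python
class MembershipRefreshGate:
    """Hypothetical in-memory debounce gate, for illustration only."""

    def __init__(self) -> None:
        self._pending: set[int] = set()
        self.enqueued: list[int] = []

    def enqueue(self, project_id: int) -> bool:
        """Queue a refresh unless one is already pending for the project."""
        if project_id in self._pending:
            return False  # duplicate request, skipped
        self._pending.add(project_id)
        self.enqueued.append(project_id)
        return True

    def mark_done(self, project_id: int) -> None:
        """Called when the refresh finishes, so a later request can queue again."""
        self._pending.discard(project_id)


gate = MembershipRefreshGate()
results = [gate.enqueue(7), gate.enqueue(7), gate.enqueue(8)]
gate.mark_done(7)
results.append(gate.enqueue(7))
```

The second request for project 7 is dropped while the first is pending, and a new one is accepted once the first completes.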

Review complexity: 2/5.

How did you test this code?

Added new tests and modified existing ones where needed. Will test extensively in staging.

beep boop Claude-Session: https://claude.ai/code/session_01EgZ5iHpDASZzCapiHRxHLB

vercel · 2026-06-27T15:41:50Z

3 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
docs	Ignored	Preview	Jun 29, 2026 4:34pm
flagsmith-frontend-preview	Ignored	Preview	Jun 29, 2026 4:34pm
flagsmith-frontend-staging	Ignored	Preview	Jun 29, 2026 4:34pm

Run seed tests against live ClickHouse with a mocked Dynamo source, inline the pending-task check, assert the whole skip event, use the ClickHouseIdentityRow type, and replace the management command with a Django admin re-seed action that clears the seed marker.

for more information, see https://pre-commit.ci

…t-in

Replace the daily all-org backfill with an on-demand seed_organisation_identities(organisation_id) task and a lightweight reconcile_segment_membership_seeds tick that fires it once per opted-in org, tracked by a SegmentMembershipSeed marker. Seeded rows are versioned at scan start so a CDC write landing mid-scan wins ReplacingMergeTree dedup.

Keep a count-refresh safety net as a separate recurring refresh_all_segment_counts task (cadence via SEGMENT_MEMBERSHIP_REFRESH_INTERVAL_HOURS, default 6) so cached counts track CDC identity churn between segment edits.

All refresh enqueues (edit-triggered, seed fan-out, and the recurring sweep) route through enqueue_membership_refresh, the single flag and debounce gate, so a project never has duplicate refreshes queued.

Add a Django admin action to force a re-seed by clearing the marker.

github-actions · 2026-06-29T03:31:08Z

Docker builds report

Image	Build Status	Security report
`ghcr.io/flagsmith/flagsmith-api-test:pr-7899`	Finished ✅	Skipped
`ghcr.io/flagsmith/flagsmith-e2e:pr-7899`	Finished ✅	Skipped
`ghcr.io/flagsmith/flagsmith-frontend:pr-7899`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith-api:pr-7899`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith:pr-7899`	Finished ✅	Results ✅
`ghcr.io/flagsmith/flagsmith-private-cloud:pr-7899`	Finished ✅	Results ✅

github-actions · 2026-06-29T10:39:58Z

Playwright Test Results (oss - depot-ubuntu-latest-16)

1 failed

Details

1 test across 1 suite
14.5 seconds
ed910aa
📦 Artifacts: View test results and HTML report
🔄 Run: #17916 (attempt 1)

Failed tests

firefox › tests/flag-tests.pw.ts › Flag Tests › Feature flags can be created, toggled, edited, and deleted across environments @oss

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

5 passed

Details

5 tests across 3 suites
36 seconds
ed910aa
🔄 Run: #17916 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

1 failed
1 passed

Details

2 tests across 2 suites
47.2 seconds
ed910aa
📦 Artifacts: View test results and HTML report
🔄 Run: #17916 (attempt 1)

Failed tests

firefox › tests/environment-permission-test.pw.ts › Environment Permission Tests › Environment-level permissions control access to features, identities, and segments @enterprise

Playwright Test Results (oss - depot-ubuntu-latest-16)

4 passed

Details

4 tests across 3 suites
34.5 seconds
ed910aa
🔄 Run: #17916 (attempt 2)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

4 passed

Details

4 tests across 3 suites
40.4 seconds
ed910aa
🔄 Run: #17916 (attempt 2)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

5 passed

Details

5 tests across 4 suites
47.4 seconds
ed910aa
🔄 Run: #17916 (attempt 2)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

4 passed

Details

4 tests across 3 suites
11.2 seconds
ed910aa
🔄 Run: #17916 (attempt 2)

Playwright Test Results (oss - depot-ubuntu-latest-16)

4 passed

Details

4 tests across 3 suites
34.9 seconds
6c9bb4c
🔄 Run: #17917 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

3 passed

Details

3 tests across 2 suites
11.8 seconds
6c9bb4c
🔄 Run: #17917 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

4 passed

Details

4 tests across 4 suites
11.9 seconds
6c9bb4c
🔄 Run: #17917 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

2 passed

Details

2 tests across 2 suites
46.4 seconds
6c9bb4c
🔄 Run: #17917 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-16)

5 passed

Details

5 tests across 4 suites
42.9 seconds
07fa649
🔄 Run: #17928 (attempt 1)

Playwright Test Results (oss - depot-ubuntu-latest-arm-16)

5 passed

Details

5 tests across 4 suites
46 seconds
07fa649
🔄 Run: #17928 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)

3 passed

Details

3 tests across 3 suites
59.7 seconds
07fa649
🔄 Run: #17928 (attempt 1)

Playwright Test Results (private-cloud - depot-ubuntu-latest-16)

6 passed

Details

6 tests across 5 suites
11.9 seconds
07fa649
🔄 Run: #17928 (attempt 1)

github-actions · 2026-06-29T10:42:06Z

Visual Regression

17 screenshots compared. See report for details.
View full report

…t touch Prometheus

Admin modules load during Django's admin autodiscovery on every manage.py startup. Importing tasks (→ metrics) at module top instantiated Prometheus collectors at import time, writing to PROMETHEUS_MULTIPROC_DIR in a context without write access and crashing container boot. Import the task lazily in the action, matching the pattern in services.py.
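The lazy-import pattern described in this commit can be sketched in isolation. Everything below is hypothetical scaffolding: fake_seed_tasks stands in for the real tasks module, and reseed_action stands in for the admin action. The point is only that a function-level import defers the module body, and its side effects, until the first call.

```python
import sys
import tempfile
from pathlib import Path

# Write a throwaway module to disk so the deferred import is observable.
# Its body stands in for import-time side effects such as collector setup.
module_dir = tempfile.mkdtemp()
Path(module_dir, "fake_seed_tasks.py").write_text(
    "def seed_organisation_identities(organisation_id):\n"
    "    return f'seed enqueued for organisation {organisation_id}'\n"
)
sys.path.insert(0, module_dir)


def reseed_action(organisation_id: int) -> str:
    # Deferred import: the module is loaded on the first call, not when the
    # file defining this action is itself imported.
    from fake_seed_tasks import seed_organisation_identities

    return seed_organisation_identities(organisation_id)


loaded_before_call = "fake_seed_tasks" in sys.modules
result = reseed_action(42)
loaded_after_call = "fake_seed_tasks" in sys.modules
```

Defining reseed_action does not load the module; only calling it does, which is why a startup path that merely imports the admin module would not trigger the side effects.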

…ATION_LOGGERS

The seed, reconcile and refresh tasks log under the `segment_membership` logger, which wasn't in the default APPLICATION_LOGGERS list, so their structlog events were suppressed in deployed environments.

emyller

Looks good overall. Just a few topics to cover.

emyller · 2026-06-29T18:29:30Z

+    log = logger.bind(organisation__id=organisation_id)
    if not settings.CLICKHOUSE_ENABLED:
-        logger.info("backfill.skipped", reason="clickhouse_not_configured")
+        log.info("seed.skipped", reason="clickhouse_not_configured")


I don't think log.info is the right reaction here. Since seed_organisation_identities is only called deliberately, being unable to continue should raise, or at least emit log.warning if we need it not to raise.
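The two options in this comment could look roughly like the sketch below. It uses the standard library logger rather than the structlog logger from the diff, and SeedNotPossibleError, check_seed_preconditions, and the raise_on_failure switch are hypothetical names introduced for illustration.

```python
import logging

logger = logging.getLogger("segment_membership")


class SeedNotPossibleError(RuntimeError):
    """Hypothetical exception name, not taken from the PR."""


def check_seed_preconditions(
    organisation_id: int,
    *,
    clickhouse_enabled: bool,
    raise_on_failure: bool = True,
) -> bool:
    """Return True when seeding can proceed; otherwise raise or warn."""
    if clickhouse_enabled:
        return True
    if raise_on_failure:
        # Option 1: fail loudly, since the seed was requested on purpose.
        raise SeedNotPossibleError(
            f"cannot seed organisation {organisation_id}: clickhouse_not_configured"
        )
    # Option 2: keep going, but at warning level rather than info.
    logger.warning(
        "seed.skipped organisation_id=%s reason=clickhouse_not_configured",
        organisation_id,
    )
    return False


try:
    check_seed_preconditions(1, clickhouse_enabled=False)
    raised = False
except SeedNotPossibleError:
    raised = True

skipped = check_seed_preconditions(
    1, clickhouse_enabled=False, raise_on_failure=False
)
```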

emyller · 2026-06-29T18:34:55Z

+        return
+
+    organisation = Organisation.objects.get(pk=organisation_id)
+    if not is_membership_enabled(organisation):


It seems the only path calling this task already owns this responsibility and has organisations filtered by this flag. I also believe this feature check doesn't belong here, but in the caller, unless the live state — i.e. when the task runs — is relevant.

emyller · 2026-06-29T18:56:15Z

+    project_ids = Segment.live_objects.filter(
+        project__organisation=organisation
+    ).values_list("project_id", flat=True)
+    for project in Project.objects.filter(id__in=project_ids).iterator():


I sense there's a bit of unnecessary stretch here in the data structure:

In filtering organisation on the join [with project] level

In using a server-side cursor for a likely small queryset

Suggested change

-    project_ids = Segment.live_objects.filter(
-        project__organisation=organisation
-    ).values_list("project_id", flat=True)
-    for project in Project.objects.filter(id__in=project_ids).iterator():
+    projects_with_live_segments = Project.objects.filter(
+        organisation=organisation,
+    ).filter(Exists(Segment.live_objects.filter(project=OuterRef("pk"))))
+    for project in projects_with_live_segments:
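For readers less familiar with the two query shapes, the sketch below reproduces them in plain SQL against an in-memory SQLite database. The schema is a simplified assumption (the is_live column stands in for whatever Segment.live_objects filters on), and the SQL Django actually generates will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (id INTEGER PRIMARY KEY, organisation_id INTEGER);
    CREATE TABLE segment (
        id INTEGER PRIMARY KEY, project_id INTEGER, is_live INTEGER
    );
    INSERT INTO project VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO segment VALUES (1, 1, 1), (2, 1, 1), (3, 2, 0), (4, 3, 1);
    """
)

# Original shape: filter segments through a join to project, collect the
# project ids, then look the projects up again by id.
via_join = conn.execute(
    """
    SELECT id FROM project WHERE id IN (
        SELECT s.project_id FROM segment s
        JOIN project p ON p.id = s.project_id
        WHERE p.organisation_id = ? AND s.is_live = 1
    ) ORDER BY id
    """,
    (10,),
).fetchall()

# Suggested shape: filter projects by organisation directly and keep those
# for which at least one live segment exists.
via_exists = conn.execute(
    """
    SELECT p.id FROM project p
    WHERE p.organisation_id = ?
      AND EXISTS (
          SELECT 1 FROM segment s
          WHERE s.project_id = p.id AND s.is_live = 1
      )
    ORDER BY p.id
    """,
    (10,),
).fetchall()
```

Both queries select the same projects; the second filters on the project table directly, without routing the organisation filter through the segment join.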

emyller · 2026-06-29T18:58:00Z

                        project__id=project.id,
                        environment__id=env.id,
                    )
                    continue


I appreciate this may be out of scope, but I'd be interested in learning what went wrong here when seeding fails.

emyller · 2026-06-29T19:47:04Z

+    run_every=timedelta(hours=1),
+    timeout=timedelta(minutes=5),
+)
+def reconcile_segment_membership_seeds() -> None:


I understand why this exists, but I expect the number of organisations it caters for to keep shrinking until we EOL the feature flag and it reaches zero.

Please update this with a TODO for future cleanup if you agree.

After reading on and learning about refresh_all_segment_counts, I wonder if that task could also include opted_in - seeded organisations?

I think I'd prefer a temporary branch in a permanent task, rather than this temporary task, unless we want to collect any specific data?
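One way to picture the "temporary branch in a permanent task" idea is the sketch below. It is a simplification under stated assumptions: organisations are plain integer ids, the two callables are hypothetical stand-ins for enqueuing the seed and refreshing counts, and the real task works per project rather than per organisation.

```python
from typing import Callable


def refresh_all_segment_counts(
    opted_in_org_ids: set[int],
    seeded_org_ids: set[int],
    enqueue_seed: Callable[[int], None],
    refresh_counts: Callable[[int], None],
) -> None:
    # TODO: remove this branch once the feature flag is EOL and every
    # opted-in organisation has been seeded.
    for org_id in sorted(opted_in_org_ids - seeded_org_ids):
        enqueue_seed(org_id)

    # Permanent part: refresh counts for organisations that are already seeded.
    for org_id in sorted(opted_in_org_ids & seeded_org_ids):
        refresh_counts(org_id)


seed_calls: list[int] = []
refresh_calls: list[int] = []
refresh_all_segment_counts(
    {1, 2, 3}, {2}, seed_calls.append, refresh_calls.append
)
```

With organisations 1, 2 and 3 opted in and only 2 seeded, the temporary branch seeds 1 and 3 while the permanent part refreshes 2.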

emyller · 2026-06-29T19:50:46Z

+    """Refresh counts for every project with a live segment on a slow cadence so
+    cached counts track identities ingested via CDC between segment edits.
+    `enqueue_membership_refresh` is the single flag + debounce gate.
+    """


Suggested change

-    """Refresh counts for every project with a live segment on a slow cadence so
-    cached counts track identities ingested via CDC between segment edits.
-    `enqueue_membership_refresh` is the single flag + debounce gate.
-    """
+    """Refresh counts for every project with a live segment"""

test(Segment Membership): Add red tests for opt-in seed and reconciler

25d229c

github-actions bot added the api (Issue related to the REST API) and feature (New feature or request) labels · Jun 27, 2026

khvn26 and others added 2 commits June 27, 2026 17:55

[pre-commit.ci] auto fixes from pre-commit.com hooks

a24e80d

for more information, see https://pre-commit.ci