fix(deployment): Auto-initialize MongoDB replica set on fresh volumes#2270
goynam wants to merge 4 commits into
…ox conflicts (fixes y-scope#2259) When running multiple compression/query worker pods, all workers registered with the same static Celery node name causing RabbitMQ pidbox queue conflicts and periodic crashes. Append $(HOSTNAME) (which Kubernetes sets to the pod name) to make each worker's node name unique. Also bumps Helm chart version to 0.3.2-dev.2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fixes y-scope#2257) When the results-cache (MongoDB) pod's PVC is deleted and recreated, MongoDB starts in RSGhost state because it has replication.replSetName configured but no replica set initialized. This causes all clients to fail with "No servers match selector Primary()". Add a postStart lifecycle hook that waits for mongod to accept connections, then calls rs.initiate() if the replica set has not already been initialized. The hook is idempotent: on subsequent restarts where the replica set is already configured, rs.status() succeeds and initiation is skipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Walkthrough
Helm chart version bumped to 0.3.2-dev.3. Compression and query worker Deployments now source replica counts from
Changes
Helm Deployment Configuration Updates
Sequence Diagram(s)
sequenceDiagram
participant ResultsCacheContainer as results-cache container
participant Mongosh as mongosh
participant MongoDB as MongoDB (localhost:27017)
ResultsCacheContainer->>Mongosh: run db.runCommand('ping') loop (up to 30 attempts)
Mongosh-->>ResultsCacheContainer: ping succeeds
ResultsCacheContainer->>Mongosh: execute rs.status()
alt rs.status() throws
Mongosh->>MongoDB: rs.initiate({_id: "rs0", members: [{_id: 0, host: "localhost:27017"}]})
MongoDB-->>Mongosh: replica set initialized
else rs.status() OK
Mongosh-->>ResultsCacheContainer: replica set already configured
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/deployment/package-helm/templates/compression-worker-deployment.yaml (1)
30-91: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
$(HOSTNAME) will not be expanded: Celery receives the literal string.
Kubernetes only substitutes $(VAR_NAME) in command/args using variables declared in spec.containers[].env. HOSTNAME is set by the container runtime, not by Kubernetes, so it is not part of the substitution scope. Because the command invokes python3 directly (array format, no shell), shell expansion does not happen either. The Celery worker will end up with a literal node name compression-worker@$(HOSTNAME), which is the same constant for every pod and will not fix the pidbox conflict the PR aims to address.
Declare a downward-API env var so Kubernetes can substitute it:
🔧 Proposed fix: inject pod name via the downward API
  env:
+   - name: "HOSTNAME"
+     valueFrom:
+       fieldRef:
+         fieldPath: "metadata.name"
    - {{- include "clp.celeryBrokerUrlEnvVar" . | nindent 14 }}
Alternatively, run the command via a shell:
  command: [
-   "python3", "-u",
-   "/opt/clp/lib/python3/site-packages/bin/celery",
-   "-A", "job_orchestration.executor.compress",
-   "worker",
-   "--concurrency", "{{ .Values.workerConcurrency }}",
-   "--loglevel", "WARNING",
-   "-Q", "compression",
-   "-n", "compression-worker@$(HOSTNAME)"
- ]
+   "bash", "-c",
+   "exec python3 -u /opt/clp/lib/python3/site-packages/bin/celery -A job_orchestration.executor.compress worker --concurrency {{ .Values.workerConcurrency }} --loglevel WARNING -Q compression -n compression-worker@${HOSTNAME}"
+ ]
Note: The same issue affects query-worker-deployment.yaml (line 90).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml` around lines 30 - 91, The command currently uses a literal compression-worker@$(HOSTNAME) which Kubernetes will not expand because HOSTNAME isn't defined in spec.containers[].env and the array form bypasses a shell; add a downward API env var (e.g., name POD_NAME using fieldRef fieldPath: metadata.name) in the env block and replace the "-n", "compression-worker@$(HOSTNAME)" entry in the command array with the pod-name variable (compression-worker@$(POD_NAME)) so Kubernetes will substitute the actual pod name at runtime; apply the same change pattern to the query-worker deployment command as well.
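For readers who want to see the distinction locally, here is a rough shell analogue, not Kubernetes itself. It contrasts an argument passed verbatim (as in the exec-form command array) with one expanded by a shell (as in the bash -c alternative). The variable names are illustrative, the snippet relies on bash setting HOSTNAME on its own, and it does not emulate Kubernetes' $(VAR_NAME) substitution.

```shell
# Exec-form analogue: single quotes keep the text verbatim, just as an
# argv entry reaches the program when no shell is involved.
literal_name='compression-worker@$(HOSTNAME)'

# Shell-form analogue: bash expands ${HOSTNAME} before the program sees it.
expanded_name="$(bash -c 'printf "%s" "compression-worker@${HOSTNAME}"')"

printf 'exec form : %s\n' "$literal_name"
printf 'shell form: %s\n' "$expanded_name"
```

Every process that receives the exec-form string sees the same constant, which is why it cannot distinguish one worker from another.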
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml`:
- Around line 43-61: The chart currently forces mongod into a replica set via
the postStart hook (see lifecycle.postStart exec invoking mongosh and
rs.initiate), which couples fresh-PVC recovery to a successful hook; change to
Option A by removing replication.replSetName and any replicaSet= URI param so
the StatefulSet's mongod runs standalone (no rs.initiate in
lifecycle.postStart), and instead document in the chart docs/values that for
multi-AZ clusters users should set volumeBindingMode: WaitForFirstConsumer or
use EFS to avoid PVC recreation across AZs; update
templates/results-cache-statefulset.yaml to drop replSet config and add the
storage/AZ guidance to the chart README/values comment.
- Around line 54-59: The rs.initiate call uses a hardcoded "localhost:27017"
which breaks stable member identity; change the mongosh --host invocation and
the members[0].host value in the rs.initiate payload to use the StatefulSet pod
FQDN (pod.service.namespace.svc.cluster.local with the :27017 port) injected via
Helm/downward API or template rendering instead of "localhost:27017", so the
rs.initiate and mongosh commands reference the stable pod DNS rather than
localhost.
- Around line 49-61: The postStart hook's unbounded wait and blanket suppression
of errors must be fixed: replace the infinite until loop around mongosh with a
bounded retry (e.g., loop with a max_attempts / timeout and exit non‑zero if not
reachable) so the container fails fast if mongod never becomes available, and
remove the global redirection/|| true on the rs.initiate() step so failures
surface; additionally change the JS catch around rs.status() to only handle
NotYetInitialized (error code 94) and explicitly print/initiate errors to stderr
(so kubernetes logs/events contain diagnostics) while ensuring rs.initiate() is
attempted only when code 94 is observed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: be587f7a-838c-49d2-8eae-3b7ba813d557
📒 Files selected for processing (4)
tools/deployment/package-helm/Chart.yaml
tools/deployment/package-helm/templates/compression-worker-deployment.yaml
tools/deployment/package-helm/templates/query-worker-deployment.yaml
tools/deployment/package-helm/templates/results-cache-statefulset.yaml
    lifecycle:
      postStart:
        exec:
          command:
            - "bash"
            - "-c"
            - |
              until mongosh --host localhost:27017 --eval "db.runCommand('ping')" \
                  --quiet 2>/dev/null; do
                sleep 1
              done
              mongosh --host localhost:27017 --eval '
                try { rs.status(); } catch(e) {
                  rs.initiate({
                    _id: "rs0",
                    members: [{_id: 0, host: "localhost:27017"}]
                  });
                }
              ' 2>/dev/null || true
🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift
Consider Option A (standalone mongod) per the linked issue, and document storage/AZ guidance.
Per the discussion in #2257, Option A — dropping replication.replSetName and any replicaSet= URI parameter so mongod runs as a standalone — was the recommended fix because nothing in the chart appears to require replication semantics on a single-replica StatefulSet. Option B (this PR) works, but it permanently couples every fresh-PVC scenario to a successful postStart hook. Worth confirming the trade-off is intentional; if there's no current consumer of replica-set features (oplog, change streams, multi-member reads), Option A removes a whole failure mode rather than papering over it.
Separately, the linked issue notes that AZ mismatches are what cause the PVC recreation in the first place. Adding a brief note to chart docs/values about volumeBindingMode: WaitForFirstConsumer (or recommending EFS for multi-AZ clusters) would prevent users from hitting this path repeatedly.
    mongosh --host localhost:27017 --eval '
      try { rs.status(); } catch(e) {
        rs.initiate({
          _id: "rs0",
          members: [{_id: 0, host: "localhost:27017"}]
        });
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider using the StatefulSet pod DNS as the RS member host.
Hardcoding localhost:27017 works for a replicas: 1 StatefulSet, but the linked issue's recommendation was to initiate with the service/pod DNS host so the member identity is stable and portable. If the chart ever changes replicas or if mongod's advertised host needs to match clients' connection string, a localhost member entry can cause subtle replication/discovery problems and forces an rs.reconfig() later.
A drop-in stable alternative for a StatefulSet pod is <pod>.<service>.<namespace>.svc.cluster.local, which can be passed in via the downward API:
♻️ Optional — use pod FQDN for the RS member
  env:
+   - name: "POD_NAME"
+     valueFrom:
+       fieldRef:
+         fieldPath: "metadata.name"
+   - name: "POD_NAMESPACE"
+     valueFrom:
+       fieldRef:
+         fieldPath: "metadata.namespace"

- mongosh --host localhost:27017 --eval '
-   try { rs.status(); } catch(e) {
-     rs.initiate({
-       _id: "rs0",
-       members: [{_id: 0, host: "localhost:27017"}]
-     });
-   }
- ' 2>/dev/null || true
+ HOST="${POD_NAME}.{{ include "clp.fullname" . }}-results-cache.${POD_NAMESPACE}.svc.cluster.local:27017"
+ mongosh --host localhost:27017 --eval "
+   try { rs.status(); } catch(e) {
+     rs.initiate({
+       _id: 'rs0',
+       members: [{_id: 0, host: '${HOST}'}]
+     });
+   }
+ "
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml`
around lines 54 - 59, The rs.initiate call uses a hardcoded "localhost:27017"
which breaks stable member identity; change the mongosh --host invocation and
the members[0].host value in the rs.initiate payload to use the StatefulSet pod
FQDN (pod.service.namespace.svc.cluster.local with the :27017 port) injected via
Helm/downward API or template rendering instead of "localhost:27017", so the
rs.initiate and mongosh commands reference the stable pod DNS rather than
localhost.
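As a rough illustration of the host string this comment suggests, the sketch below assembles a StatefulSet pod's DNS name from its parts. The function name and the example pod, service, and namespace values are hypothetical. The pattern assumes the StatefulSet is governed by a headless Service and that the cluster uses the default cluster.local domain.

```shell
# rs_member_host POD SERVICE NAMESPACE [PORT]
# Prints <pod>.<service>.<namespace>.svc.cluster.local:<port>.
# PORT defaults to 27017, the port used throughout this PR.
rs_member_host() {
    local pod="$1"
    local service="$2"
    local namespace="$3"
    local port="${4:-27017}"
    printf '%s.%s.%s.svc.cluster.local:%s\n' "$pod" "$service" "$namespace" "$port"
}

# Example with made-up names; inside the hook, the pod name and namespace
# would come from the downward API env vars (POD_NAME, POD_NAMESPACE).
rs_member_host "results-cache-0" "results-cache" "default"
```

Keeping the assembly in one helper means the mongosh connection string and the members[0].host value cannot drift apart.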
…hook
- Add retry counter (max 30 attempts) to the postStart until loop to prevent infinite hangs if MongoDB never accepts connections
- Remove 2>/dev/null and || true from the rs.initiate() call so errors are visible in pod logs for debugging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
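The bounded wait described in this commit could look roughly like the helper below. This is a minimal sketch, not the code in the PR: the function name retry_until and its argument order are assumptions made for illustration, and the mongosh invocation appears only in a comment because it needs a running mongod.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
# Returns 0 on success and 1 once the attempt budget is exhausted.
retry_until() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            # Report on stderr so the failure shows up in pod logs/events.
            echo "gave up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Hypothetical use inside the hook (requires mongosh on PATH):
#   retry_until 30 1 mongosh --host localhost:27017 --quiet \
#       --eval "db.runCommand('ping')" >/dev/null || exit 1
```

Returning non-zero instead of looping forever lets the hook fail visibly when mongod never comes up.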
♻️ Duplicate comments (2)
tools/deployment/package-helm/templates/results-cache-statefulset.yaml (2)
61-68: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Narrow the catch to NotYetInitialized (code 94).
The try { rs.status(); } catch(e) { rs.initiate(...) } block still treats any rs.status() failure as "uninitialized". Now that errors are no longer suppressed and a non-zero exit from the hook will kill the container, a transient/auth/network failure on rs.status() will trigger an rs.initiate() that can fail for a different reason (e.g., already initialized with a different config), fail the postStart hook, and put the pod into a restart loop, which is more disruptive than the original silent path. Restricting the recovery to e.codeName === "NotYetInitialized" || e.code === 94 keeps the intended fresh-PVC bootstrap and re-throws everything else.
🛡️ Proposed narrowing
  mongosh --host localhost:27017 --eval '
-   try { rs.status(); } catch(e) {
-     rs.initiate({
-       _id: "rs0",
-       members: [{_id: 0, host: "localhost:27017"}]
-     });
-   }
+   try {
+     rs.status();
+   } catch (e) {
+     if (e.codeName === "NotYetInitialized" || e.code === 94) {
+       rs.initiate({
+         _id: "rs0",
+         members: [{_id: 0, host: "localhost:27017"}]
+       });
+     } else {
+       throw e;
+     }
+   }
  '
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml` around lines 61 - 68, The catch in the mongosh postStart hook currently treats any rs.status() error as "uninitialized"; change the catch body to only call rs.initiate(...) when the error indicates NotYetInitialized by checking e.codeName === "NotYetInitialized" || e.code === 94, and re-throw (or let the error propagate) for all other errors so transient/auth/network failures don't trigger an inappropriate rs.initiate() or mask real failures; update the mongosh script around the rs.status() call accordingly to perform this conditional check before invoking rs.initiate().
63-66: 🧹 Nitpick | 🔵 Trivial | 💤 Low value
Hardcoded localhost:27017 RS member host: prior concern still open.
The member host is still localhost:27017. As noted on the previous commit, using the pod FQDN (<pod>.<service>.<namespace>.svc.cluster.local:27017) injected via the downward API yields a stable member identity and avoids future rs.reconfig() churn if replicas ever changes or if clients expect the advertised host to match the connection string. Not a blocker for replicas: 1, but worth resolving here rather than after the chart ships a baked-in rs0 config.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml` around lines 63 - 66, The rs.initiate() call currently hardcodes the member host to "localhost:27017"; change it to use the pod FQDN injected via the downward API so the replica set member advertises a stable, cluster-scoped identity. Locate the rs.initiate invocation and replace the member host literal with the environment/template variable that contains "<pod>.<service>.<namespace>.svc.cluster.local:27017" (the value provided via the downward API), ensuring the string is injected at render/runtime so the replica set config uses the actual pod FQDN rather than "localhost".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5ceb5845-9ffe-459b-b119-e9cc359337d8
📒 Files selected for processing (1)
tools/deployment/package-helm/templates/results-cache-statefulset.yaml
changes seem not related to the current PR
changes seem not related to the current PR
this will likely cause conflicts with the results-cache-indices-creator Job. we have been working on some fix locally over the past week. @hoophalab may coordinate the submission of the changes (either in this PR or a separate PR)
The compression-worker and query-worker node name changes belong in PR y-scope#2269, not this MongoDB replica set PR. Reverting to main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
♻️ Duplicate comments (2)
tools/deployment/package-helm/templates/compression-worker-deployment.yaml (1)
9-9: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Changes unrelated to PR objectives and breaking change for existing deployments.
The replica count path change from .Values.compressionWorker.replicas to .Values.scheduling.compressionWorker.replicas is unrelated to the MongoDB initialization fix described in the PR objectives. Additionally, this is a breaking change that will cause deployment failures for users with existing values.yaml files that define compressionWorker.replicas at the top level instead of under scheduling.
Verify that .Values.scheduling.compressionWorker.replicas is properly defined:
#!/bin/bash
# Description: Check if the new value path is defined in values.yaml and verify migration path

# Check if old path exists in values files
echo "=== Checking for old path usage ==="
rg -n "compressionWorker:" --type yaml -A 5 -g 'values*.yaml'

# Check if new scheduling path exists
echo -e "\n=== Checking for new scheduling path ==="
rg -n "scheduling:" --type yaml -A 10 -g 'values*.yaml'

# Look for any documentation about this breaking change
echo -e "\n=== Checking for migration documentation ==="
rg -n -i "scheduling.*replicas|migration|breaking" -g 'README*' -g 'CHANGELOG*' -g 'values*.yaml' -g '*.md'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml` at line 9, The template change moved the replica path to .Values.scheduling.compressionWorker.replicas which is a breaking change; update compression-worker-deployment.yaml to accept the old top-level key as a fallback so existing values.yaml continue to work: use the replicas field (replicas: {{ ... }}) to prefer .Values.scheduling.compressionWorker.replicas but fall back to .Values.compressionWorker.replicas (or a default) when the scheduling path is not defined, and add a short comment referencing both keys so maintainers know why both exist.
tools/deployment/package-helm/templates/query-worker-deployment.yaml (1)
10-10: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Changes unrelated to PR objectives and breaking change for existing deployments.
The replica count path change from .Values.queryWorker.replicas to .Values.scheduling.queryWorker.replicas is unrelated to the MongoDB initialization fix described in the PR objectives. Additionally, this is a breaking change that will cause deployment failures for users with existing values.yaml files that define queryWorker.replicas at the top level instead of under scheduling.
Verify that .Values.scheduling.queryWorker.replicas is properly defined:
#!/bin/bash
# Description: Check if the new value path is defined in values.yaml and verify migration path

# Check if old path exists in values files
echo "=== Checking for old path usage ==="
rg -n "queryWorker:" --type yaml -A 5 -g 'values*.yaml'

# Check if new scheduling path exists
echo -e "\n=== Checking for new scheduling path ==="
rg -n "scheduling:" --type yaml -A 10 -g 'values*.yaml'

# Look for any documentation about this breaking change
echo -e "\n=== Checking for migration documentation ==="
rg -n -i "scheduling.*replicas|migration|breaking" -g 'README*' -g 'CHANGELOG*' -g 'values*.yaml' -g '*.md'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml` at line 10, The change replaced the replica value path from .Values.queryWorker.replicas to .Values.scheduling.queryWorker.replicas which is a breaking config change; restore backward compatibility by updating the deployment template that sets "replicas:" to accept both keys (use a conditional/fallback lookup so it prefers .Values.scheduling.queryWorker.replicas but falls back to .Values.queryWorker.replicas or a sensible default) and update docs/values.yaml to document the new path if you intend to migrate users.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 26e1e1d9-883a-4762-8d04-62bac4af0e99
📒 Files selected for processing (2)
tools/deployment/package-helm/templates/compression-worker-deployment.yaml
tools/deployment/package-helm/templates/query-worker-deployment.yaml
Summary
- Add a postStart lifecycle hook to the results-cache (MongoDB) StatefulSet that automatically initializes the replica set (rs0) when the PVC is fresh/recreated.
- The hook calls rs.initiate() only if rs.status() fails (i.e., the replica set is not yet configured). This is idempotent and safe on normal restarts.
- Bump the Helm chart version to 0.3.2-dev.3.
Problem
When the results-cache pod's PVC is deleted and recreated (e.g., during disaster recovery or testing), MongoDB starts in RSGhost state because mongod.conf has replication.replSetName: "rs0" but no replica set has been initiated on the fresh data directory. All clients then fail with "No servers match selector Primary()".
Fixes #2257
Test plan
- kubectl exec into the results-cache pod and confirm rs.status() shows a healthy primary
🤖 Generated with Claude Code
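The manual check in the test plan could be scripted along these lines. This is a sketch under assumptions: the function name check_primary is hypothetical, the pod name is supplied by the caller, and it assumes mongosh prints the bare value of rs.status().myState (1 means PRIMARY) when run with --quiet.

```shell
# check_primary POD [NAMESPACE]
# Succeeds only if the mongod inside POD reports replica set state 1 (PRIMARY).
check_primary() {
    local pod="$1"
    local namespace="${2:-default}"
    local state
    # myState is the numeric member state of the node being queried.
    state="$(kubectl exec -n "$namespace" "$pod" -- \
        mongosh --host localhost:27017 --quiet \
        --eval 'rs.status().myState')" || return 1
    [ "$state" = "1" ]
}

# Hypothetical usage (pod name is a placeholder):
#   check_primary "<results-cache-pod>" "default" && echo "primary is healthy"
```

On a fresh volume where the hook did not run, rs.status() throws, mongosh exits non-zero, and the function reports failure.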