fix(controller): prevent NodeAgent restarts from ESO metadata updates #1999

kaovilai · 2025-10-21T21:30:51Z

Fixes ESO-234: NodeAgent was restarting every ~30s when External Secrets
Operator managed the cloud-credentials secret. ESO's metadata-only updates
were triggering unnecessary DPA reconciliations.

Changes:

- Updated labelHandler.Update() to skip reconciliation for Secret objects when only metadata changes (ResourceVersion, annotations, etc.)
- Added comprehensive unit tests for labelHandler covering all scenarios
- Maintains backward compatibility for non-Secret resources

This prevents unnecessary NodeAgent daemonset updates while preserving
reconciliation for actual data or label changes.
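The behavior described above can be sketched without any Kubernetes dependencies. This is a minimal illustration, not the code in the PR: `secret` and `configMap` are hypothetical stand-ins for `corev1.Secret` and other watched objects, `shouldEnqueue` is an invented helper name, and the label key is made up. The real handler also decides based on label presence before enqueuing, which is omitted here.

```go
package main

import (
	"fmt"
	"reflect"
)

// secret is a hypothetical stand-in for corev1.Secret, reduced to the
// fields this sketch compares.
type secret struct {
	ResourceVersion string
	Labels          map[string]string
	Data            map[string][]byte
}

// configMap is a hypothetical stand-in for any non-Secret object.
type configMap struct {
	ResourceVersion string
}

// shouldEnqueue reports whether an update event should trigger a
// reconcile. Only Secret-to-Secret updates are filtered.
func shouldEnqueue(oldObj, newObj any) bool {
	oldSecret, okOld := oldObj.(*secret)
	newSecret, okNew := newObj.(*secret)
	if !okOld || !okNew {
		return true // non-Secret objects keep the pre-existing behavior
	}
	unchanged := reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
		reflect.DeepEqual(oldSecret.Labels, newSecret.Labels)
	return !unchanged
}

func main() {
	base := &secret{
		ResourceVersion: "100",
		Labels:          map[string]string{"example.com/dpa": "sample"}, // hypothetical label
		Data:            map[string][]byte{"cloud": []byte("key-v1")},
	}
	// Same data and labels, only ResourceVersion differs.
	bumped := &secret{ResourceVersion: "101", Labels: base.Labels, Data: base.Data}
	// Credential content actually changed.
	rotated := &secret{
		ResourceVersion: "102",
		Labels:          base.Labels,
		Data:            map[string][]byte{"cloud": []byte("key-v2")},
	}

	fmt.Println("metadata-only update:", shouldEnqueue(base, bumped)) // false
	fmt.Println("data change:", shouldEnqueue(base, rotated))         // true
	fmt.Println("non-Secret update:",
		shouldEnqueue(&configMap{ResourceVersion: "1"}, &configMap{ResourceVersion: "2"})) // true
}
```

Returning `true` whenever either type assertion fails keeps the filter strictly opt-in for Secrets, which is one way to read "maintains backward compatibility for non-Secret resources".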

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

Why the changes were made

How to test the changes made

Continuation of #1998 on a thawed oadp-dev branch.

weshayutin

Thanks @kaovilai! Please do not cherry-pick until 1.5.3 is GA :)

kaovilai · 2025-10-21T22:08:34Z

internal/controller/dataprotectionapplication_controller.go

+			// This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.)
+			if reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
+				reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) &&
+				reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) {


@weshayutin Although... do we know if they are changing a label every 30 seconds? Are we supposed to ignore label changes?

A couple of notes from me:

I could be wrong but it's my understanding the external secrets operator is tech-preview.

I'm not the smartest guy in the world but it seems to me that ESO should check the secret but if no change is required, I think ANY metadata on the secret should remain UNCHANGED, hence why I moved the bug to ESO.

I DO like being defensive in this case, and we either need to set it up and try/test or enquire w/ the ESO team re: labels.

I don't think this is a priority for our attention atm based on my above understanding. I could be wrong though.

kaovilai · 2025-10-23T06:48:45Z

/retest

ai-retester: The end-to-end test e2e-test-aws failed because the MySQL application KOPIA test timed out. Specifically, the todolist container in pod todolist-1-vcz9l showed a ContainersNotReady condition and failed to start after an extended period. This seems to be the primary cause, even though the mysql container also triggered warnings about failed liveness probes.

kaovilai · 2025-10-23T09:01:05Z

/retest

ai-retester: The e2e test failed because the MySQL application KOPIA test timed out: the todolist container in pod todolist-1-9znqw repeatedly failed readiness checks (containers not ready, "PodInitializing"). The underlying issue may have been the liveness probe for the mysql database failing with an i/o timeout.

shubham-pampattiwar · 2025-11-13T19:19:34Z

internal/controller/dataprotectionapplication_controller.go

+			// This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.)
+			if reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
+				reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) &&
+				reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) {


Should we also check for annotations?

We are explicitly avoiding reconcile on annotation changes.

e.g., avoiding reconcile on https://github.com/external-secrets/external-secrets/blob/579135b5bb558b8b79f71bfc57f1a852968abac4/docs/provider/ibm-secrets-manager.md?plain=1#L283-L284
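To make the annotation decision concrete, here is a small sketch that reports which tracked fields differ and then deliberately ignores annotations. Everything in it is illustrative: `secret` is a hypothetical stand-in for `corev1.Secret`, `changedFields` and `needsReconcile` are invented names, and the annotation key is made up rather than taken from ESO.

```go
package main

import (
	"fmt"
	"reflect"
)

// secret is a hypothetical stand-in for corev1.Secret.
type secret struct {
	Annotations map[string]string
	Labels      map[string]string
	Data        map[string][]byte
}

// changedFields lists which of the tracked fields differ between two
// revisions of the same Secret.
func changedFields(oldS, newS secret) []string {
	var changed []string
	if !reflect.DeepEqual(oldS.Annotations, newS.Annotations) {
		changed = append(changed, "annotations")
	}
	if !reflect.DeepEqual(oldS.Labels, newS.Labels) {
		changed = append(changed, "labels")
	}
	if !reflect.DeepEqual(oldS.Data, newS.Data) {
		changed = append(changed, "data")
	}
	return changed
}

// needsReconcile ignores annotation changes on purpose: only label or
// data changes count.
func needsReconcile(changed []string) bool {
	for _, field := range changed {
		if field == "labels" || field == "data" {
			return true
		}
	}
	return false
}

func main() {
	data := map[string][]byte{"cloud": []byte("key-v1")}
	oldS := secret{
		Annotations: map[string]string{"example.com/last-synced": "t1"}, // hypothetical key
		Data:        data,
	}
	newS := secret{
		Annotations: map[string]string{"example.com/last-synced": "t2"},
		Data:        data,
	}

	changed := changedFields(oldS, newS)
	fmt.Println(changed, needsReconcile(changed)) // [annotations] false
}
```

If annotations were added to the comparison instead, a periodic annotation refresh would count as a change again and the restart loop described in the PR could return.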

shubham-pampattiwar · 2025-11-13T22:02:58Z

/lgtm

openshift-ci · 2025-11-13T22:03:00Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kaovilai, shubham-pampattiwar, weshayutin

Needs approval from an approver in each of these files:

~~OWNERS~~ [kaovilai,shubham-pampattiwar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2025-11-13T22:03:22Z

Walkthrough

The changes add a metadata-only change filter to the DataProtectionApplication controller's Secret reconciliation handler and introduce comprehensive unit tests for the labelHandler. The filter skips reconciliation when Secret objects have unchanged Data, StringData, and Labels fields.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **DataProtectionApplication Controller Enhancement** `internal/controller/dataprotectionapplication_controller.go` | Added `reflect` import and introduced conditional logic in labelHandler.Update to skip reconciliation for Secret objects when only metadata changes (deep equality check on Data, StringData, and Labels). Prevents unnecessary reconciliations from metadata-only updates like ResourceVersion or ManagedFields changes. |
| **LabelHandler Unit Tests** `internal/controller/dataprotectionapplication_controller_labelhandler_test.go` | New comprehensive test file covering labelHandler behavior across Create, Delete, Update, and Generic event paths. Tests verify correct enqueuing based on label presence and values, including edge cases with missing labels, empty values, non-Secret objects (ConfigMap), StringData changes, and ESO-related metadata updates. Includes testQueue mock implementing workqueue.TypedRateLimitingInterface. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

- The deep equality check implementation for Secret fields (Data, StringData, Labels) warrants verification to ensure it correctly identifies metadata-only changes without missing legitimate reconciliation triggers.
- Test coverage completeness across all event types and edge cases (Create, Delete, Update, Generic paths) should be validated.
- Verify that the skip logic doesn't inadvertently suppress reconciliations for meaningful Secret updates.

coderabbitai

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 44d4afc and 69a49ba.

📒 Files selected for processing (2)

internal/controller/dataprotectionapplication_controller.go (2 hunks)
internal/controller/dataprotectionapplication_controller_labelhandler_test.go (1 hunks)

🔇 Additional comments (6)

internal/controller/dataprotectionapplication_controller.go (1)

22-22: LGTM: Import needed for deep equality checks.

The reflect import is necessary for the DeepEqual comparisons in the metadata-only change filter.

internal/controller/dataprotectionapplication_controller_labelhandler_test.go (5)

36-134: LGTM: Comprehensive Create event tests.

The test coverage for Create events is thorough, validating all combinations of label presence/absence and correctly asserting enqueue behavior.

137-196: LGTM: Delete event tests cover the key scenarios.

The Delete tests appropriately validate label-based enqueuing behavior.

199-433: LGTM: Update tests comprehensively validate the metadata-only filter.

The test suite covers all critical scenarios:

- Data changes trigger reconciliation (lines 210-243)
- ESO metadata-only changes don't trigger reconciliation (lines 245-283)
- Label changes trigger reconciliation (lines 285-319)
- Non-Secret objects always trigger reconciliation (lines 354-387)

Note: The StringData test case (lines 321-352) validates the code path but doesn't reflect real Kubernetes behavior where StringData is write-only and empty when reading Secrets. This is acceptable for unit testing purposes.
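The note about StringData can be illustrated with a simplified model of a Secret write. This is a sketch under stated assumptions: `secret` is a hypothetical stand-in for `corev1.Secret`, and `persist` is an invented function that only mimics the documented behavior (StringData is merged into Data on write and is not returned on read); it is not API server code.

```go
package main

import (
	"fmt"
	"reflect"
)

// secret is a hypothetical stand-in for corev1.Secret.
type secret struct {
	Data       map[string][]byte
	StringData map[string]string
}

// persist mimics, in simplified form, what happens when a Secret is
// written and read back: StringData entries are merged into Data
// (overwriting existing keys) and StringData comes back empty.
func persist(in secret) secret {
	out := secret{Data: make(map[string][]byte, len(in.Data)+len(in.StringData))}
	for key, value := range in.Data {
		out.Data[key] = append([]byte(nil), value...) // copy to avoid aliasing
	}
	for key, value := range in.StringData {
		out.Data[key] = []byte(value)
	}
	return out // out.StringData is left nil on purpose
}

func main() {
	before := persist(secret{StringData: map[string]string{"cloud": "key-v1"}})
	after := persist(secret{StringData: map[string]string{"cloud": "key-v2"}})

	// An event handler only ever sees the read-back form of both objects.
	fmt.Println("StringData equal:", reflect.DeepEqual(before.StringData, after.StringData)) // true
	fmt.Println("Data equal:", reflect.DeepEqual(before.Data, after.Data))                   // false
}
```

In this model the StringData comparison never distinguishes the two revisions, while the Data comparison does, which is why a unit test can exercise the StringData branch even though a live cluster would not.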

436-494: LGTM: Generic event tests complete the coverage.

The Generic event handler tests follow the same pattern as Create/Delete and appropriately validate behavior.

496-524: LGTM: Simple and effective mock queue.

The testQueue mock provides a minimal implementation of workqueue.TypedRateLimitingInterface that's sufficient for testing the labelHandler. The interface compliance is verified at line 524.
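The mock-queue pattern and the compile-time interface check mentioned here look roughly like the following. This is only a sketch: `enqueuer` is a deliberately narrowed, hypothetical interface (the real `workqueue.TypedRateLimitingInterface` has more methods), `request` stands in for `reconcile.Request`, and the namespace and name values are invented.

```go
package main

import "fmt"

// request is a hypothetical stand-in for reconcile.Request.
type request struct {
	Namespace string
	Name      string
}

// enqueuer is a narrowed, hypothetical interface used only in this
// sketch; the real workqueue interface has more methods.
type enqueuer interface {
	Add(item request)
	Len() int
}

// testQueue records every item added so a test can assert on it.
type testQueue struct {
	items []request
}

func (q *testQueue) Add(item request) { q.items = append(q.items, item) }
func (q *testQueue) Len() int         { return len(q.items) }

// Compile-time check that *testQueue satisfies enqueuer. The build
// fails here if a method is missing or has the wrong signature.
var _ enqueuer = (*testQueue)(nil)

func main() {
	q := &testQueue{}
	q.Add(request{Namespace: "example-ns", Name: "example-dpa"})
	fmt.Println(q.Len(), q.items[0].Name) // 1 example-dpa
}
```

A test then asserts on `q.Len()` (zero for a skipped update, one for an enqueued one) instead of inspecting a real rate-limited queue.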

coderabbitai · 2025-11-13T22:08:19Z

internal/controller/dataprotectionapplication_controller.go

+
+	// For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO)
+	// This prevents unnecessary reconciliations when external-secrets-operator updates metadata
+	if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok {
+		if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok {
+			// Skip reconciliation if data, stringData, and labels haven't changed
+			// This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.)
+			if reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
+				reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) &&
+				reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) {
+				return
+			}
+		}
+	}


⚠️ Potential issue | 🟠 Major

Add Secret.Type to the comparison; StringData check is ineffective in practice.

The metadata-only filter correctly prevents ESO-triggered reconciliations, but:

1. Secret.Type is not checked: If the Secret type changes (e.g., from Opaque to kubernetes.io/tls), this should trigger reconciliation, but currently won't.
2. StringData is always empty when reading Secrets: Kubernetes converts StringData to Data on write and returns empty StringData on read. Both oldSecret.StringData and newSecret.StringData will typically be empty in real scenarios, making this check always pass. While harmless, it doesn't provide the intended protection and could be misleading.

Apply this diff to add the Type field check:

 	// For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO)
 	// This prevents unnecessary reconciliations when external-secrets-operator updates metadata
 	if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok {
 		if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok {
 			// Skip reconciliation if data, stringData, and labels haven't changed
 			// This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.)
 			if reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
-				reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) &&
-				reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) {
+				reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) &&
+				oldSecret.Type == newSecret.Type {
 				return
 			}
 		}
 	}
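The suggested condition can be exercised in isolation. The sketch below is an assumption-laden illustration, not the merged code: `secret` and `secretType` are hypothetical stand-ins for `corev1.Secret` and `corev1.SecretType`, and `metadataOnly` is an invented helper that mirrors the proposed comparison of Data, Labels, and Type.

```go
package main

import (
	"fmt"
	"reflect"
)

// secretType is a hypothetical stand-in for corev1.SecretType.
type secretType string

const (
	opaque secretType = "Opaque"
	tls    secretType = "kubernetes.io/tls"
)

// secret is a hypothetical stand-in for corev1.Secret.
type secret struct {
	Type   secretType
	Labels map[string]string
	Data   map[string][]byte
}

// metadataOnly mirrors the suggested condition: an update is treated
// as metadata-only when Data, Labels, and Type are all unchanged.
func metadataOnly(oldS, newS *secret) bool {
	return reflect.DeepEqual(oldS.Data, newS.Data) &&
		reflect.DeepEqual(oldS.Labels, newS.Labels) &&
		oldS.Type == newS.Type
}

func main() {
	data := map[string][]byte{"cloud": []byte("key-v1")}
	original := &secret{Type: opaque, Data: data}
	sameContent := &secret{Type: opaque, Data: data}
	retyped := &secret{Type: tls, Data: data}

	fmt.Println("unchanged:", metadataOnly(original, sameContent)) // true
	fmt.Println("type changed:", metadataOnly(original, retyped))  // false
}
```

Note that `reflect.DeepEqual` treats a nil map and an empty non-nil map as different, so such a pair counts as a change here and leads to a reconcile rather than a skipped one.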

openshift-ci · 2025-11-14T01:32:00Z

@kaovilai: all tests passed!

openshift-ci bot requested review from mrnold and shubham-pampattiwar October 21, 2025 21:31

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 21, 2025

weshayutin approved these changes Oct 21, 2025

kaovilai commented Oct 21, 2025

shubham-pampattiwar reviewed Nov 13, 2025

shubham-pampattiwar approved these changes Nov 13, 2025

openshift-ci bot assigned shubham-pampattiwar Nov 13, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 13, 2025

coderabbitai bot reviewed Nov 13, 2025

openshift-merge-bot bot merged commit f513b02 into openshift:oadp-dev Nov 14, 2025
15 checks passed
