-
Notifications
You must be signed in to change notification settings - Fork 86
fix(controller): prevent NodeAgent restarts from ESO metadata updates #1999
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(controller): prevent NodeAgent restarts from ESO metadata updates #1999
Conversation
Fixes ESO-234: NodeAgent was restarting every ~30s when External Secrets Operator managed the cloud-credentials secret. ESO's metadata-only updates were triggering unnecessary DPA reconciliations. Changes: - Updated labelHandler.Update() to skip reconciliation for Secret objects when only metadata changes (ResourceVersion, annotations, etc.) - Added comprehensive unit tests for labelHandler covering all scenarios - Maintains backward compatibility for non-Secret resources This prevents unnecessary NodeAgent daemonset updates while preserving reconciliation for actual data or label changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
weshayutin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @kaovilai ! Please do not cherry pick until 1.5.3. is GA :)
| // This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.) | ||
| if reflect.DeepEqual(oldSecret.Data, newSecret.Data) && | ||
| reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) && | ||
| reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weshayutin although.. do we know if they are changing label every 30 seconds? are we supposed to ignore label changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple notes that I have would be.
- I could be wrong but it's my understanding the external secrets operator is tech-preview.
- I'm not the smartest guy in the world but it seems to me that ESO should check the secret but if no change is required, I think ANY metadata on the secret should remain UNCHANGED, hence why I moved the bug to ESO.
- I DO like being defensive in this case, and we either need to set it up and try/test or enquire w/ the ESO team re: labels.
- I don't think this is a priority for our attention atm based on my above understanding. I could be wrong though.
|
/retest ai-retester: The end-to-end test |
|
/retest ai-retester: The e2e test failed because the |
| // This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.) | ||
| if reflect.DeepEqual(oldSecret.Data, newSecret.Data) && | ||
| reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) && | ||
| reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also check for annotations ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are explicitly avoiding reconcile on annotation changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai, shubham-pampattiwar, weshayutin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
WalkthroughThe changes add a metadata-only change filter to the DataProtectionApplication controller's Secret reconciliation handler and introduce comprehensive unit tests for the labelHandler. The filter skips reconciliation when Secret objects have unchanged Data, StringData, and Labels fields. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (2)
internal/controller/dataprotectionapplication_controller.go(2 hunks)internal/controller/dataprotectionapplication_controller_labelhandler_test.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
internal/controller/dataprotectionapplication_controller.gointernal/controller/dataprotectionapplication_controller_labelhandler_test.go
🔇 Additional comments (6)
internal/controller/dataprotectionapplication_controller.go (1)
22-22: LGTM: Import needed for deep equality checks.The
reflectimport is necessary for theDeepEqualcomparisons in the metadata-only change filter.internal/controller/dataprotectionapplication_controller_labelhandler_test.go (5)
36-134: LGTM: Comprehensive Create event tests.The test coverage for
Createevents is thorough, validating all combinations of label presence/absence and correctly asserting enqueue behavior.
137-196: LGTM: Delete event tests cover the key scenarios.The Delete tests appropriately validate label-based enqueuing behavior.
199-433: LGTM: Update tests comprehensively validate the metadata-only filter.The test suite covers all critical scenarios:
- Data changes trigger reconciliation (lines 210-243)
- ESO metadata-only changes don't trigger reconciliation (lines 245-283)
- Label changes trigger reconciliation (lines 285-319)
- Non-Secret objects always trigger reconciliation (lines 354-387)
Note: The StringData test case (lines 321-352) validates the code path but doesn't reflect real Kubernetes behavior where
StringDatais write-only and empty when reading Secrets. This is acceptable for unit testing purposes.
436-494: LGTM: Generic event tests complete the coverage.The Generic event handler tests follow the same pattern as Create/Delete and appropriately validate behavior.
496-524: LGTM: Simple and effective mock queue.The
testQueuemock provides a minimal implementation ofworkqueue.TypedRateLimitingInterfacethat's sufficient for testing the labelHandler. The interface compliance is verified at line 524.
|
|
||
| // For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO) | ||
| // This prevents unnecessary reconciliations when external-secrets-operator updates metadata | ||
| if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok { | ||
| if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok { | ||
| // Skip reconciliation if data, stringData, and labels haven't changed | ||
| // This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.) | ||
| if reflect.DeepEqual(oldSecret.Data, newSecret.Data) && | ||
| reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) && | ||
| reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) { | ||
| return | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add Secret.Type to the comparison; StringData check is ineffective in practice.
The metadata-only filter correctly prevents ESO-triggered reconciliations, but:
-
Secret.Type is not checked: If the Secret type changes (e.g., from
Opaquetokubernetes.io/tls), this should trigger reconciliation, but currently won't. -
StringData is always empty when reading Secrets: Kubernetes converts
StringDatatoDataon write and returns emptyStringDataon read. BotholdSecret.StringDataandnewSecret.StringDatawill typically be empty in real scenarios, making this check always pass. While harmless, it doesn't provide the intended protection and could be misleading.
Apply this diff to add the Type field check:
// For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO)
// This prevents unnecessary reconciliations when external-secrets-operator updates metadata
if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok {
if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok {
// Skip reconciliation if data, stringData, and labels haven't changed
// This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.)
if reflect.DeepEqual(oldSecret.Data, newSecret.Data) &&
- reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) &&
- reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) {
+ reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) &&
+ oldSecret.Type == newSecret.Type {
return
}
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO) | |
| // This prevents unnecessary reconciliations when external-secrets-operator updates metadata | |
| if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok { | |
| if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok { | |
| // Skip reconciliation if data, stringData, and labels haven't changed | |
| // This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.) | |
| if reflect.DeepEqual(oldSecret.Data, newSecret.Data) && | |
| reflect.DeepEqual(oldSecret.StringData, newSecret.StringData) && | |
| reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) { | |
| return | |
| } | |
| } | |
| } | |
| // For Secrets, check if only metadata changed (e.g., ResourceVersion updates from ESO) | |
| // This prevents unnecessary reconciliations when external-secrets-operator updates metadata | |
| if oldSecret, ok := evt.ObjectOld.(*corev1.Secret); ok { | |
| if newSecret, ok := evt.ObjectNew.(*corev1.Secret); ok { | |
| // Skip reconciliation if data, stringData, and labels haven't changed | |
| // This filters out metadata-only updates (ResourceVersion, ManagedFields, etc.) | |
| if reflect.DeepEqual(oldSecret.Data, newSecret.Data) && | |
| reflect.DeepEqual(oldSecret.Labels, newSecret.Labels) && | |
| oldSecret.Type == newSecret.Type { | |
| return | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
In internal/controller/dataprotectionapplication_controller.go around lines 197
to 210, the metadata-only filter misses changes to Secret.Type and includes an
ineffective StringData comparison; update the condition to also compare
oldSecret.Type != newSecret.Type (so type changes trigger reconciliation) and
remove (or stop relying on) the StringData check because Kubernetes returns
empty StringData on reads—keep the reflect.DeepEqual checks for Data and Labels
and return only when Type, Data, and Labels are equal.
|
@kaovilai: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Fixes ESO-234: NodeAgent was restarting every ~30s when External Secrets
Operator managed the cloud-credentials secret. ESO's metadata-only updates
were triggering unnecessary DPA reconciliations.
Changes:
when only metadata changes (ResourceVersion, annotations, etc.)
This prevents unnecessary NodeAgent daemonset updates while preserving
reconciliation for actual data or label changes.
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]
Why the changes were made
How to test the changes made
Continuation of #1998 on a thawed oadp-dev branch.