[DNM] Managed vtep mode test by tssurya · Pull Request #3246 · openshift/ovn-kubernetes

tssurya · 2026-06-11T09:00:11Z

just testing ovn-kubernetes/ovn-kubernetes#6353 before we merge it u/s

Summary by CodeRabbit

Release Notes

New Features
- Managed VTEP mode now fully supported with automatic per-node IP allocation from configured CIDRs.
- Added runtime mode-switching capability between Managed and Unmanaged VTEP modes with automatic cleanup and state preservation.
- Enhanced EVPN integration with Route Advertisements support for managed VTEPs.
Documentation
- Added comprehensive guide for managed VTEP mode configuration, CIDR constraints, and mode transition procedures.

Relax the validation in newSubnetAllocatorRange to accept hostSubnetLen equal to the address length (32 for IPv4, 128 for IPv6). Previously, hostSubnetLen >= addrLen was rejected with "host capacity cannot be zero", preventing single-IP allocations. This is needed for managed-mode VTEP IP allocation in the cluster manager. The VTEP controller needs to allocate individual IPs (one per node) from VTEP CIDRs. The node.SubnetAllocator is the best fit among the available CM-side allocators because it: - Supports multiple ranges per IP family with natural overflow (when the first CIDR is full, allocation falls through to the next) - Is IP-family-aware with separate v4/v6 range tracking - Has built-in owner tracking (MarkAllocatedNetworks for restore, ReleaseAllNetworks for cleanup) Other allocators were considered but don't fit: - subnet.Allocator (pod IPAM): allocates one IP per CIDR entry, not one per family; no overflow across same-family CIDRs - ip.Range (bitmap): too low-level; single CIDR, no owner tracking, no family awareness - HybridConnectSubnetAllocator: specialized for L2/L3 layered network-connect allocation, not reusable With hostBits=0, the generateSubnet math produces correct /32 or /128 entries: the subnet number is shifted by 0 bits, and each allocation maps to exactly one IP address. The IPv4 alignment optimization is naturally skipped (hostBits%8 == 0). Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Add a thin wrapper around node.SubnetAllocator that uses /32 (IPv4) and /128 (IPv6) host subnet lengths to allocate individual VTEP IPs for managed-mode VTEPs. The wrapper provides VTEP-specific convenience methods: - allocateForNode: allocates the next available IP per family - markAllocatedForNode: reserves specific IPs (for restore from node annotations during initial sync) - releaseNode: frees all allocations for a node - addCIDR: dynamically adds a new range (for CIDR expansion) The underlying SubnetAllocator handles multi-range overflow (first CIDR exhausted before the next is used), IP-family separation (IPv4 and IPv6 ranges overflow independently), and owner-based tracking (idempotent allocation, conflict detection on mark). Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add SetNodeVTEPEntry and RemoveNodeVTEPEntry to vtep_node_annotations.go. Both perform a read-modify-write of the shared k8s.ovn.org/vteps annotation with RetryOnConflict, accepting getNode/updateNode function parameters so they can be used by both the cluster-manager (via lister + kube.UpdateNodeStatus) and ovnkube-node (via direct API client) without coupling the util package to either. Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Replace the handleManagedModeNotSupported stub with actual managed-mode logic. For managed VTEPs, the cluster-manager now allocates VTEP IPs from the configured CIDRs and writes them to the k8s.ovn.org/vteps node annotation. The unmanaged path (node discovers IPs and writes the annotation) is unchanged. Key design decisions: Annotation write strategy: The k8s.ovn.org/vteps annotation is a single JSON blob shared between the cluster-manager (managed VTEPs) and ovnkube-node (unmanaged VTEPs). Both sides do read-modify-write on the same annotation key. To avoid silent data loss from concurrent writes, the CM uses UpdateNodeStatus (which carries resourceVersion) wrapped in RetryOnConflict, following the same pattern as NodeAllocator.updateNodeNetworkAnnotationsWithRetry for the k8s.ovn.org/node-subnets annotation. On conflict the CM re-reads from the informer cache; DefaultBackoff's exponential delays (10ms, 20ms, 40ms, …) give the informer time to sync. Note: on retry we re-read from the informer cache (nodeLister), not the API server directly. If the informer hasn't caught up after a conflict, the retry will send the same stale resourceVersion and get another 409. This is acceptable because DefaultBackoff provides 5 retries with exponential delays, giving the informer enough time to process the watch event. This is the same trade-off the NodeAllocator makes and has been reliable in production. This differs from the node side, which currently uses SetAnnotationsOnNodeWithFieldManager (strategic merge patch, never gets 409). In theory the node's patch can overwrite the CM's entry if it races, but the next VTEP reconcile will detect the stale annotation and re-write it, so the system self-heals. TODO: The node-side EVPN controller should also move to UpdateNodeStatus + RetryOnConflict for full bidirectional conflict safety. This requires reworking the node controller's in-memory annotation cache (c.vtepsAnnotation), the defer-based rollback in setVTEPAnnotation, and the IsLastUpdatedByManager field-manager filtering. This is left as a follow-up since it touches a separate controller with its own test suite. A per-VTEP annotation key (e.g. k8s.ovn.org/vtep-<name>) was considered to avoid the shared-key race entirely, but rejected because VTEP names can be up to 253 characters, exceeding the 63-character annotation key name segment limit. Node creates: nodeNeedsUpdate now accepts creates (oldObj==nil) so that managed-mode VTEPs allocate IPs for newly joined nodes. Without this a node joining a running cluster would not get a VTEP IP until an unrelated event triggered a VTEP reconcile. For unmanaged mode the extra reconcile is harmless since validateNodeVTEPIPs will report AllocationFailed until the node writes the annotation. Self-triggered reconcile: Writing the annotation triggers an informer event that re-queues all VTEPs via reconcileNode. The second reconcile is a no-op because allocateAndAnnotateNode checks whether the annotation already matches before entering the retry loop. This avoids infinite reconcile loops without needing field-manager filtering. Changes: - Remove reasonManagedModeNotSupported from status.go - Add kube.Interface and allocators map (mutex-protected) to Controller - Add handleManagedMode, getOrCreateAllocator, allocateAndAnnotateNode - Add cleanupManagedVTEPAnnotations, removeVTEPAnnotationEntry for deletion cleanup - Deduplicate managed/unmanaged error handling in reconcileVTEP - Update nodeNeedsUpdate to accept node creates - Rewrite "Managed mode" tests to exercise actual allocation and annotation writing Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add syncManagedAllocators() which runs once at controller startup via StartWithInitialSync. It lists all managed VTEPs and nodes, then marks previously allocated IPs (read from node annotations) in the allocator so they are not reassigned after a cluster-manager restart. Parse errors and missing annotations emit warnings but do not block startup, since a stuck cluster-manager would also block nodes. Critical errors (invalid allocator CIDRs, unparseable IPs, markAllocated failures) are collected and returned to prevent starting with inconsistent allocator state. The ordering guarantee of StartWithInitialSync ensures: 1. Event handlers are registered (queuing events during sync) 2. Allocator state is restored from existing annotations 3. Workers begin processing -- new allocations skip already-reserved IPs Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add ReplaceNetworkRange(oldNetwork, newNetwork *net.IPNet, hostSubnetLen int) to the SubnetAllocator interface and BaseSubnetAllocator. It replaces an existing range with a wider supernet in-place, preserving all existing allocations by copying the old range's allocMap and used count into the new subnetAllocatorRange. The allocation cursor (next) is reset to 0; allocateNetwork skips already-taken IPs via allocMap, so this is safe. Validation ensures newNetwork is a genuine supernet of oldNetwork (wider or equal prefix AND old base IP contained in new network). Returns an error if oldNetwork is not found or newNetwork is not a supernet. This will be used by the managed VTEP controller to handle CIDR expansion in managed mode. CEL rules on VTEPSpec guarantee that existing CIDRs can only be widened (not shrunk or reordered) in managed mode, so replacing the old range with the expanded supernet is always valid. Also add replaceRange(oldCIDR, newCIDR vtepv1.CIDR) to vtepIPAllocator as a typed wrapper for use in the mutability commit. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Handle VTEP spec and cluster mutations for managed-mode VTEPs: CIDR changes (append / widen): - Track CIDRs used to build each vtepIPAllocator in a positionally-ordered slice. A set or ListAllNetworks() cannot be used because CEL validates CIDRs positionally (existing entries can only be widened in-place at the same index; new ones can only be appended). - On reconcile, getOrUpdateAllocator replaces getOrCreateAllocator: detect new CIDRs (append addCIDR) or widened CIDRs (replaceRange via the new ReplaceNetworkRange from the previous commit). Existing allocations are preserved in-place; no rebuild needed. Unmanaged -> Managed mode switch: - On first managed reconcile, mark existing node annotations in the allocator immediately (same reconcile, not deferred) so in-range IPs written by ovnkube-node during the unmanaged phase are preserved. - Stale IPs (out-of-range after a CIDR change while unmanaged) are warned and skipped; handleManagedMode overwrites them with fresh allocations within the same reconcile pass. Managed -> Unmanaged mode switch: - Delete the allocator and clean up CM-written node annotations (once, guarded by wasManaged). After cleanup the allocator is gone so the guard is false on all subsequent reconciles and the CM never touches the annotation again. - Race with node-side during the cleanup window is safe: both sides use RetryOnConflict+UpdateNodeStatus (TODO: node-side currently uses strategic merge patch; it should be migrated to the same pattern). The CM cleans up exactly once; node-side eventually wins and the system converges. Node delete: - reconcileNode detects node deletion (IsNotFound from lister) and calls releaseNodeFromAllocators, freeing the IP for reuse by future nodes. Scale note: - handleManagedMode iterates all nodes on every reconcile. At 5000 nodes x 20 VTEPs this is 100k in-memory lister lookups + small JSON unmarshals per event burst, which is acceptable. A per-VTEP pending-nodes set could reduce redundant work at larger scale if needed. Tests added under "Mutability" context: - CIDR append: exhaust /30, append second /30, verify AllocationFailed clears and node-5 gets IP from new range. - CIDR widen: exhaust /30, widen to /28, same verification. - Unmanaged->Managed preserves existing non-sequential node IPs. - Unmanaged->Managed with changed CIDRs overwrites stale IPs. - Unmanaged->Managed with two CIDRs, one dropped: in-range IPs preserved, stale IPs from dropped CIDR get fresh allocations. - Managed->Unmanaged: allocator removed, annotations cleaned up. - Node delete releases allocation, freeing IP for new node (in "Node watch" context). Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Add an end-to-end integration test to clustermanager_test.go that exercises the full managed VTEP + RouteAdvertisements user workflow: - VTEP controller allocates one IP per node from the managed CIDR and writes k8s.ovn.org/vteps annotations. - UDN controller creates a NAD for the CUDN in the selected namespace. - RA controller accepts the RouteAdvertisements CR and generates one FRRConfiguration per node containing a /32 host route for the CM-allocated VTEP IP. Also add AddRAApplyReactor and AddFRRGenerateNameReactor helpers to pkg/testing/util.go to paper over fake-client limitations: the RA controller uses server-side apply for status updates (not supported by the fake client) and generates FRRConfiguration names via GenerateName (also not supported natively). Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com> Co-authored-by: Cursor <cursoragent@cursor.com>

In managed mode the CM sequentially allocates VTEP IPs from the configured CIDRs. If those CIDRs overlap with the machine network (or any other network present on node interfaces), the allocator can hand out an IP that is already in use on another node's interface, causing two nodes to claim the same address on the same L2 segment. Add validateManagedCIDRsAgainstNodeIPs to the reconcile path. It reads k8s.ovn.org/host-cidrs from every node — covering primary, secondary, bond, and VLAN addresses — and rejects the VTEP with Accepted=False / CIDROverlap when any host CIDR overlaps with a managed VTEP CIDR. Nodes are checked individually because they can be on different subnets (multi-rack, multi-AZ). Also on the node side we expect managed VTEPs to not be added to host-cidrs annotation so that the above logic holds. Unmanaged VTEPs are skipped because the user controls IP assignment and may intentionally use node IPs as VTEP IPs. Also adds a comment to validateNodeVTEPIPs explaining why CM trusts the node-side annotation for unmanaged VTEPs without verifying CIDR membership. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Document the user-facing procedures for switching between Managed and Unmanaged VTEP modes, including required pre-switch actions and transient state expectations. Add a section explaining the managed VTEP CIDR non-overlap requirement against node host IPs. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

openshift-ci · 2026-06-11T09:00:24Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [tssurya]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-11T09:01:04Z

Walkthrough

This PR implements managed VTEP mode for OVN Kubernetes cluster manager, enabling dynamic per-node VTEP IP allocation from configured CIDRs with allocator state persistence. Changes include subnet allocator range replacement, VTEP IP allocation wrapper, node annotation helpers, comprehensive test coverage, and user documentation of the new mode and switching procedures.

Changes

Managed VTEP Mode Feature

Layer / File(s)	Summary
Subnet Allocator Range Replacement `go-controller/pkg/clustermanager/node/subnet_allocator.go`, `go-controller/pkg/clustermanager/node/subnet_allocator_test.go`	`ReplaceNetworkRange` expands existing CIDR allocations to wider supernets while preserving prior allocations (`allocMap`, `used`, `next`), with validation that the new range must contain the old base IP and have a shorter prefix. Tests verify preservation, error cases (missing range, non-supernet), and IPv6 behavior.
VTEP IP Allocator `go-controller/pkg/clustermanager/vtep/allocator.go`, `go-controller/pkg/clustermanager/vtep/allocator_test.go`	`vtepIPAllocator` wraps `node.SubnetAllocator` to allocate single-IP VTEP subnets (`/32` for IPv4, `/128` for IPv6) per node, with CIDR tracking for managed-mode range updates, state restoration from node annotations, and overflow handling. Tests cover allocation idempotency, multi-CIDR handling, and pool exhaustion.
Node Annotation Retry-Safe Updates `go-controller/pkg/util/vtep_node_annotations.go`	`SetNodeVTEPEntry` and `RemoveNodeVTEPEntry` wrap updates to the `k8s.ovn.org/vteps` node annotation in `retry.RetryOnConflict` to handle concurrent modifications, with internal `applyVTEPAnnotation` helper that marshals/unmarshals annotation state.
Test Infrastructure and Reactors `go-controller/pkg/testing/util.go`, `go-controller/pkg/clustermanager/routeadvertisements/controller_test.go`	`AddGenerateNameReactor` assigns deterministic names to objects with `GenerateName` using atomic counters; `AddRAApplyReactor` intercepts fake-client `patch` operations on RouteAdvertisements status subresource; `FakeClientWithReactor` interface abstracts reactor support across clientset types. Route Advertisements test refactored to use shared helper.
VTEP Controller Managed Mode Implementation `go-controller/pkg/clustermanager/vtep/controller.go`	Extends controller with `StartWithInitialSync` to restore allocator state from node annotations, adds `handleManagedMode` that allocates per-node IPs and updates annotations via retry-safe helpers, adds `validateManagedCIDRsAgainstNodeIPs` to reject VTEPs whose CIDRs overlap node host CIDRs, handles Managed→Unmanaged mode transitions with annotation cleanup and allocator drop, and manages allocator cleanup on node/VTEP deletion.
VTEP Controller Test Coverage `go-controller/pkg/clustermanager/vtep/controller_test.go`	Comprehensive test suite covering managed mode allocation scenarios, multi-CIDR behavior, CIDR exhaustion/overflow, late node joins, allocator state preservation across restarts, CIDR append/widen updates, mode switching with stale IP overwrites, managed-to-host-CIDR overlap validation, node deletion releasing allocations, and annotation idempotency to avoid reconcile loops.
ClusterManager Integration Test `go-controller/pkg/clustermanager/clustermanager_test.go`	End-to-end integration test creating a managed VTEP, EVPN CUDN referencing it, and RouteAdvertisements CR; verifies NAD creation, RouteAdvertisements Accepted=True, per-node VTEP IP allocation from CIDR, and generation of per-node FRRConfigurations with `/32` routes for allocated VTEP IPs.
Documentation and User Guidance `docs/features/bgp-integration/evpn.md`	Updates EVPN documentation to describe managed VTEP mode as supported, explaining IP allocation from configured CIDRs on `evlo-*` dummy devices and enforcing non-overlap with node host CIDRs. Adds "Mode Switching" section with step-by-step procedures for Unmanaged→Managed and Managed→Unmanaged transitions, including `kubectl patch` commands and transient state expectations.

🐰 A cluster blooms with IPs so bright,
Each node gets its VTEP, allocated just right!
Mode switches smooth, from manage to free,
The allocator persists—oh, what harmony! ✨

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The title '[DNM] Managed vtep mode test' uses the prefix '[DNM]' (Do Not Merge), which is misleading because the PR appears to be a substantial feature implementation, not a draft or experimental change pending on external approval.	Remove the [DNM] prefix and use a descriptive title like 'Implement managed VTEP mode with IP allocation and dynamic CIDR support' to accurately reflect the comprehensive feature addition.
Docstring Coverage	⚠️ Warning	Docstring coverage is 59.26% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ast-grep (0.43.0)

go-controller/pkg/clustermanager/clustermanager_test.go

[]

go-controller/pkg/clustermanager/vtep/controller_test.go

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

go-controller/pkg/clustermanager/vtep/controller_test.go (1)
1346-1357: 💤 Low value

Misleading test name: no CIDR expansion occurs in this test.

The test name suggests a recovery scenario where CIDRs are expanded, but no expansion happens. The node IP 200.10.0.5 is outside the VTEP CIDR 100.64.0.0/16, and the test immediately expects Accepted=True without any update.

This test actually verifies that unmanaged VTEPs accept any IP in the annotation regardless of CIDR containment. Consider renaming to something like:
-ginkgo.It("recovers from AllocationFailed when VTEP CIDRs are expanded to match node IPs", func() {
+ginkgo.It("accepts node VTEP IPs regardless of CIDR containment", func() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@go-controller/pkg/clustermanager/vtep/controller_test.go` around lines 1346 -
1357, The test name is misleading: it claims CIDR expansion but no expansion
occurs—update the ginkgo.It description string to reflect that unmanaged VTEPs
accept annotation IPs outside the configured CIDR (e.g., "accepts annotation IPs
outside CIDR for unmanaged VTEP") so the intent matches the setup using
newNodeWithVTEPAnnotation("node-expand", ...), newVTEP("vtep-expand",
vtepv1.VTEPModeUnmanaged, "100.64.0.0/16"), and the assertions against
fakeVTEP/conditionTypeAccepted and reasonAllocated; keep the test logic
unchanged but rename the case and any related comment to avoid implying CIDR
expansion.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@go-controller/pkg/clustermanager/node/subnet_allocator.go`:
- Around line 118-166: The comment in ReplaceNetworkRange incorrectly states
"The allocation cursor (next) is reset to 0" while the implementation copies the
cursor (newSnr.next = snr.next); update the comment to reflect that the
allocation cursor is preserved to avoid redundant scans, and remove the
assertion about resetting to 0 — reference ReplaceNetworkRange, newSnr.next, and
snr.next to locate the code and adjust the explanatory text accordingly.

---

Nitpick comments:
In `@go-controller/pkg/clustermanager/vtep/controller_test.go`:
- Around line 1346-1357: The test name is misleading: it claims CIDR expansion
but no expansion occurs—update the ginkgo.It description string to reflect that
unmanaged VTEPs accept annotation IPs outside the configured CIDR (e.g.,
"accepts annotation IPs outside CIDR for unmanaged VTEP") so the intent matches
the setup using newNodeWithVTEPAnnotation("node-expand", ...),
newVTEP("vtep-expand", vtepv1.VTEPModeUnmanaged, "100.64.0.0/16"), and the
assertions against fakeVTEP/conditionTypeAccepted and reasonAllocated; keep the
test logic unchanged but rename the case and any related comment to avoid
implying CIDR expansion.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Enterprise

Run ID: 71efab2e-77b0-4d96-9dea-129b8a29cdc2

📥 Commits

Reviewing files that changed from the base of the PR and between e9295c0 and 42ec35c.

📒 Files selected for processing (12)

docs/features/bgp-integration/evpn.md
go-controller/pkg/clustermanager/clustermanager_test.go
go-controller/pkg/clustermanager/node/subnet_allocator.go
go-controller/pkg/clustermanager/node/subnet_allocator_test.go
go-controller/pkg/clustermanager/routeadvertisements/controller_test.go
go-controller/pkg/clustermanager/vtep/allocator.go
go-controller/pkg/clustermanager/vtep/allocator_test.go
go-controller/pkg/clustermanager/vtep/controller.go
go-controller/pkg/clustermanager/vtep/controller_test.go
go-controller/pkg/clustermanager/vtep/status.go
go-controller/pkg/testing/util.go
go-controller/pkg/util/vtep_node_annotations.go

coderabbitai · 2026-06-11T09:29:32Z

+// ReplaceNetworkRange replaces an existing range (identified by oldNetwork)
+// with a new, wider range (newNetwork) of the same IP family. Existing
+// allocations from the old range are preserved: each entry in the old
+// range's allocMap is copied into the new range's allocMap since every
+// previously allocated IP still falls within the supernet. The allocation
+// cursor (next) is reset to 0; allocateNetwork will skip any already-taken
+// IPs via allocMap, so the reset is safe and avoids cursor arithmetic against
+// the new subnetBits.
+func (sna *BaseSubnetAllocator) ReplaceNetworkRange(oldNetwork, newNetwork *net.IPNet, hostSubnetLen int) error {
+	sna.Lock()
+	defer sna.Unlock()
+
+	newSnr, err := newSubnetAllocatorRange(newNetwork, hostSubnetLen)
+	if err != nil {
+		return fmt.Errorf("invalid new network range %s: %w", newNetwork, err)
+	}
+
+	// Verify that newNetwork is a supernet of oldNetwork: the old network's
+	// base IP must be contained in the new network, and the new prefix must
+	// be shorter (wider).
+	oldOnes, _ := oldNetwork.Mask.Size()
+	newOnes, _ := newNetwork.Mask.Size()
+	if !newNetwork.Contains(oldNetwork.IP) || newOnes > oldOnes {
+		return fmt.Errorf("new network %s is not a supernet of old network %s", newNetwork, oldNetwork)
+	}
+
+	ranges := &sna.v4ranges
+	if utilnet.IsIPv6(newNetwork.IP) {
+		ranges = &sna.v6ranges
+	}
+
+	for i, snr := range *ranges {
+		if snr.network.String() == oldNetwork.String() {
+			// Copy existing allocations into the new range.
+			for subnet, owner := range snr.allocMap {
+				newSnr.allocMap[subnet] = owner
+			}
+			newSnr.used = snr.used
+			// Preserve the allocation cursor. The enumeration order of IPs
+			// within the range is deterministic and stable across a widen
+			// (subnetBits increases but existing indices stay the same), so
+			// copying next avoids a redundant linear scan from 0.
+			newSnr.next = snr.next
+			(*ranges)[i] = newSnr
+			return nil
+		}
+	}
+	return fmt.Errorf("network range %s not found", oldNetwork)
+}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Outdated comment contradicts implementation.

The comment on lines 122-125 states "The allocation cursor (next) is reset to 0", but the implementation at line 160 actually preserves the cursor with newSnr.next = snr.next. The code is correct (preserving the cursor avoids redundant scans), but the comment is misleading.

📝 Fix the comment

// ReplaceNetworkRange replaces an existing range (identified by oldNetwork) // with a new, wider range (newNetwork) of the same IP family. Existing // allocations from the old range are preserved: each entry in the old // range's allocMap is copied into the new range's allocMap since every // previously allocated IP still falls within the supernet. The allocation -// cursor (next) is reset to 0; allocateNetwork will skip any already-taken -// IPs via allocMap, so the reset is safe and avoids cursor arithmetic against -// the new subnetBits. +// cursor (next) is preserved from the old range. The enumeration order is +// stable across widening (subnetBits increases but existing indices stay the +// same), so copying next avoids redundant linear scans from 0. func (sna *BaseSubnetAllocator) ReplaceNetworkRange(oldNetwork, newNetwork *net.IPNet, hostSubnetLen int) error {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// ReplaceNetworkRange replaces an existing range (identified by oldNetwork)

// with a new, wider range (newNetwork) of the same IP family. Existing

// allocations from the old range are preserved: each entry in the old

// range's allocMap is copied into the new range's allocMap since every

// previously allocated IP still falls within the supernet. The allocation

// cursor (next) is reset to 0; allocateNetwork will skip any already-taken

// IPs via allocMap, so the reset is safe and avoids cursor arithmetic against

// the new subnetBits.

func (sna *BaseSubnetAllocator) ReplaceNetworkRange(oldNetwork, newNetwork *net.IPNet, hostSubnetLen int) error {

sna.Lock()

defer sna.Unlock()

newSnr, err := newSubnetAllocatorRange(newNetwork, hostSubnetLen)

if err != nil {

return fmt.Errorf("invalid new network range %s: %w", newNetwork, err)

}

// Verify that newNetwork is a supernet of oldNetwork: the old network's

// base IP must be contained in the new network, and the new prefix must

// be shorter (wider).

oldOnes, _ := oldNetwork.Mask.Size()

newOnes, _ := newNetwork.Mask.Size()

if !newNetwork.Contains(oldNetwork.IP) || newOnes > oldOnes {

return fmt.Errorf("new network %s is not a supernet of old network %s", newNetwork, oldNetwork)

}

ranges := &sna.v4ranges

if utilnet.IsIPv6(newNetwork.IP) {

ranges = &sna.v6ranges

}

for i, snr := range *ranges {

if snr.network.String() == oldNetwork.String() {

// Copy existing allocations into the new range.

for subnet, owner := range snr.allocMap {

newSnr.allocMap[subnet] = owner

}

newSnr.used = snr.used

// Preserve the allocation cursor. The enumeration order of IPs

// within the range is deterministic and stable across a widen

// (subnetBits increases but existing indices stay the same), so

// copying next avoids a redundant linear scan from 0.

newSnr.next = snr.next

(*ranges)[i] = newSnr

return nil

}

}

return fmt.Errorf("network range %s not found", oldNetwork)

}

// ReplaceNetworkRange replaces an existing range (identified by oldNetwork)

// with a new, wider range (newNetwork) of the same IP family. Existing

// allocations from the old range are preserved: each entry in the old

// range's allocMap is copied into the new range's allocMap since every

// previously allocated IP still falls within the supernet. The allocation

// cursor (next) is preserved from the old range. The enumeration order is

// stable across widening (subnetBits increases but existing indices stay the

// same), so copying next avoids redundant linear scans from 0.

func (sna *BaseSubnetAllocator) ReplaceNetworkRange(oldNetwork, newNetwork *net.IPNet, hostSubnetLen int) error {

sna.Lock()

defer sna.Unlock()

newSnr, err := newSubnetAllocatorRange(newNetwork, hostSubnetLen)

if err != nil {

return fmt.Errorf("invalid new network range %s: %w", newNetwork, err)

}

// Verify that newNetwork is a supernet of oldNetwork: the old network's

// base IP must be contained in the new network, and the new prefix must

// be shorter (wider).

oldOnes, _ := oldNetwork.Mask.Size()

newOnes, _ := newNetwork.Mask.Size()

if !newNetwork.Contains(oldNetwork.IP) || newOnes > oldOnes {

return fmt.Errorf("new network %s is not a supernet of old network %s", newNetwork, oldNetwork)

}

ranges := &sna.v4ranges

if utilnet.IsIPv6(newNetwork.IP) {

ranges = &sna.v6ranges

}

for i, snr := range *ranges {

if snr.network.String() == oldNetwork.String() {

// Copy existing allocations into the new range.

for subnet, owner := range snr.allocMap {

newSnr.allocMap[subnet] = owner

}

newSnr.used = snr.used

// Preserve the allocation cursor. The enumeration order of IPs

// within the range is deterministic and stable across a widen

// (subnetBits increases but existing indices stay the same), so

// copying next avoids a redundant linear scan from 0.

newSnr.next = snr.next

(*ranges)[i] = newSnr

return nil

}

}

return fmt.Errorf("network range %s not found", oldNetwork)

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@go-controller/pkg/clustermanager/node/subnet_allocator.go` around lines 118 - 166, The comment in ReplaceNetworkRange incorrectly states "The allocation cursor (next) is reset to 0" while the implementation copies the cursor (newSnr.next = snr.next); update the comment to reflect that the allocation cursor is preserved to avoid redundant scans, and remove the assertion about resetting to 0 — reference ReplaceNetworkRange, newSnr.next, and snr.next to locate the code and adjust the explanatory text accordingly.

openshift-ci · 2026-06-11T14:15:03Z

@tssurya: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-gcp-ovn-techpreview	`42ec35c`	link	true	`/test e2e-gcp-ovn-techpreview`
ci/prow/e2e-metal-ipi-ovn-ipv6	`42ec35c`	link	true	`/test e2e-metal-ipi-ovn-ipv6`
ci/prow/security	`42ec35c`	link	false	`/test security`
ci/prow/e2e-aws-ovn-upgrade	`42ec35c`	link	true	`/test e2e-aws-ovn-upgrade`
ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration	`42ec35c`	link	true	`/test e2e-aws-ovn-shared-to-local-gateway-mode-migration`
ci/prow/e2e-aws-ovn-fdp-qe	`42ec35c`	link	true	`/test e2e-aws-ovn-fdp-qe`
ci/prow/e2e-aws-ovn-rhcos10-techpreview	`42ec35c`	link	false	`/test e2e-aws-ovn-rhcos10-techpreview`
ci/prow/unit	`42ec35c`	link	true	`/test unit`
ci/prow/e2e-aws-ovn-edge-zones	`42ec35c`	link	true	`/test e2e-aws-ovn-edge-zones`
ci/prow/lint	`42ec35c`	link	true	`/test lint`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci · 2026-06-16T08:49:52Z

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

tssurya and others added 10 commits June 11, 2026 10:49

openshift-ci Bot requested review from jcaamano and kyrtapz June 11, 2026 09:00

tssurya changed the title ~~Managed vtep mode test~~ [DNM] Managed vtep mode test Jun 11, 2026

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2026

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DNM] Managed vtep mode test#3246

[DNM] Managed vtep mode test#3246
tssurya wants to merge 10 commits into
openshift:mainfrom
tssurya:managed-vtep-mode-test

tssurya commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

openshift-ci Bot commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tssurya commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

openshift-ci Bot commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tssurya commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading