NO-JIRA: Branch Sync release-4.22 to release-4.21 [06-10-2026] #3243
openshift-pr-manager[bot] wants to merge 195 commits
The test name used for DB dump artifact directories only had its spaces replaced with underscores, leaving colons, brackets, parentheses, and other shell- or path-significant characters intact. This caused artifact collection failures when the test name contained a tag such as [Feature:NetworkConnect].
Use an allowlist regex to replace any run of non-alphanumeric characters (except dot, hyphen, underscore) with a single underscore, then trim leading and trailing underscores.
Additionally, remove the redundant [Feature:NetworkConnect] tag from the CNC toggling test, since the parent Describe already uses the proper feature.NetworkConnect Ginkgo label.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Made-with: Cursor
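The allowlist approach described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `sanitizeTestName` and the variable `unsafeRun` are assumptions made for the example.

```go
package main

import (
	"regexp"
	"strings"
)

// unsafeRun matches any run of characters outside the allowlist
// (ASCII letters, digits, dot, hyphen, underscore).
var unsafeRun = regexp.MustCompile(`[^A-Za-z0-9._-]+`)

// sanitizeTestName is a hypothetical helper: it collapses every run of
// disallowed characters into a single underscore, then trims leading and
// trailing underscores so the result is safe to use as a directory name.
func sanitizeTestName(name string) string {
	return strings.Trim(unsafeRun.ReplaceAllString(name, "_"), "_")
}
```

With this sketch, a name that starts with a bracketed tag no longer yields a directory name containing `[`, `:`, or `]`, and characters already on the allowlist pass through unchanged.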
In DPU (Data Processing Unit) deployments, the OVN-Kubernetes namespace
may differ from the default "ovn-kubernetes" (e.g. when deployed via
operators that use a custom namespace). The ovnkube.sh entrypoint script
already reads OVN_KUBERNETES_NAMESPACE from the environment, but never
actually passed it through to the ovnkube binary via --ovn-config-namespace.
This caused DPU healthcheck failures because the ovnkube processes were
always looking for resources in the hardcoded "ovn-kubernetes" namespace,
while the actual resources (leases, services) lived in the
operator-configured namespace.
This change:
- Adds --ovn-config-namespace ${ovn_kubernetes_namespace} to all six
ovnkube binary invocations: ovn-master, ovnkube-controller,
ovnkube-controller-with-node, ovn-cluster-manager, ovn-node, and
cleanup-ovn-node.
- Replaces the hardcoded "ovn-kubernetes" namespace in ovnkube-identity
--extra-allowed-user service account references with the configurable
${ovn_kubernetes_namespace} variable.
Without this fix, DPU nodes fail healthchecks when deployed in a
non-default namespace because leader election leases target the wrong
namespace.
Signed-off-by: Igal Tsoiref <itsoiref@redhat.com>
Made-with: Cursor
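The change itself is in the ovnkube.sh shell script; the Go sketch below only illustrates the idea of threading one configurable namespace through both the binary arguments and the service account references. The helper names are assumptions made for this example. The `system:serviceaccount:<namespace>:<name>` username format is the standard Kubernetes one, and the service account name used in the usage note is a placeholder.

```go
package main

// defaultNamespace is the value the commit message describes as previously
// hardcoded.
const defaultNamespace = "ovn-kubernetes"

// namespaceOrDefault (hypothetical) falls back to the default namespace when
// the environment did not provide one.
func namespaceOrDefault(envValue string) string {
	if envValue == "" {
		return defaultNamespace
	}
	return envValue
}

// withConfigNamespace (hypothetical) returns a copy of args with the
// --ovn-config-namespace flag appended, so every invocation sees the same
// namespace.
func withConfigNamespace(args []string, namespace string) []string {
	out := append([]string{}, args...)
	return append(out, "--ovn-config-namespace", namespace)
}

// serviceAccountUser builds the Kubernetes username of a service account.
func serviceAccountUser(namespace, name string) string {
	return "system:serviceaccount:" + namespace + ":" + name
}
```

For example, `serviceAccountUser(namespaceOrDefault("dpu-ops"), "example-sa")` yields `system:serviceaccount:dpu-ops:example-sa` instead of a reference pinned to the default namespace.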
KubeStellar Console provides a guided install mission for OVN-Kubernetes via an open-source Kubernetes dashboard. Discussed in #6124. Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Signed-off-by: nithyar <nithyar@nvidia.com>
Warning from the CI run: Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4, actions/setup-go@v5, actions/upload-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
When a UDN controller is recreated, cleanup() deletes pod-selector address sets directly from the NB DB via cleanupPolicyLogicalEntities. However, the shared AddressSetManager retains stale references. When the network is re-created, EnsureAddressSet finds the cached entry and reuses the dead UUID, causing permanent "object not found" errors on SetAddresses. Add AddressSetManager.CleanupForController() which destroys address sets owned by the network. Signed-off-by: Patryk Diak <pdiak@redhat.com>
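The stale-cache failure mode above can be illustrated with a small sketch. It models only the caching aspect: `addressSetCache`, `Ensure`, and the string UUIDs are simplified stand-ins assumed for this example, and the real `CleanupForController()` is described as also destroying the address sets, which this sketch does not do.

```go
package main

import "sync"

// addressSetCache is a hypothetical stand-in for the shared manager's cache.
// It maps owning controller -> address set name -> NB DB UUID.
type addressSetCache struct {
	mu      sync.Mutex
	byOwner map[string]map[string]string
}

func newAddressSetCache() *addressSetCache {
	return &addressSetCache{byOwner: map[string]map[string]string{}}
}

// Ensure returns the cached UUID for (controller, name) and calls create
// only on a cache miss. A stale entry is therefore reused forever unless it
// is explicitly dropped.
func (c *addressSetCache) Ensure(controller, name string, create func() string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	sets, ok := c.byOwner[controller]
	if !ok {
		sets = map[string]string{}
		c.byOwner[controller] = sets
	}
	if uuid, ok := sets[name]; ok {
		return uuid
	}
	uuid := create()
	sets[name] = uuid
	return uuid
}

// CleanupForController forgets every entry owned by controller and returns
// how many entries were dropped, so a re-created network starts clean.
func (c *addressSetCache) CleanupForController(controller string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	n := len(c.byOwner[controller])
	delete(c.byOwner, controller)
	return n
}
```

Without the cleanup call, the second `Ensure` after a controller recreation would return the UUID of a row that no longer exists, which matches the "object not found" symptom in the commit message.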
When the node-subnets annotation transitions from absent to present (or vice versa), json.Unmarshal on the empty string fails and the function incorrectly returns false. This causes the node tracker to miss the update, leaving the node out of the service load balancer configuration. Signed-off-by: Yun Zhou <yunz@nvidia.com>
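A minimal sketch of the comparison described above, handling the absent case before any JSON parsing. The function name `subnetsChanged` and the assumed annotation shape (network name mapped to a list of CIDR strings) are illustrative assumptions, not the actual code from the commit.

```go
package main

import (
	"encoding/json"
	"reflect"
)

// subnetsChanged (hypothetical) reports whether two raw annotation values
// differ. An empty string stands for an absent annotation; json.Unmarshal
// would return an error for it, so that case is decided before parsing.
func subnetsChanged(oldRaw, newRaw string) bool {
	if oldRaw == "" || newRaw == "" {
		// absent <-> present is a change; absent on both sides is not.
		return oldRaw != newRaw
	}
	// Assumed shape for this sketch: network name -> list of CIDR strings.
	var oldVal, newVal map[string][]string
	if err := json.Unmarshal([]byte(oldRaw), &oldVal); err != nil {
		return true // treat unparsable values conservatively as changed
	}
	if err := json.Unmarshal([]byte(newRaw), &newVal); err != nil {
		return true
	}
	return !reflect.DeepEqual(oldVal, newVal)
}
```

Comparing the parsed values rather than the raw strings means that differences in whitespace alone do not count as a change.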
Fix shared AddressSetManager stale cache on UDN controller recreation
Enable address set manager to select pod IPs by nodeSelector. Signed-off-by: Xiaobin Qu <xqu@nvidia.com>
Signed-off-by: Tim Rozet <trozet@nvidia.com>
NodeControllerManager can call Gateway.Reconcile() during startup failure cleanup before gateway initialization has fully completed. Guard gateway.Reconcile() and updateSNATRules() against nil nodeIPManager/mgmtPort so shutdown does not panic on partially initialized gateways. Signed-off-by: Tim Rozet <trozet@nvidia.com>
During dual-stack conversion, the check for whether node subnets need to be allocated incorrectly returned false, because it only checked for the presence of the annotation. Change this check to make sure that subnets for the correct IP families are allocated. Signed-off-by: Tim Rozet <trozet@nvidia.com>
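The difference between checking for annotation presence and checking IP family coverage can be sketched as below. The function name and its parameters are assumptions for illustration; the actual check in the commit may be structured differently.

```go
package main

import "net/netip"

// needsSubnetAllocation (hypothetical) reports whether the already-allocated
// subnets fail to cover every IP family the cluster currently requires.
func needsSubnetAllocation(allocated []string, wantV4, wantV6 bool) bool {
	haveV4, haveV6 := false, false
	for _, s := range allocated {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			continue // this sketch simply ignores unparsable entries
		}
		if p.Addr().Is4() {
			haveV4 = true
		} else {
			haveV6 = true
		}
	}
	return (wantV4 && !haveV4) || (wantV6 && !haveV6)
}
```

A node that holds only an IPv4 subnet while the cluster is being converted to dual-stack has a non-empty allocation, yet this check still reports that an allocation is needed.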
There can be cases where we partially configure a node, but then configuration fails and the node object is not stored in the nodeCache. Storing it in the nodeCache proactively can break retries, which assume that anything in the cache has already been configured. Rather than overload the same cache with multiple meanings, create a new cache for holding the latest informer object. This is used mainly in the delete case, where the node object is gone from the informer and the node was only partially configured (never added to the node cache). Signed-off-by: Tim Rozet <trozet@nvidia.com>
When a management IP or gateway IP changed, the address set was not updated. Signed-off-by: Tim Rozet <trozet@nvidia.com>
Move check_ipv6(), set_cluster_cidr_ip_families(), and docker_create_second_interface() from kind.sh to kind-common.sh so they can be called from kind-helm.sh. Add -i6/--ipv6 and -n4/--no-ipv4 flags to kind-helm.sh and pass gateway mode, snat, and forwarding through helm --set. Hardcode unprivileged-mode=false to match kind.sh behavior. Escape dual-stack CIDRs to prevent helm from splitting on commas. Fix values-multi-node-zone.yaml missing value. Update some CI targets to use the kind-helm script; other targets are not yet supported and will be migrated later. Signed-off-by: Mykola Yurchenko <myurchenko@nvidia.com>
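The comma escaping mentioned above exists because helm's --set syntax treats an unescaped comma as a separator between assignments, and a backslash before the comma keeps it as part of the value. The scripts in this commit are shell; the Go sketch below only illustrates the escaping rule, and the helper name and the `podNetwork` key are assumptions for the example.

```go
package main

import "strings"

// helmSetValue (hypothetical) builds the key=value operand for `helm --set`,
// escaping commas so that a dual-stack CIDR list stays a single value.
func helmSetValue(key, value string) string {
	return key + "=" + strings.ReplaceAll(value, ",", `\,`)
}
```

For instance, `helmSetValue("podNetwork", "10.244.0.0/16,fd00:10:244::/48")` produces `podNetwork=10.244.0.0/16\,fd00:10:244::/48`, which can then be passed as the argument following `--set`.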
The IP subnet allocator silently accepted IPs not contained in any known subnet. Additionally, fix the admission webhook to tolerate missing default network annotations in case a UDN annotation comes first. Signed-off-by: Patryk Diak <pdiak@redhat.com>
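The containment check implied by the first sentence can be sketched as follows. This is an assumption-laden illustration: `validateIP` is a hypothetical name, and the real allocator works on its own subnet types rather than on strings.

```go
package main

import (
	"fmt"
	"net/netip"
)

// validateIP (hypothetical) returns an error unless ip falls inside at least
// one of the known subnets, instead of silently accepting it.
func validateIP(subnets []string, ip string) error {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return fmt.Errorf("invalid IP %q: %w", ip, err)
	}
	for _, s := range subnets {
		prefix, err := netip.ParsePrefix(s)
		if err != nil {
			return fmt.Errorf("invalid subnet %q: %w", s, err)
		}
		if prefix.Contains(addr) {
			return nil
		}
	}
	return fmt.Errorf("IP %s is not contained in any known subnet", addr)
}
```

Returning an error for an out-of-range IP surfaces the misconfiguration at allocation time rather than leaving an address that no subnet accounts for.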
Signed-off-by: Ayushi Chouhan <aychouha@aychouha-thinkpadp1gen4i.bengluru.csb>
github actions: update actions version due to deprecation of nodejs 20
Reduces churn for gatewayChanged function that would previously be triggered when UDNs spin up/down. Signed-off-by: Tim Rozet <trozet@nvidia.com>
The EndpointSlice mirror controller had several performance issues that became apparent at scale:
- Use queue.Add instead of queue.AddRateLimited on the enqueue path.
- Reorder syncDefaultEndpointSlice to perform the cheap resource-version check before the expensive NAD lookups. Most syncs short-circuit without ever calling GetPrimaryNADForNamespace.
- Skip creation of empty mirrored EndpointSlices to avoid unnecessary API server writes.
- Increase worker count from 1 to 5, matching the upstream kube-controller-manager default for EndpointSlice reconciliation.
Co-Authored-By: liqcui <liqcui@redhat.com>
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Move CDN to node-level driven controller
Support nodeSelector in AddressSetManager
Website UI revamp
Add optional VF resources fields for UDN in helm charts and daemonsets
Improve EndpointSlice mirror controller scale bottlenecks
The fast-slow rate limiter delays even the first AddRateLimited call. Event handlers should enqueue items immediately with Add() and let the retry logic in handleErr use AddRateLimited for backoff. Signed-off-by: Patryk Diak <pdiak@redhat.com>
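The split between immediate enqueueing and rate-limited retries can be sketched as below. To stay self-contained, the example defines its own small interface that mirrors the Add, AddRateLimited, and Forget method names of client-go's rate-limiting work queue, with string keys as a simplification; `onEvent`, `handleErr`, `recordingQueue`, and `errSync` are assumptions made for this illustration.

```go
package main

import "errors"

// rateLimitedQueue is a simplified stand-in for the subset of a
// rate-limiting work queue that this sketch relies on.
type rateLimitedQueue interface {
	Add(key string)
	AddRateLimited(key string)
	Forget(key string)
}

// errSync is a placeholder error used to exercise the retry path.
var errSync = errors.New("sync failed")

// onEvent is the event-handler path: enqueue immediately, with no delay.
func onEvent(q rateLimitedQueue, key string) {
	q.Add(key)
}

// handleErr is the retry path: back off only after a sync has failed, and
// reset the backoff state once a sync succeeds.
func handleErr(q rateLimitedQueue, key string, err error) {
	if err == nil {
		q.Forget(key)
		return
	}
	q.AddRateLimited(key)
}

// recordingQueue records calls so the ordering can be inspected.
type recordingQueue struct{ calls []string }

func (r *recordingQueue) Add(key string)            { r.calls = append(r.calls, "Add:"+key) }
func (r *recordingQueue) AddRateLimited(key string) { r.calls = append(r.calls, "AddRateLimited:"+key) }
func (r *recordingQueue) Forget(key string)         { r.calls = append(r.calls, "Forget:"+key) }
```

With this split, the first event for a key is never delayed by the limiter; only keys whose sync has failed pay the backoff cost.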
Fix subnet IP allocator to reject out-of-range IPs
When NoOverlay mode is used for a network, traffic follows a learned route with proto bgp, which sets the node IP as the source IP:
10.129.2.0/23 nhid 157 via 192.168.100.100 dev br-ex proto bgp metric 20
It is therefore essential to add the node IP to the HostNetworkNamespace address_set so that host-network pods can use network policy in NoOverlay mode. Signed-off-by: Arnab Ghosh <arnabghosh89@gmail.com>
Fix DPU healthcheck by propagating namespace configuration to ovnkube
Use a helper function to determine which mode the node is in. Also refactor to positive if-statements to make it clear which mode the subsequent code targets. Signed-off-by: William Zhao <wizhao@redhat.com>
Automated branch sync bringing 192 commits from 5.0 into release-4.22.
Merge conflict resolution:
- File: dist/templates/ovnkube-control-plane.yaml.j2
- Conflict type: modify/delete
- Upstream (5.0): File deleted in PR #6271 (helm migration removed all j2 templates)
- Downstream (release-4.22): File modified with OCP-specific HACK for POD_NAME env var
- Resolution: Accept deletion from upstream
- Rationale: CNO maintains its own manifests in bindata; this j2 template is unused in OCP
This was the only merge conflict. All other files merged cleanly with the -X theirs strategy.
Also added 'go mod tidy' changes to ./openshift/go.mod and ./openshift/go.sum.
…-2026 [release-4.22] OCPBUGS-87214: Branch Sync main to release-4.22 [06-05-2026]
…4.22-to-release-4.21-06-10-2026
/ok-to-test
@openshift-pr-manager[bot]: This pull request explicitly references no jira issue.
@openshift-pr-manager[bot]: trigger 5 job(s) of type blocking for the ci release of OCP 4.21
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/817bae10-64c4-11f1-91a4-1fc1af6fa525-0
trigger 12 job(s) of type blocking for the nightly release of OCP 4.21
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/817bae10-64c4-11f1-91a4-1fc1af6fa525-1
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: openshift-pr-manager[bot]
/test e2e-aws-ovn-hypershift
/payload-job periodic-ci-openshift-release-main-ci-4.21-upgrade-from-stable-4.20-e2e-azure-ovn-upgrade
@jluhrsen: trigger 12 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/44960e60-64e1-11f1-984d-8b0d59202c1b-0
/retest-required
/payload-job periodic-ci-openshift-release-main-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
@arkadeepsen: trigger 15 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6c1fb040-6657-11f1-8c11-01df2ec4644e-0
/test 4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
/payload-job periodic-ci-openshift-release-main-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
@arkadeepsen: trigger 14 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8c927440-666e-11f1-9d8e-f80cc5c7afe3-0
/test 4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
/payload-job periodic-ci-openshift-release-main-ci-4.21-upgrade-from-stable-4.20-e2e-azure-ovn-upgrade
@jluhrsen: trigger 15 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8af37920-6751-11f1-978c-858b048d664e-0
@openshift-pr-manager[bot]: The following tests failed, say
@arkadeepsen: This PR was included in a payload test run from openshift/origin#31300
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5ddfe460-6881-11f1-8f18-58a0ded1aabf-0
Depends on openshift/origin#31300
/payload-job periodic-ci-openshift-release-main-ci-4.21-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade
@jluhrsen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/335c6e60-69c7-11f1-9f44-129e01f68ef2-0
Automated branch sync: release-4.22 to release-4.21.