EVPN: add BFD failover e2e tests with dual-spine topology#3253
anuragthehatter wants to merge 1 commit into
Walkthrough
Adds dual-spine EVPN BFD failover e2e testing infrastructure to
Changes
EVPN BFD Dual-Spine Failover Tests
Sequence Diagram
sequenceDiagram
participant Test as e2e Test
participant Docker
participant Spine2 as spine2 FRR
participant Spine1 as spine1 FRR
participant FRRk8s as FRR-K8s Pods
participant ClusterNodes
Test->>Docker: create spine2-net + spine2 container
Test->>Spine2: start bgpd + bfdd, configure BGP+EVPN sessions
Test->>ClusterNodes: attach to spine2-net, enable IPv6 forwarding
Test->>Spine1: configureBFDOnExternalFRR (spine2 peers)
Test->>Spine2: configureBFDOnExternalFRR (spine1 peers)
Test->>FRRk8s: waitForFRRK8sNeighbor (poll until spine2 IP appears)
Test->>FRRk8s: configureBFDOnFRRK8sPods (vtysh exec BFD config)
Note over Test: Pre-failure: verify BFD up on both spines, PfxRcd>0
Test->>ClusterNodes: setK8NodeLinkDown (simulate spine2 failure)
Test->>Spine2: verifyBFDState → down within ~5s, PfxRcd=0
Test->>Spine1: verifyBFDState → remains up
Note over Test: Verify post-failover connectivity and inter-VPN isolation
Test->>ClusterNodes: setK8NodeLinkUp (restore spine2)
Test->>Spine2: verifyBFDState → up (BFD recovered)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: anuragthehatter
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Cherry-picks EVPN BFD failover test from upstream PR ovn-kubernetes/ovn-kubernetes#6152.

Adds 6 Serial/Informing e2e tests covering BFD-assisted failover with dual-spine topology across MAC-VRF, MAC-VRF+IP-VRF, and IP-VRF network types for both IPv4 and IPv6.

Also fixes IPv6 IP-VRF connectivity by enabling IPv6 forwarding on the external FRR container before VRF creation. Without this, the kernel drops IPv6 packets transiting the VRF (VXLAN/SVI to eth1).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
190e6b3 to 5a4ffbb
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/evpn.go`:
- Around line 1847-1864: The verifyBFDState function's check for expectUp=false
is too lenient and treats missing peers the same as down peers. Modify the logic
to explicitly detect whether a BFD peer is present in the output. When expectUp
is false, verify that the peer exists in the response AND is explicitly in a
down state (check for "Status: down" or similar string), rather than just
confirming that "Status: up" is absent. Add an error return in verifyBFDState if
the peer cannot be found when expecting it to be down, to distinguish between a
legitimate down state and a missing peer configuration.
- Around line 1507-1518: Replace the hardcoded "kind" network name in the
GetNetwork call with the infra provider's method for obtaining the primary
network. Examine how getExternalFRRIP retrieves the network to understand the
correct approach that maintains the provider abstraction, then apply that same
pattern to the spine1 network retrieval to ensure the code works regardless of
the actual primary network name.
- Around line 1778-1796: The configureBFDOnFRRK8sPods function configures BFD
peers in the bfd block but does not bind BFD to the BGP neighbors, which is
required for FRRouting to trigger BGP session shutdown on BFD failures. After
the current vtyshArgs loop that configures BFD peers, append additional vtysh
commands to enter the router bgp context and add a neighbor binding command
(neighbor <spineIP> bfd) for each spine IP, then exit the router context. This
ensures both steps of FRR configuration are completed: the bfd peer definition
and the neighbor binding that activates BFD for BGP failover detection.
In `@test/e2e/route_advertisements.go`:
- Around line 2980-2994: The code only checks the first spine2 IP address using
dualSpine.spine2IPs[0], which in dual-stack runs can miss validating the other
IP family (IPv4 or IPv6). Filter dualSpine.spine2IPs by the active IP family and
iterate through all matching neighbors when calling getNeighborPfxRcd, ensuring
both IPv4 and IPv6 spine2 neighbors are validated for PfxRcd > 0. Apply this
same filtering pattern to all locations where spine2 neighbor validation occurs
to ensure consistent dual-stack coverage.
- Around line 3047-3052: The current error handling in the gomega.Eventually
block masks all errors from getNeighborPfxRcd as successful route withdrawals by
returning nil for any error. This incorrectly treats actual exec/parser/FRR
failures as acceptable outcomes. Create a sentinel error or custom typed error
in the getNeighborPfxRcd function to distinguish when the neighbor is genuinely
missing from the summary, then update the error handling logic in the Eventually
block to only return nil when this specific sentinel/typed error is returned,
allowing other error types to propagate and fail the test appropriately.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Enterprise
Run ID: 5c8874c4-dc32-4636-881f-4ad04507a32c
⛔ Files ignored due to path filters (1)
openshift/test/generated/zz_generated.annotations.go is excluded by !**/generated/**
📒 Files selected for processing (3)
openshift/test/tests.go
test/e2e/evpn.go
test/e2e/route_advertisements.go
// Get spine1 IPs (existing FRR on kind network) for all supported families
kindNetwork, err := infraprovider.Get().GetNetwork("kind")
if err != nil {
	return nil, fmt.Errorf("failed to get kind network: %w", err)
}
spine1NetInf, err := infraprovider.Get().GetExternalContainerNetworkInterface(
	infraapi.ExternalContainer{Name: externalFRRContainerName}, kindNetwork)
if err != nil {
	return nil, fmt.Errorf("failed to get spine1 network interface: %w", err)
}
info.spine1IPs = matchIPStringsByIPFamilySet([]string{spine1NetInf.IPv4, spine1NetInf.IPv6}, ipFamilySet)
Use the infra provider’s primary network instead of hardcoding kind.
Line 1508 bypasses the provider abstraction already used by getExternalFRRIP, so this setup fails if the primary network is not literally named kind.
Proposed fix
- kindNetwork, err := infraprovider.Get().GetNetwork("kind")
+ kindNetwork, err := infraprovider.Get().PrimaryNetwork()
if err != nil {
- return nil, fmt.Errorf("failed to get kind network: %w", err)
+ return nil, fmt.Errorf("failed to get primary network: %w", err)
}
func configureBFDOnFRRK8sPods(cs clientset.Interface, spineIPs []string) error {
	namespace := deploymentconfig.Get().FRRK8sNamespace()
	pods, err := cs.CoreV1().Pods(namespace).List(context.Background(), metav1.ListOptions{
		LabelSelector: frrK8sDaemonLabel,
	})
	if err != nil {
		return fmt.Errorf("failed to list FRR-K8s pods: %w", err)
	}
	if len(pods.Items) == 0 {
		return fmt.Errorf("no FRR-K8s pods matched %q in namespace %q", frrK8sDaemonLabel, namespace)
	}

	var vtyshArgs []string
	vtyshArgs = append(vtyshArgs, "configure terminal", "bfd")
	for _, ip := range spineIPs {
		vtyshArgs = append(vtyshArgs, fmt.Sprintf("peer %s", ip), "no shutdown", "exit")
	}
	vtyshArgs = append(vtyshArgs, "exit", "end")
	cmd := vtyshCommand(vtyshArgs...)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Search for BFD-to-BGP neighbor binding in FRR-K8s-related config paths.
# Expect: either FRRConfiguration bfdProfile usage for these neighbors, or vtysh config
# that enters router bgp and applies `neighbor <spineIP> bfd`.
rg -nP -C3 'bfdProfile|neighbor\s+[^"\n]+\s+bfd|configureBFDOnFRRK8sPods|createFRRConfiguration' \
  --glob '*.go' --glob '*.yaml' --glob '*.yml' --glob '*.sh'
Repository: openshift/ovn-kubernetes
Length of output: 6539
🌐 Web query:
FRRouting BGP neighbor bfd command requirement for binding a BFD peer to a BGP session
💡 Result:
In FRRouting (FRR), binding a BFD (Bidirectional Forwarding Detection) session to a BGP neighbor is a two-step process requiring configuration in both the BFD daemon (bfdd) and the BGP daemon (bgpd) [1][2].
1. Configure the BFD peer: You must first define the corresponding peer within the bfd configuration node [1][3]. This ensures the BFD daemon monitors the specified target [1][4].
    bfd
     peer 192.168.0.1
      no shutdown
     !
    !
2. Enable BFD for the BGP neighbor: Once the BFD peer is configured, you must enable BFD monitoring for the specific BGP neighbor within the BGP router configuration [1][3].
    router bgp 65530
     neighbor 192.168.0.1 remote-as 65531
     neighbor 192.168.0.1 bfd
    !
When the neighbor <A.B.C.D|X:X::X:X|WORD> bfd command is configured, BGP listens for BFD events on the same target [1][5]. If the BFD session goes down, it triggers an immediate shutdown of the BGP session; when the BFD session recovers, it notifies BGP to re-establish the connection [1][2].
Additional options include:
- Strict Mode: Use neighbor <peer> bfd strict to prevent BGP from establishing a session until the BFD session is confirmed to be up [1][4].
- Hold-time: You can add a hold-time to the strict mode command to specify a delay before BGP is notified of a BFD down event, which can help prevent BGP session flapping [4].
- Control Plane Failure: The neighbor <peer> bfd check-control-plane-failure command allows BGP to read BFD C-BIT information, which is useful when coordinating with Graceful Restart [1][3].
Citations:
- 1: https://docs.frrouting.org/en/latest/bfd.html
- 2: https://docs.frrouting.org/en/stable-10.0/bfd.html
- 3: https://docs.frrouting.org/en/stable-8.4/bfd.html
- 4: https://github.com/FRRouting/frr/blob/master/doc/user/bfd.rst
- 5: https://github.com/FRRouting/frr/blob/stable/7.3/doc/user/bfd.rst
🏁 Script executed:
cat -n test/e2e/evpn.go | sed -n '1675,1810p'
Repository: openshift/ovn-kubernetes
Length of output: 6406
🏁 Script executed:
sed -n '1570,1610p' test/e2e/evpn.go
Repository: openshift/ovn-kubernetes
Length of output: 1444
Add neighbor <spineIP> bfd BGP neighbor binding in FRR-K8s pods.
The function injects BFD peer entries into the bfd block but does not bind BFD to the BGP neighbors in the FRR-K8s pods. FRRouting requires both steps: (1) a bfd peer configuration in the bfd block, and (2) a neighbor <ip> bfd command under router bgp to actually trigger BGP session shutdown when BFD detects a session failure. Without the neighbor binding on the FRR-K8s side, only the external spines can trigger BFD-assisted withdrawal when the link fails—the FRR-K8s pods will remain unaware of the BFD state change. To properly test BFD-assisted failover, add the neighbor binding via vtysh injection in the same function (e.g., within router bgp, add neighbor <spineIP> bfd for each spine IP).
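The two-step configuration described in this comment can be sketched as a pure helper that builds the vtysh command list. This is a minimal illustration, not the PR's code: buildBFDVtyshArgs and its localASN parameter are hypothetical names, the ASN and IPs in main are placeholders, and the sketch assumes the BGP instance in the FRR-K8s pods lives in the default VRF.

```go
package main

import "fmt"

// buildBFDVtyshArgs (hypothetical helper) returns the vtysh command sequence
// that defines a BFD peer for every spine IP and then binds BFD to the
// matching BGP neighbor. localASN is an assumption: the caller must supply
// the ASN the FRR-K8s pods already run under.
func buildBFDVtyshArgs(localASN int, spineIPs []string) []string {
	args := []string{"configure terminal", "bfd"}
	for _, ip := range spineIPs {
		// Step 1: peer definition inside the bfd node.
		args = append(args, fmt.Sprintf("peer %s", ip), "no shutdown", "exit")
	}
	args = append(args, "exit") // leave the bfd node

	args = append(args, fmt.Sprintf("router bgp %d", localASN))
	for _, ip := range spineIPs {
		// Step 2: neighbor binding so bgpd reacts to BFD events.
		args = append(args, fmt.Sprintf("neighbor %s bfd", ip))
	}
	args = append(args, "exit", "end")
	return args
}

func main() {
	// Placeholder ASN and addresses, for illustration only.
	for _, line := range buildBFDVtyshArgs(64512, []string{"172.20.0.2", "fd00:20::2"}) {
		fmt.Println(line)
	}
}
```

Keeping the command construction separate from the exec call makes the sequence easy to unit test without a running FRR instance.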
func verifyBFDState(containerName string, peerIPs []string, expectUp bool) error {
	frr := infraapi.ExternalContainer{Name: containerName}
	for _, ip := range peerIPs {
		res, err := infraprovider.Get().ExecExternalContainerCommand(frr,
			vtyshCommand(fmt.Sprintf("show bfd peer %s", ip)))
		if err != nil {
			return fmt.Errorf("failed to check BFD peer %s on %s: %w", ip, containerName, err)
		}
		isUp := strings.Contains(res, "Status: up")
		if expectUp && !isUp {
			return fmt.Errorf("BFD peer %s on %s: expected up, got down", ip, containerName)
		}
		if !expectUp && isUp {
			return fmt.Errorf("BFD peer %s on %s: expected down, got up", ip, containerName)
		}
	}
	return nil
}
Don’t treat a missing BFD peer as “down”.
For expectUp=false, the current check passes on any output that lacks Status: up, including “peer not found”. That can hide BFD config loss instead of proving the peer transitioned down.
Proposed fix
- isUp := strings.Contains(res, "Status: up")
+ normalized := strings.ToLower(res)
+ if strings.Contains(normalized, "not found") || strings.Contains(normalized, "no bfd") {
+ return fmt.Errorf("BFD peer %s on %s not found", ip, containerName)
+ }
+ isUp := strings.Contains(normalized, "status: up")
+ isDown := strings.Contains(normalized, "status: down")
if expectUp && !isUp {
return fmt.Errorf("BFD peer %s on %s: expected up, got down", ip, containerName)
}
- if !expectUp && isUp {
- return fmt.Errorf("BFD peer %s on %s: expected down, got up", ip, containerName)
+ if !expectUp && !isDown {
+ return fmt.Errorf("BFD peer %s on %s: expected down, got output: %s", ip, containerName, res)
}
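One way to make the "missing is not down" distinction explicit is a three-state classification of the command output. The sketch below is an illustration under assumptions: classifyBFDOutput and checkBFDState are hypothetical names, and the only output format it relies on is the "Status: up" / "Status: down" text already used in this review. Anything else, including an empty or "peer not found" response, is treated as unknown rather than guessed at.

```go
package main

import (
	"fmt"
	"strings"
)

// bfdState is a hypothetical three-valued result for one BFD peer.
type bfdState int

const (
	bfdUnknown bfdState = iota // peer missing, or neither up nor down reported
	bfdUp
	bfdDown
)

// classifyBFDOutput inspects vtysh output for an explicit status line.
func classifyBFDOutput(output string) bfdState {
	normalized := strings.ToLower(output)
	switch {
	case strings.Contains(normalized, "status: up"):
		return bfdUp
	case strings.Contains(normalized, "status: down"):
		return bfdDown
	default:
		return bfdUnknown
	}
}

// checkBFDState fails unless the output explicitly reports the expected state.
func checkBFDState(output string, expectUp bool) error {
	switch classifyBFDOutput(output) {
	case bfdUp:
		if !expectUp {
			return fmt.Errorf("expected down, got up")
		}
	case bfdDown:
		if expectUp {
			return fmt.Errorf("expected up, got down")
		}
	default:
		return fmt.Errorf("no up/down status in output: %q", output)
	}
	return nil
}

func main() {
	fmt.Println(checkBFDState("Status: down", false))
	fmt.Println(checkBFDState("", false))
}
```

With this shape, an expectUp=false check can only pass when the peer is present and explicitly reported down.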
ginkgo.By("Verifying spine2 is providing EVPN routes (PfxRcd > 0)")
spine2NeighborIP := dualSpine.spine2IPs[0]
for _, pod := range frrk8sPods.Items {
	gomega.Eventually(func() error {
		pfxRcd, err := getNeighborPfxRcd(frrk8sNamespace, pod.Name, frrK8sContainerName, spine2NeighborIP)
		if err != nil {
			return err
		}
		if pfxRcd == 0 {
			return fmt.Errorf("spine2 neighbor %s has PfxRcd=0 on pod %s", spine2NeighborIP, pod.Name)
		}
		return nil
	}).WithTimeout(30*time.Second).WithPolling(2*time.Second).Should(gomega.Succeed(),
		fmt.Sprintf("spine2 (%s) should have PfxRcd > 0 on pod %s before failure injection", spine2NeighborIP, pod.Name))
}
Check the spine2 neighbor for the active IP family.
dualSpine.spine2IPs[0] only validates the first spine2 BGP neighbor, so dual-stack runs can pass the IPv6 entry while only checking the IPv4 neighbor. Filter dualSpine.spine2IPs by family and assert PfxRcd/withdrawal for each matching neighbor.
Proposed adjustment
- spine2NeighborIP := dualSpine.spine2IPs[0]
+ var spine2NeighborIPs []string
+ for _, spine2IP := range dualSpine.spine2IPs {
+ if (family == utilnet.IPv6) == utilnet.IsIPv6String(spine2IP) {
+ spine2NeighborIPs = append(spine2NeighborIPs, spine2IP)
+ }
+ }
+ gomega.Expect(spine2NeighborIPs).NotTo(gomega.BeEmpty(),
+ "spine2 should have a neighbor IP for %v", family)
for _, pod := range frrk8sPods.Items {
- gomega.Eventually(func() error {
- pfxRcd, err := getNeighborPfxRcd(frrk8sNamespace, pod.Name, frrK8sContainerName, spine2NeighborIP)
- if err != nil {
- return err
- }
- if pfxRcd == 0 {
- return fmt.Errorf("spine2 neighbor %s has PfxRcd=0 on pod %s", spine2NeighborIP, pod.Name)
- }
- return nil
- }).WithTimeout(30*time.Second).WithPolling(2*time.Second).Should(gomega.Succeed(),
- fmt.Sprintf("spine2 (%s) should have PfxRcd > 0 on pod %s before failure injection", spine2NeighborIP, pod.Name))
+ for _, spine2NeighborIP := range spine2NeighborIPs {
+ gomega.Eventually(func() error {
+ pfxRcd, err := getNeighborPfxRcd(frrk8sNamespace, pod.Name, frrK8sContainerName, spine2NeighborIP)
+ if err != nil {
+ return err
+ }
+ if pfxRcd == 0 {
+ return fmt.Errorf("spine2 neighbor %s has PfxRcd=0 on pod %s", spine2NeighborIP, pod.Name)
+ }
+ return nil
+ }).WithTimeout(30*time.Second).WithPolling(2*time.Second).Should(gomega.Succeed(),
+ fmt.Sprintf("spine2 (%s) should have PfxRcd > 0 on pod %s before failure injection", spine2NeighborIP, pod.Name))
+ }
}
Apply the same for _, spine2NeighborIP := range spine2NeighborIPs loop around the withdrawal assertion at lines 3046-3059.
Also applies to: 3045-3059
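The family filter in the proposed adjustment can also be written with only the standard library. The following is a self-contained sketch, not the PR's code: filterIPsByFamily is a hypothetical helper, it uses net/netip instead of the utilnet helpers in the diff, and the addresses in main are placeholders.

```go
package main

import (
	"fmt"
	"net/netip"
)

// filterIPsByFamily (hypothetical helper) keeps only the addresses that
// belong to the requested IP family. Unparsable entries are skipped.
func filterIPsByFamily(ips []string, wantIPv6 bool) []string {
	var matched []string
	for _, s := range ips {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			continue
		}
		// Treat IPv4-mapped IPv6 addresses as IPv4.
		isV6 := addr.Is6() && !addr.Is4In6()
		if isV6 == wantIPv6 {
			matched = append(matched, s)
		}
	}
	return matched
}

func main() {
	spine2IPs := []string{"172.20.0.2", "fd00:20::2"} // placeholder addresses
	fmt.Println(filterIPsByFamily(spine2IPs, false))
	fmt.Println(filterIPsByFamily(spine2IPs, true))
}
```

A caller would then assert that the filtered slice is non-empty before looping, so a dual-stack run cannot pass silently with no neighbor checked for its family.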
gomega.Eventually(func() error {
	pfxRcd, err := getNeighborPfxRcd(frrk8sNamespace, pod.Name, frrK8sContainerName, spine2NeighborIP)
	if err != nil {
		// Neighbor gone from summary entirely is also acceptable
		return nil
	}
Do not treat every PfxRcd read error as route withdrawal.
This turns exec/parser/FRR failures into a passing withdrawal check. If “neighbor missing” is acceptable, make getNeighborPfxRcd return a distinguishable sentinel/typed error and only ignore that case.
Proposed shape
pfxRcd, err := getNeighborPfxRcd(frrk8sNamespace, pod.Name, frrK8sContainerName, spine2NeighborIP)
if err != nil {
- // Neighbor gone from summary entirely is also acceptable
- return nil
+ if errors.Is(err, errNeighborNotFound) {
+ // Neighbor gone from summary entirely is also acceptable.
+ return nil
+ }
+ return fmt.Errorf("failed to read PfxRcd for spine2 neighbor %s on pod %s: %w",
+ spine2NeighborIP, pod.Name, err)
}
This needs a matching sentinel from getNeighborPfxRcd.
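A minimal sketch of the sentinel pattern follows. It is illustrative only: lookupPfxRcd stands in for getNeighborPfxRcd and reads from an in-memory map instead of parsing FRR output, and routesWithdrawn and errExecFailed are hypothetical names. Only errNeighborNotFound is taken from the proposal above.

```go
package main

import (
	"errors"
	"fmt"
)

// errNeighborNotFound signals that the neighbor is absent from the summary.
var errNeighborNotFound = errors.New("neighbor not found in BGP summary")

// errExecFailed is a hypothetical error used only to demonstrate propagation.
var errExecFailed = errors.New("exec failed")

// lookupPfxRcd is a stand-in for getNeighborPfxRcd: summary maps a neighbor
// IP to its PfxRcd count. A missing key wraps the sentinel with %w so that
// callers can detect it with errors.Is.
func lookupPfxRcd(summary map[string]int, neighborIP string) (int, error) {
	pfxRcd, ok := summary[neighborIP]
	if !ok {
		return 0, fmt.Errorf("%w: %s", errNeighborNotFound, neighborIP)
	}
	return pfxRcd, nil
}

// routesWithdrawn returns nil only when routes are verifiably gone: either
// PfxRcd is 0 or the neighbor has left the summary. Any other error fails.
func routesWithdrawn(pfxRcd int, err error) error {
	switch {
	case errors.Is(err, errNeighborNotFound):
		return nil
	case err != nil:
		return fmt.Errorf("failed to read PfxRcd: %w", err)
	case pfxRcd != 0:
		return fmt.Errorf("PfxRcd=%d, routes not yet withdrawn", pfxRcd)
	}
	return nil
}

func main() {
	summary := map[string]int{"172.20.0.2": 4} // placeholder data
	fmt.Println(routesWithdrawn(lookupPfxRcd(summary, "172.20.0.2")))
	fmt.Println(routesWithdrawn(lookupPfxRcd(summary, "172.20.0.9")))
	fmt.Println(routesWithdrawn(0, errExecFailed))
}
```

Wrapping with %w keeps the neighbor IP in the message while still letting errors.Is match the sentinel.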
/test e2e-metal-ipi-ovn-dualstack-bgp-local-gw
@anuragthehatter: The following tests failed, say
Summary
Files changed
- test/e2e/evpn.go: 9 new functions + IPv6 forwarding fix in setupIPVRFOnExternalFRR
- test/e2e/route_advertisements.go: BFD failover DescribeTableSubtree with 3 network types × 2 IP families = 6 test cases
- openshift/test/generated/zz_generated.annotations.go: 6 new Serial test annotations
- openshift/test/tests.go: 6 new InformingTests entries
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit