
Add netkit interface support #888

Open
msherif1234 wants to merge 3 commits into main from netkit_support

@msherif1234 msherif1234 commented Feb 16, 2026

Contributor

Description

This PR adds netkit support to the NetObserv eBPF agent so that it can monitor clusters using netkit-based container networking. Netkit devices are available starting with Linux kernel 6.7.

references:

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

Summary by CodeRabbit

  • New Features

    • Monitor netkit interfaces' flows for primary (egress) and peer (ingress) endpoints.
    • Capture packet payloads for netkit traffic across x86, ARM, PowerPC and s390 platforms.
    • Attach netkit hooks inside interface network namespaces for reliable per-interface handling.
  • Chores

    • Added netkit program support, extended eBPF bindings and updated loader/cleanup workflows for safer lifecycle management.

Review Change Stack

@openshift-ci

openshift-ci Bot commented Feb 16, 2026


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign memodi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@msherif1234 msherif1234 changed the title WIP: Add netkit interface support Add netkit interface support Mar 19, 2026
@msherif1234 msherif1234 added the needs-review label (Tells that the PR needs a review) Mar 19, 2026
@msherif1234 msherif1234 requested a review from jotak March 19, 2026 19:45
@coderabbitai

coderabbitai Bot commented Apr 6, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds netkit support: new netkit eBPF attach points and PCA entrypoints, netkit constants in kernel headers, Go program bindings for four netkit programs across architectures, and tracer/runtime changes to load, attach, and track per-interface netkit links (namespace-aware) when the kernel supports netkit.

Changes

Netkit eBPF + runtime integration

Layer / File(s) Summary
Netkit eBPF entrypoints
bpf/flows.c, bpf/pca.h
Added SEC("netkit/primary") and SEC("netkit/peer") entrypoints: netkit_primary_flow_parse calls flow_monitor(skb, EGRESS) and returns NETKIT_NEXT; netkit_peer_flow_parse calls flow_monitor(skb, INGRESS) and returns NETKIT_NEXT; netkit_primary_pca_parse and netkit_peer_pca_parse call export_packet_payload(..., EGRESS/INGRESS) and return NETKIT_NEXT.
BPF loader refactor and netkit-capable loading
pkg/tracer/tracer.go (loader sections)
Refactors BPF loader into helpers (deletePrograms, loadAndAssignPinned, makeBpfObjects), adds loadObjectsWithNetkit and a supportNetkit dispatch path used by NewFlowFetcher.
FlowFetcher: register/attach/teardown for netkit
pkg/tracer/tracer.go
Adds per-interface netkitPrimaryLinks/netkitPeerLinks, withNetNS helper, registerNetkit, detects *netlink.Netkit in Register/AttachTCX, routes to netkit attach, and closes/clears links in DetachTCX/UnRegister/Close.
PacketFetcher: register/attach/teardown for netkit PCA
pkg/tracer/tracer.go
Adds per-interface netkit PCA link maps and wiring to populate netkit PCA program pointers during load, detects netkit devices and attaches PCA parse hooks via registerNetkit, and explicitly detaches/clears links on DetachTCX/UnRegister.
Go eBPF bindings (program specs & programs)
pkg/ebpf/bpf_*_bpfel.go (x86, arm64, powerpc, s390)
Added four new fields to BpfProgramSpecs and BpfPrograms: NetkitPeerFlowParse, NetkitPeerPcaParse, NetkitPrimaryFlowParse, NetkitPrimaryPcaParse with matching ebpf tags; updated (*BpfPrograms).Close() to close them.
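
The mapping between netkit endpoints, program names, and directions in the table above can be summarized as a small lookup. This is only an illustrative sketch: the `hook` type, the `netkitHooks` map, and `hookFor` are hypothetical names that do not come from the PR, while the section names, program names, and directions are taken from the walkthrough.

```go
package main

import "fmt"

// hook describes one netkit attach point as listed in the walkthrough.
// The type itself is hypothetical and exists only for this sketch.
type hook struct {
	section     string // ELF section name of the entrypoint
	flowProgram string // flow-monitoring program
	pcaProgram  string // packet-capture (PCA) program
	direction   string // direction handed to the shared parsing helper
}

// netkitHooks maps each netkit endpoint to its programs and direction:
// the primary endpoint is treated as egress, the peer endpoint as ingress.
var netkitHooks = map[string]hook{
	"primary": {"netkit/primary", "netkit_primary_flow_parse", "netkit_primary_pca_parse", "EGRESS"},
	"peer":    {"netkit/peer", "netkit_peer_flow_parse", "netkit_peer_pca_parse", "INGRESS"},
}

// hookFor looks up the hook for an endpoint name and rejects unknown names.
func hookFor(endpoint string) (hook, error) {
	h, ok := netkitHooks[endpoint]
	if !ok {
		return hook{}, fmt.Errorf("unknown netkit endpoint %q", endpoint)
	}
	return h, nil
}

func main() {
	for _, ep := range []string{"primary", "peer"} {
		h, err := hookFor(ep)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-7s section=%s flow=%s pca=%s dir=%s\n",
			ep, h.section, h.flowProgram, h.pcaProgram, h.direction)
	}
}
```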

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check | ❓ Inconclusive | The description explains the netkit support addition and the kernel version (6.7) with references, but the repository checklist remains uncompleted and QE requirements are not explicitly selected. | Complete the checklist by selecting one QE requirement option and addressing any product, documentation, or release notes implications relevant to this netkit feature addition.
✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly summarizes the main change: adding netkit interface support to the eBPF agent. It is concise and directly reflects the PR's primary objective.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Linked repositories: Your configuration references 2 linked repositories, but your current plan allows 1. Analyzed netobserv/netobserv-operator, skipped netobserv/flowlogs-pipeline.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/tracer/tracer.go`:
- Around line 213-214: The boolean passed as supportNetkit is inverted:
kernel.IsKernelOlderThan("6.7.0") returns true for kernels older than 6.7.0 but
the code currently treats that true as "support netkit" — flip the boolean
before passing it into kernelSpecificLoadAndAssign (i.e., compute supportNetkit
:= !kernel.IsKernelOlderThan("6.7.0") or otherwise negate it) so that 6.7+
kernels enable netkit support correctly; update the same pattern wherever
supportNetkit is derived from kernel.IsKernelOlderThan (including the other
occurrence using the same variables/logic).
- Around line 1979-1981: The code currently calls registerInterface(...) which
creates a clsact qdisc even for netkit devices that then take the early return
via p.registerNetkit(iface), leaving that qdisc untracked; move the netkit
detection (the isNetkit check on ipvlan) before any call to registerInterface
(or alter registerInterface to skip qdisc creation when the device is netkit) so
that for netkit paths you call p.registerNetkit(iface) immediately and never
create/leave an untracked clsact qdisc; update references to registerInterface
and registerNetkit accordingly.
- Around line 555-570: withNetNS can migrate the goroutine between OS threads
during namespace switches; call runtime.LockOSThread() at the start of withNetNS
to pin the goroutine to the current thread, then call netns.Get() while locked;
if netns.Get() fails unlock the thread before returning; on success defer
restoring the original namespace (netns.Set(originalNS)), closing originalNS,
and runtime.UnlockOSThread() so the thread is unpinned after fn() completes.
Ensure the Lock/Unlock pairing surrounds the netns.Get(), the Setns() call using
targetNS, and the final fn() invocation to prevent goroutine migration during
namespace operations.
- Around line 1815-1820: NewPacketFetcher currently attempts to load
NetkitPrimaryPcaParse and NetkitPeerPcaParse unconditionally which breaks on
kernels < 6.7; before calling LoadAndAssign(...) in NewPacketFetcher, check the
kernel version the same way flow-program loading is gated and if kernel < 6.7
remove/skip NetkitPrimaryPcaParse and NetkitPeerPcaParse from the program
collection (the same place you handle
TcEgressPcaParse/TcIngressPcaParse/TcxEgressPcaParse/TcxIngressPcaParse gating)
so LoadAndAssign is never asked to load those netkit programs on unsupported
kernels.
- Around line 1420-1421: The struct definitions used in
loadObjectsOldKernelRtKernel, loadObjectsOldKernel, loadObjectsRtKernel, and
loadObjectsNoNetworkEvents incorrectly include the NetkitPrimaryFlowParse and
NetkitPeerFlowParse fields (tagged `ebpf:"netkit_..."`), causing LoadAndAssign
to attempt loading netkit programs when they aren't present; remove the
NetkitPrimaryFlowParse and NetkitPeerFlowParse fields from those helper structs
so only the loadObjectsWithNetkit path declares them, and ensure LoadAndAssign
is only invoked on structs that actually need those programs.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fcd90196-7fdf-4e53-b9e2-8f5b0869c086

📥 Commits

Reviewing files that changed from the base of the PR and between 7b86af3 and cee7264.

⛔ Files ignored due to path filters (4)
  • pkg/ebpf/bpf_arm64_bpfel.o is excluded by !**/*.o
  • pkg/ebpf/bpf_powerpc_bpfel.o is excluded by !**/*.o
  • pkg/ebpf/bpf_s390_bpfeb.o is excluded by !**/*.o
  • pkg/ebpf/bpf_x86_bpfel.o is excluded by !**/*.o
📒 Files selected for processing (11)
  • bpf/flows.c
  • bpf/headers/vmlinux_amd64.h
  • bpf/headers/vmlinux_arm64.h
  • bpf/headers/vmlinux_ppc64le.h
  • bpf/headers/vmlinux_s390.h
  • bpf/pca.h
  • pkg/ebpf/bpf_arm64_bpfel.go
  • pkg/ebpf/bpf_powerpc_bpfel.go
  • pkg/ebpf/bpf_s390_bpfeb.go
  • pkg/ebpf/bpf_x86_bpfel.go
  • pkg/tracer/tracer.go

Comment thread pkg/tracer/tracer.go
Comment on lines +555 to +570
func withNetNS(targetNS netns.NsHandle, fn func() error) error {
	if targetNS == netns.None() {
		return fn()
	}
	originalNS, err := netns.Get()
	if err != nil {
		return fmt.Errorf("failed to get current netns: %w", err)
	}
	defer func() {
		_ = netns.Set(originalNS)
		originalNS.Close()
	}()
	if err := unix.Setns(int(targetNS), unix.CLONE_NEWNET); err != nil {
		return fmt.Errorf("failed to setns to %s: %w", targetNS, err)
	}
	return fn()

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and examine the code at the specified location
head -600 pkg/tracer/tracer.go | tail -100 | cat -n

Repository: netobserv/netobserv-ebpf-agent

Length of output: 4378


🏁 Script executed:

# Search for runtime.LockOSThread usage in the codebase
rg "LockOSThread" --type go

Repository: netobserv/netobserv-ebpf-agent

Length of output: 813


🏁 Script executed:

# Search for netns.Set and netns.Get usage patterns
rg "netns\.(Set|Get)" --type go -B 2 -A 2

Repository: netobserv/netobserv-ebpf-agent

Length of output: 3865


🏁 Script executed:

# Check for any tests or comments about this function
rg "withNetNS" --type go -B 2 -A 5

Repository: netobserv/netobserv-ebpf-agent

Length of output: 1468


Add runtime.LockOSThread() to prevent goroutine migration across namespace operations.

The netns vendor library documentation explicitly requires thread locking for namespace operations. Without runtime.LockOSThread(), the goroutine can migrate to a different OS thread between Setns() and fn(), causing eBPF program attachment or other operations to execute in the wrong namespace.

Suggested fix
 import (
 	"errors"
 	"fmt"
 	"io/fs"
 	"os"
 	"path"
+	"runtime"
 	"strings"
 	"time"
@@
 func withNetNS(targetNS netns.NsHandle, fn func() error) error {
 	if targetNS == netns.None() {
 		return fn()
 	}
+	runtime.LockOSThread()
+	defer runtime.UnlockOSThread()
 	originalNS, err := netns.Get()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-func withNetNS(targetNS netns.NsHandle, fn func() error) error {
-	if targetNS == netns.None() {
-		return fn()
-	}
-	originalNS, err := netns.Get()
-	if err != nil {
-		return fmt.Errorf("failed to get current netns: %w", err)
-	}
-	defer func() {
-		_ = netns.Set(originalNS)
-		originalNS.Close()
-	}()
-	if err := unix.Setns(int(targetNS), unix.CLONE_NEWNET); err != nil {
-		return fmt.Errorf("failed to setns to %s: %w", targetNS, err)
-	}
-	return fn()
+func withNetNS(targetNS netns.NsHandle, fn func() error) error {
+	if targetNS == netns.None() {
+		return fn()
+	}
+	runtime.LockOSThread()
+	defer runtime.UnlockOSThread()
+	originalNS, err := netns.Get()
+	if err != nil {
+		return fmt.Errorf("failed to get current netns: %w", err)
+	}
+	defer func() {
+		_ = netns.Set(originalNS)
+		originalNS.Close()
+	}()
+	if err := unix.Setns(int(targetNS), unix.CLONE_NEWNET); err != nil {
+		return fmt.Errorf("failed to setns to %s: %w", targetNS, err)
+	}
+	return fn()
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tracer/tracer.go` around lines 555 - 570, withNetNS can migrate the
goroutine between OS threads during namespace switches; call
runtime.LockOSThread() at the start of withNetNS to pin the goroutine to the
current thread, then call netns.Get() while locked; if netns.Get() fails unlock
the thread before returning; on success defer restoring the original namespace
(netns.Set(originalNS)), closing originalNS, and runtime.UnlockOSThread() so the
thread is unpinned after fn() completes. Ensure the Lock/Unlock pairing
surrounds the netns.Get(), the Setns() call using targetNS, and the final fn()
invocation to prevent goroutine migration during namespace operations.
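
The ordering that this comment asks for (pin the thread, enter the namespace, run the callback, restore, unpin) can be sketched without the netns dependency. The following is a minimal sketch, not the PR's code: `nsSwitcher`, `withPinnedThread`, `fakeSwitcher`, and `errAttachFailed` are hypothetical names, and the namespace calls are replaced by an interface so the example runs without privileges. Unlike the PR's `withNetNS`, which discards the restore error, this sketch surfaces it; that is a design choice of the sketch only.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// nsSwitcher is a hypothetical abstraction over the namespace calls
// (netns.Get / unix.Setns / netns.Set in the PR). It keeps this sketch
// runnable without root privileges or external modules.
type nsSwitcher interface {
	Enter() (restore func() error, err error)
}

// errAttachFailed is a hypothetical sentinel used by the demo below.
var errAttachFailed = errors.New("attach failed")

// withPinnedThread pins the goroutine to its OS thread, enters the target
// namespace, runs fn, and restores the original namespace before unpinning.
func withPinnedThread(sw nsSwitcher, fn func() error) (err error) {
	runtime.LockOSThread()
	// Deferred calls run last-in first-out, so the restore registered
	// below executes before the thread is unlocked.
	defer runtime.UnlockOSThread()

	restore, err := sw.Enter()
	if err != nil {
		return fmt.Errorf("failed to enter namespace: %w", err)
	}
	defer func() {
		// Report restore failures instead of silently dropping them.
		if rerr := restore(); rerr != nil {
			err = errors.Join(err, fmt.Errorf("failed to restore namespace: %w", rerr))
		}
	}()
	return fn()
}

// fakeSwitcher records the call order so the sequencing can be inspected.
type fakeSwitcher struct{ calls []string }

func (f *fakeSwitcher) Enter() (func() error, error) {
	f.calls = append(f.calls, "enter")
	return func() error {
		f.calls = append(f.calls, "restore")
		return nil
	}, nil
}

func main() {
	sw := &fakeSwitcher{}
	err := withPinnedThread(sw, func() error {
		sw.calls = append(sw.calls, "attach")
		return nil
	})
	fmt.Println(sw.calls, err)
}
```

Because both cleanups are deferred, the restore step still runs when the callback returns an error.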

Comment thread pkg/tracer/tracer.go
Comment thread pkg/tracer/tracer.go
Comment thread pkg/tracer/tracer.go Outdated

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/tracer/tracer.go (1)

2277-2348: ⚠️ Potential issue | 🟠 Major

Missing netkit link cleanup in PacketFetcher.Close().

FlowFetcher.Close() cleans up netkit links (lines 902-917), but PacketFetcher.Close() doesn't. Add cleanup for netkitPrimaryLink and netkitPeerLink maps.

 func (p *PacketFetcher) Close() error {
 	log.Debug("unregistering eBPF objects")

 	var errs []error
+	for _, l := range p.netkitPrimaryLink {
+		if l == nil {
+			continue
+		}
+		if err := l.Close(); err != nil {
+			errs = append(errs, err)
+		}
+	}
+	for _, l := range p.netkitPeerLink {
+		if l == nil {
+			continue
+		}
+		if err := l.Close(); err != nil {
+			errs = append(errs, err)
+		}
+	}
 	if p.perfReader != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tracer/tracer.go` around lines 2277 - 2348, PacketFetcher.Close() is
missing cleanup for the netkit links; iterate over p.netkitPrimaryLink and
p.netkitPeerLink (like FlowFetcher.Close() does), log per-interface detach
messages, call Close() on each link value (nil-check as needed), and then reset
p.netkitPrimaryLink and p.netkitPeerLink to empty maps; place this cleanup
alongside the other link teardown (e.g., near the egressTCXLink/ingressTCXLink
cleanup) so all netkit links are closed before returning.
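
The cleanup described in this comment (close every link, skip nil entries, collect errors, reset the map) could be factored into one helper shared by both link maps. This is a sketch under assumptions: `closer` stands in for the link type (only a `Close() error` method is assumed), and `closeLinks`, `fakeLink`, and `errAlreadyDetached` are hypothetical names, not part of the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// closer is a hypothetical stand-in for the link type stored in the
// netkit link maps; only a Close method is assumed.
type closer interface {
	Close() error
}

// errAlreadyDetached is a hypothetical sentinel used by the demo below.
var errAlreadyDetached = errors.New("already detached")

// closeLinks closes every non-nil link, collects the errors, and returns
// a fresh empty map so that no stale link can be closed twice.
func closeLinks[K comparable](links map[K]closer) (map[K]closer, error) {
	var errs []error
	for key, l := range links {
		if l == nil {
			continue
		}
		if err := l.Close(); err != nil {
			errs = append(errs, fmt.Errorf("closing link for %v: %w", key, err))
		}
	}
	// errors.Join returns nil when no error was collected.
	return map[K]closer{}, errors.Join(errs...)
}

// fakeLink records whether Close was called and returns a preset error.
type fakeLink struct {
	closed bool
	err    error
}

func (f *fakeLink) Close() error {
	f.closed = true
	return f.err
}

func main() {
	primary := map[string]closer{
		"nk0": &fakeLink{},
		"nk1": &fakeLink{err: errAlreadyDetached},
		"nk2": nil,
	}
	primary, err := closeLinks(primary)
	fmt.Println(len(primary), err)
}
```

Collecting errors rather than returning on the first failure means one bad link does not prevent the remaining links from being closed.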
♻️ Duplicate comments (1)
pkg/tracer/tracer.go (1)

1415-1471: ⚠️ Potential issue | 🟡 Minor

Potential resource leak: netkit programs loaded then discarded.

The newBpfPrograms struct includes netkit program fields with ebpf: tags. If netkit programs exist in the spec, LoadAndAssign will load them into newObjects, but makeBpfObjects then sets them to nil (lines 1450-1451), leaking the loaded programs.

Either remove the netkit fields from this struct, or delete netkit programs from spec before loading:

 func loadObjectsOldKernelRtKernel(spec *cilium.CollectionSpec, pinDir string) (ebpf.BpfObjects, error) {
 	type newBpfPrograms struct {
 		TcEgressFlowParse      *cilium.Program `ebpf:"tc_egress_flow_parse"`
 		TcIngressFlowParse     *cilium.Program `ebpf:"tc_ingress_flow_parse"`
-		NetkitPrimaryFlowParse *cilium.Program `ebpf:"netkit_primary_flow_parse"`
-		NetkitPeerFlowParse    *cilium.Program `ebpf:"netkit_peer_flow_parse"`
 		TcxEgressFlowParse     *cilium.Program `ebpf:"tcx_egress_flow_parse"`

Same pattern applies to loadObjectsOldKernel, loadObjectsRtKernel, and loadObjectsNoNetworkEvents.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tracer/tracer.go` around lines 1415 - 1471, The netkit programs
(NetkitPrimaryFlowParse, NetkitPeerFlowParse) are defined in newBpfPrograms so
loadAndAssignPinned will load them but makeBpfObjects discards them, causing a
resource leak; either remove those netkit fields from the newBpfPrograms type so
they are never loaded, or ensure they are removed from the spec before loading
(e.g., call deletePrograms(...) to drop netkit hooks) so loadAndAssignPinned
won't open them; apply the same fix pattern to loadObjectsOldKernel,
loadObjectsRtKernel, and loadObjectsNoNetworkEvents, and verify references to
NetkitPrimaryFlowParse and NetkitPeerFlowParse are consistent with the change.
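
The second option in this comment, dropping netkit programs from the spec before loading, can be illustrated with a simplified model. In this sketch `collectionSpec`, `programSpec`, and `pruneUnsupported` are hypothetical stand-ins for the loader's spec types and for a helper such as `deletePrograms`; only the four program names come from the review.

```go
package main

import (
	"fmt"
	"sort"
)

// programSpec and collectionSpec are simplified, hypothetical stand-ins
// for the loader's spec types: a collection is keyed by program name.
type programSpec struct{}

type collectionSpec struct {
	programs map[string]*programSpec
}

// netkitPrograms lists the four netkit program names from the review.
var netkitPrograms = []string{
	"netkit_primary_flow_parse",
	"netkit_peer_flow_parse",
	"netkit_primary_pca_parse",
	"netkit_peer_pca_parse",
}

// pruneUnsupported removes the netkit programs from the spec when the
// kernel lacks netkit, so that a later load step never opens them.
func pruneUnsupported(spec *collectionSpec, supportNetkit bool) {
	if supportNetkit {
		return
	}
	for _, name := range netkitPrograms {
		delete(spec.programs, name) // deleting a missing key is a no-op
	}
}

func main() {
	spec := &collectionSpec{programs: map[string]*programSpec{
		"tc_egress_flow_parse":      {},
		"netkit_primary_flow_parse": {},
		"netkit_peer_flow_parse":    {},
	}}
	pruneUnsupported(spec, false)

	names := make([]string, 0, len(spec.programs))
	for name := range spec.programs {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```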

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 04cf8c4f-da9a-4d1b-bb0f-e4917697cf38

📥 Commits

Reviewing files that changed from the base of the PR and between cee7264 and 67a7fea.

⛔ Files ignored due to path filters (12)
  • bpf/headers/vmlinux_amd64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_arm64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_ppc64le.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_s390.h is excluded by !**/vmlinux*.h
  • pkg/ebpf/bpf_arm64_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_arm64_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
  • pkg/ebpf/bpf_powerpc_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_powerpc_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
  • pkg/ebpf/bpf_s390_bpfeb.go is excluded by !**/*_bpfeb.go
  • pkg/ebpf/bpf_s390_bpfeb.o is excluded by !**/*.o, !**/*_bpfeb.o
  • pkg/ebpf/bpf_x86_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_x86_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
📒 Files selected for processing (3)
  • bpf/flows.c
  • bpf/pca.h
  • pkg/tracer/tracer.go
✅ Files skipped from review due to trivial changes (1)
  • bpf/pca.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • bpf/flows.c

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/tracer/tracer.go (1)

2277-2348: ⚠️ Potential issue | 🟠 Major

PacketFetcher.Close() doesn't close netkit links.

PacketFetcher tracks netkit links in netkitPrimaryLink and netkitPeerLink maps (lines 1757-1758), but Close() never closes them. Compare with FlowFetcher.Close() (lines 902-917) which properly iterates and closes both netkit link maps.

Suggested fix
 func (p *PacketFetcher) Close() error {
 	log.Debug("unregistering eBPF objects")

 	var errs []error
+	for _, l := range p.netkitPrimaryLink {
+		if l == nil {
+			continue
+		}
+		if err := l.Close(); err != nil {
+			errs = append(errs, err)
+		}
+	}
+	for _, l := range p.netkitPeerLink {
+		if l == nil {
+			continue
+		}
+		if err := l.Close(); err != nil {
+			errs = append(errs, err)
+		}
+	}
+
 	if p.perfReader != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tracer/tracer.go` around lines 2277 - 2348, PacketFetcher.Close() never
closes netkit links stored in netkitPrimaryLink and netkitPeerLink; add logic
mirroring FlowFetcher.Close to iterate over p.netkitPrimaryLink and
p.netkitPeerLink, log per-interface, call Close() on each link, and then reset
the maps to empty (e.g., p.netkitPrimaryLink =
map[ifaces.InterfaceKey]link.Link{} and p.netkitPeerLink =
map[ifaces.InterfaceKey]link.Link{}). Place this block alongside the other link
cleanup (with the egressTCXLink/ingressTCXLink cleanup) so netkit links are
always closed during Close(), regardless of earlier errors; use the same logging
pattern and Close calls as used for egressTCXLink/ingressTCXLink and reference
PacketFetcher.Close, netkitPrimaryLink, netkitPeerLink, and FlowFetcher.Close
for guidance.
♻️ Duplicate comments (1)
pkg/tracer/tracer.go (1)

554-570: ⚠️ Potential issue | 🔴 Critical

Add runtime.LockOSThread() to prevent goroutine migration during namespace switch.

The netns library requires the goroutine to be pinned to its OS thread when performing namespace operations. Without runtime.LockOSThread(), the goroutine may migrate between netns.Get() and unix.Setns(), causing eBPF attachment in the wrong namespace.

Suggested fix
 func withNetNS(targetNS netns.NsHandle, fn func() error) error {
 	if targetNS == netns.None() {
 		return fn()
 	}
+	runtime.LockOSThread()
+	defer runtime.UnlockOSThread()
 	originalNS, err := netns.Get()
 	if err != nil {
 		return fmt.Errorf("failed to get current netns: %w", err)
 	}

Also requires adding "runtime" to the imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/tracer/tracer.go` around lines 554 - 570, The withNetNS function must pin
the goroutine to its OS thread during namespace operations to avoid migration:
call runtime.LockOSThread() at the start of withNetNS and defer
runtime.UnlockOSThread() (inside the same function) so the goroutine stays on
the same thread while calling netns.Get(), unix.Setns and running fn(); also
ensure the "runtime" package is added to the imports and keep the existing
restore/close defer (originalNS.Close and netns.Set(originalNS)) unchanged so
the namespace is restored and resources freed even on errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/tracer/tracer.go`:
- Around line 213-214: The variable supportNetkit is inverted:
kernel.IsKernelOlderThan("6.7.0") returns true for kernels below 6.7.0 but
netkit is only supported on kernels >= 6.7.0, so change the assignment of
supportNetkit to the negation of kernel.IsKernelOlderThan("6.7.0") before
calling kernelSpecificLoadAndAssign (the call using oldKernel, rtOldKernel,
supportNetworkEvents, supportNetkit, spec, pinDir) so netkit objects are only
loaded on supported kernels.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7606c874-9232-46ec-a3a7-d3dc4d016a77

📥 Commits

Reviewing files that changed from the base of the PR and between 67a7fea and 022fdb1.

⛔ Files ignored due to path filters (12)
  • bpf/headers/vmlinux_amd64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_arm64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_ppc64le.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_s390.h is excluded by !**/vmlinux*.h
  • pkg/ebpf/bpf_arm64_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_arm64_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
  • pkg/ebpf/bpf_powerpc_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_powerpc_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
  • pkg/ebpf/bpf_s390_bpfeb.go is excluded by !**/*_bpfeb.go
  • pkg/ebpf/bpf_s390_bpfeb.o is excluded by !**/*.o, !**/*_bpfeb.o
  • pkg/ebpf/bpf_x86_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_x86_bpfel.o is excluded by !**/*.o, !**/*_bpfel.o
📒 Files selected for processing (3)
  • bpf/flows.c
  • bpf/pca.h
  • pkg/tracer/tracer.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • bpf/pca.h

Comment thread pkg/tracer/tracer.go
red-hat-konflux Bot and others added 3 commits May 26, 2026 10:37
….8.0-crc0 (#590)

Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
Co-authored-by: red-hat-konflux[bot] <126015336+red-hat-konflux[bot]@users.noreply.github.com>
Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
Co-authored-by: red-hat-konflux[bot] <126015336+red-hat-konflux[bot]@users.noreply.github.com>
Signed-off-by: Mohamed S. Mahmoud <mmahmoud2201@gmail.com>

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/tracer/tracer.go (1)

1989-1996: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

PacketFetcher.Register creates orphan qdisc on netkit interfaces.

registerInterface() creates a clsact qdisc before the netkit check. On netkit paths, the qdisc is created but never stored in p.qdiscs, so it's never cleaned up.

Move netkit detection before qdisc creation (matching FlowFetcher.Register at lines 728-738):

Suggested fix
 func (p *PacketFetcher) Register(iface *ifaces.Interface) error {
-	qdisc, ipvlan, err := registerInterface(iface)
+	ilog := plog.WithField("iface", iface)
+	handle, err := netlink.NewHandleAt(iface.NetNS)
 	if err != nil {
-		return err
+		return fmt.Errorf("failed to create handle for netns (%s): %w", iface.NetNS.String(), err)
 	}
-	if n, ok := ipvlan.(*netlink.Netkit); ok && n.Type() == "netkit" {
+	defer handle.Close()
+
+	ipvlan, err := handle.LinkByIndex(iface.Index)
+	if err != nil {
+		return fmt.Errorf("failed to lookup ipvlan device %d (%s): %w", iface.Index, iface.Name, err)
+	}
+
+	// Check netkit BEFORE creating qdisc
+	if n, ok := ipvlan.(*netlink.Netkit); ok && n.Type() == "netkit" {
+		ilog.Debug("detected netkit interface; attaching via netkit hooks")
 		return p.registerNetkit(iface)
 	}
 
+	qdisc, _, err := registerInterface(iface)
+	if err != nil {
+		return err
+	}
 	p.qdiscs[iface.InterfaceKey] = qdisc
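
The ordering problem described in this comment can be sketched in isolation. The following is a minimal illustration, not the agent's code: `link`, `fetcher`, `createQdisc`, and `register` are hypothetical stand-ins that only model "check the link type before allocating a resource that must be tracked for cleanup".

```go
package main

import "fmt"

// link is a hypothetical stand-in for a netlink link; only its kind matters here.
type link struct {
	name string
	kind string // e.g. "veth" or "netkit"
}

// fetcher tracks the qdiscs it created so that they can be removed later.
type fetcher struct {
	qdiscs  map[string]string // interface name -> qdisc handle (illustrative)
	created int               // total number of qdiscs created
}

// createQdisc simulates allocating a clsact qdisc for the interface.
func (f *fetcher) createQdisc(l link) string {
	f.created++
	return "clsact@" + l.name
}

// register checks the link kind first and only creates a qdisc on the
// non-netkit path, so every created qdisc ends up tracked in f.qdiscs.
func (f *fetcher) register(l link) {
	if l.kind == "netkit" {
		// In this sketch the netkit path needs no qdisc, so none is created.
		return
	}
	f.qdiscs[l.name] = f.createQdisc(l)
}

func main() {
	f := &fetcher{qdiscs: map[string]string{}}
	f.register(link{name: "nk0", kind: "netkit"})
	f.register(link{name: "eth0", kind: "veth"})
	// The number of created qdiscs equals the number of tracked qdiscs.
	fmt.Println(f.created, len(f.qdiscs)) // prints "1 1"
}
```

If the type check ran after `createQdisc`, `created` would exceed `len(f.qdiscs)` for every netkit interface, which is the leak this comment describes.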
♻️ Duplicate comments (2)
pkg/tracer/tracer.go (2)

216-217: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Inverted supportNetkit logic will load netkit programs on unsupported kernels.

kernel.IsKernelOlderThan("6.7.0") returns true for kernels below 6.7.0, but netkit requires kernel ≥ 6.7.0. Current logic enables netkit on old kernels.

-supportNetkit := kernel.IsKernelOlderThan("6.7.0")
+supportNetkit := !kernel.IsKernelOlderThan("6.7.0")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tracer/tracer.go` around lines 216 - 217, supportNetkit is set using
kernel.IsKernelOlderThan("6.7.0") which is inverted for netkit (netkit requires
kernel ≥ 6.7.0); change the assignment so supportNetkit is true only when the
kernel is at least 6.7.0 (e.g. negate the current check or use an
IsKernelAtLeast helper), so the call to kernelSpecificLoadAndAssign(oldKernel,
rtOldKernel, supportNetworkEvents, supportNetkit, spec, pinDir) passes the
correct boolean for netkit support.
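
A small sketch of why the negation matters. `isOlderThan` and `supportsNetkit` are hypothetical helpers written for this illustration; they are assumed to behave like `kernel.IsKernelOlderThan` on plain `major.minor.patch` strings and are not the agent's implementation.

```go
package main

import "fmt"

// isOlderThan reports whether version a sorts strictly before version b.
// Both are parsed as "major.minor.patch". Hypothetical stand-in only.
func isOlderThan(a, b string) bool {
	var av, bv [3]int
	fmt.Sscanf(a, "%d.%d.%d", &av[0], &av[1], &av[2])
	fmt.Sscanf(b, "%d.%d.%d", &bv[0], &bv[1], &bv[2])
	for i := range av {
		if av[i] != bv[i] {
			return av[i] < bv[i]
		}
	}
	return false // equal versions are not "older"
}

// supportsNetkit is true only when the running kernel is at least 6.7.0,
// which is the negation of "older than 6.7.0".
func supportsNetkit(running string) bool {
	return !isOlderThan(running, "6.7.0")
}

func main() {
	for _, v := range []string{"5.14.0", "6.7.0", "6.8.1"} {
		fmt.Printf("%s older=%t netkit=%t\n", v, isOlderThan(v, "6.7.0"), supportsNetkit(v))
	}
}
```

Without the negation, the flag would be true for 5.14.0 and false for 6.7.0 and later, the opposite of the stated requirement.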

551-567: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add runtime.LockOSThread() to prevent goroutine migration during namespace switch.

Without thread locking, the goroutine can migrate OS threads between Setns() and fn(), causing BPF attachment in the wrong namespace.

Suggested fix
+import "runtime"
+
 func withNetNS(targetNS netns.NsHandle, fn func() error) error {
 	if targetNS == netns.None() {
 		return fn()
 	}
+	runtime.LockOSThread()
+	defer runtime.UnlockOSThread()
 	originalNS, err := netns.Get()
 	if err != nil {
 		return fmt.Errorf("failed to get current netns: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/tracer/tracer.go` around lines 551 - 567, The withNetNS helper must lock
the OS thread before performing Setns and executing fn to prevent goroutine
migration; update withNetNS to call runtime.LockOSThread() immediately before
the namespace switch and ensure a deferred runtime.UnlockOSThread() runs after
restoring the original namespace (keep the existing defer that restores
originalNS and closes it, but move/unify defers so UnlockOSThread executes after
netns.Set(originalNS) and originalNS.Close()); this guarantees the goroutine
stays on the same thread for unix.Setns and the subsequent call to fn.
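
The defer ordering requested above can be sketched with the standard library alone. In this illustration `withLockedThread`, `enter`, and `restore` are hypothetical; the callbacks stand in for the namespace switch and restore calls so that the example runs without privileges or the netns package.

```go
package main

import (
	"fmt"
	"runtime"
)

// withLockedThread pins the goroutine to its OS thread, runs enter, then fn,
// and always runs restore before the thread is unlocked.
func withLockedThread(enter, restore, fn func() error) (err error) {
	runtime.LockOSThread()
	// Deferred calls run last-in, first-out: this unlock is registered first,
	// so it runs after the restore deferred further down.
	defer runtime.UnlockOSThread()

	if eerr := enter(); eerr != nil {
		return fmt.Errorf("failed to enter target namespace: %w", eerr)
	}
	defer func() {
		// Report a restore failure only if fn itself succeeded.
		if rerr := restore(); rerr != nil && err == nil {
			err = fmt.Errorf("failed to restore original namespace: %w", rerr)
		}
	}()
	return fn()
}

func main() {
	var steps []string
	record := func(s string) func() error {
		return func() error { steps = append(steps, s); return nil }
	}
	err := withLockedThread(record("enter"), record("restore"), record("fn"))
	fmt.Println(steps, err) // prints "[enter fn restore] <nil>"
}
```

Registering the unlock before the restore keeps both the switch and the restore on the same locked thread, which is the property the comment asks `withNetNS` to guarantee.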

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 04cde99a-6172-4e56-a2f2-56669e931a27

📥 Commits

Reviewing files that changed from the base of the PR and between 990ee2c and 446b320.

⛔ Files ignored due to path filters (8)
  • bpf/headers/vmlinux_amd64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_arm64.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_ppc64le.h is excluded by !**/vmlinux*.h
  • bpf/headers/vmlinux_s390.h is excluded by !**/vmlinux*.h
  • pkg/ebpf/bpf_arm64_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_powerpc_bpfel.go is excluded by !**/*_bpfel.go
  • pkg/ebpf/bpf_s390_bpfeb.go is excluded by !**/*_bpfeb.go
  • pkg/ebpf/bpf_x86_bpfel.go is excluded by !**/*_bpfel.go
📒 Files selected for processing (3)
  • bpf/flows.c
  • bpf/pca.h
  • pkg/tracer/tracer.go

@openshift-ci

openshift-ci Bot commented May 26, 2026


@msherif1234: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/netobserv-cli-tests | 446b320 | link | false | /test netobserv-cli-tests |
| ci/prow/qe-e2e-tests | 446b320 | link | false | /test qe-e2e-tests |


