geolocation: add ICMP pinger to geoprobe-agent #3427
90b0846 to af75fcf
Extract icmpSocket interface from icmpConn to enable mock-based testing without CAP_NET_RAW. Replace unsafe.Pointer timestamp parsing with encoding/binary. Increase epoll events array from 1 to 16 to reduce syscall overhead in batch receive.
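A minimal sketch of the timestamp change described above: decoding a kernel timestamp payload with encoding/binary rather than casting through unsafe.Pointer. It assumes the payload is a 16-byte struct timespec (two 64-bit fields, as on 64-bit Linux) on a little-endian host. The name parseTimespec is hypothetical and is not taken from the PR.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"time"
)

// parseTimespec decodes a 16-byte struct timespec (tv_sec, tv_nsec).
// Assumption: 64-bit Linux layout on a little-endian host (amd64/arm64).
func parseTimespec(b []byte) (time.Time, error) {
	if len(b) < 16 {
		return time.Time{}, errors.New("timespec payload too short")
	}
	sec := int64(binary.LittleEndian.Uint64(b[0:8]))
	nsec := int64(binary.LittleEndian.Uint64(b[8:16]))
	return time.Unix(sec, nsec), nil
}

func main() {
	// tv_sec = 1, tv_nsec = 2, both little-endian.
	payload := []byte{1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0}
	ts, err := parseTimespec(payload)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(ts.UnixNano()) // prints 1000000002
}
```

The explicit length check replaces the implicit trust that a pointer cast places in the buffer size.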
Fail fast at validation time instead of silently failing at send time. The ICMP socket is AF_INET (IPv4-only), so IPv6 addresses would be added to the probe map but every measurement would error.
ValidateICMP now guarantees IPv4, so ParseIP cannot return nil. Use .To4() directly.
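The fail-fast idea in the two commits above can be sketched as follows. This is an illustration only: validateIPv4 is a hypothetical helper, not the PR's ValidateICMP, and it assumes the input is a bare IP string with no port.

```go
package main

import (
	"fmt"
	"net"
)

// validateIPv4 rejects anything that is not an IPv4 address, so callers
// can use the returned 4-byte form without a nil check later.
func validateIPv4(host string) (net.IP, error) {
	ip := net.ParseIP(host)
	if ip == nil {
		return nil, fmt.Errorf("invalid IP address %q", host)
	}
	v4 := ip.To4()
	if v4 == nil {
		// An AF_INET socket cannot send to this address.
		return nil, fmt.Errorf("address %q is not IPv4", host)
	}
	return v4, nil
}

func main() {
	for _, h := range []string{"192.0.2.1", "2001:db8::1", "not-an-ip"} {
		ip, err := validateIPv4(h)
		fmt.Println(h, ip, err)
	}
}
```

Once validation has succeeded, a later call to To4() on the same address cannot return nil, which is what allows the send path to skip the check.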
Rewrite icmp_pinger_test.go to use a mockICMPSocket that implements the icmpSocket interface. Tests cover add/remove, duplicate handling, MeasureOne (success, timeout, wrong seq/ID), MeasureAll (multi-probe, empty, context cancellation), and Close. These run in CI without CAP_NET_RAW. Integration tests retained at the bottom, gated on CAP_NET_RAW.
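A rough sketch of how an interface lets these tests run without CAP_NET_RAW. Every name here (echoSocket, mockSocket, measureOnce, seqSkew) is hypothetical; the real icmpSocket interface and mockICMPSocket in the PR will differ in shape.

```go
package main

import (
	"errors"
	"fmt"
)

// echoSocket is a hypothetical stand-in for the PR's icmpSocket interface.
type echoSocket interface {
	Send(id, seq uint16) error
	Recv() (id, seq uint16, err error)
}

var errTimeout = errors.New("timeout")

// mockSocket queues one canned reply per Send; no raw socket is opened.
type mockSocket struct {
	replies [][2]uint16
	seqSkew uint16 // added to each reply's seq to simulate a wrong-seq reply
}

func (m *mockSocket) Send(id, seq uint16) error {
	m.replies = append(m.replies, [2]uint16{id, seq + m.seqSkew})
	return nil
}

func (m *mockSocket) Recv() (uint16, uint16, error) {
	if len(m.replies) == 0 {
		return 0, 0, errTimeout
	}
	r := m.replies[0]
	m.replies = m.replies[1:]
	return r[0], r[1], nil
}

// measureOnce sends one echo and accepts only a reply whose id and seq match.
func measureOnce(s echoSocket, id, seq uint16) error {
	if err := s.Send(id, seq); err != nil {
		return err
	}
	for {
		gotID, gotSeq, err := s.Recv()
		if err != nil {
			return err
		}
		if gotID == id && gotSeq == seq {
			return nil
		}
		// Stray reply: ignore it and keep reading until Recv errors out.
	}
}

func main() {
	fmt.Println(measureOnce(&mockSocket{}, 7, 1))
	fmt.Println(measureOnce(&mockSocket{seqSkew: 1}, 7, 1))
}
```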
Follow Go convention of all-caps for acronyms, consistent with ICMPPinger and ICMPPingerConfig.
…helper
Replace 15-parameter runMeasurementLoop with a measurementLoop struct. Extract reconcileTargets helper to deduplicate the identical TWAMP and ICMP target reconciliation logic.
Prevent Close from closing the fd while a measurement is in-flight. Without this, a concurrent MeasureOne or MeasureAll blocked in epoll_wait could have its fd closed underneath it.
Callers must serialize access; ICMPPinger does this via measMu.
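The locking described in the two commits above, sketched with hypothetical types. Only the measMu name comes from the conversation; pinger, fakeConn, and errClosed are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("pinger closed")

// fakeConn stands in for the raw socket; it is not safe for concurrent use.
type fakeConn struct{ closed bool }

func (c *fakeConn) ping() error {
	if c.closed {
		return errors.New("use of closed connection")
	}
	return nil
}

type pinger struct {
	measMu sync.Mutex // serializes measurements and Close
	conn   *fakeConn
	closed bool
}

// MeasureOne holds measMu for the whole measurement, so Close cannot
// release the connection while a measurement is in flight.
func (p *pinger) MeasureOne() error {
	p.measMu.Lock()
	defer p.measMu.Unlock()
	if p.closed {
		return errClosed
	}
	return p.conn.ping()
}

// Close waits for any in-flight measurement, then closes exactly once.
func (p *pinger) Close() error {
	p.measMu.Lock()
	defer p.measMu.Unlock()
	if p.closed {
		return nil
	}
	p.closed = true
	p.conn.closed = true
	return nil
}

func main() {
	p := &pinger{conn: &fakeConn{}}
	fmt.Println(p.MeasureOne())
	fmt.Println(p.Close())
	fmt.Println(p.MeasureOne())
}
```

The trade-off in this sketch is that Close blocks until the current measurement returns, so the measurement itself needs a deadline.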
af75fcf to 05ba5e3
Code review
Found 1 issue:
doublezero/controlplane/telemetry/cmd/geoprobe-agent/main.go Lines 530 to 540 in 05ba5e3
🤖 Generated with Claude Code
Minor:
doublezero/controlplane/telemetry/internal/geoprobe/icmp_pinger.go Lines 79 to 84 in 05ba5e3
doublezero/controlplane/telemetry/internal/geoprobe/address.go Lines 81 to 96 in 05ba5e3
Minor: In doublezero/controlplane/telemetry/internal/geoprobe/target_discovery.go Lines 105 to 108 in 05ba5e3
I think this can lead to bad offset data. Claude:
The impact cascades: the bogus
doublezero/controlplane/telemetry/internal/geoprobe/icmp_pinger.go Lines 178 to 180 in 05ba5e3
doublezero/controlplane/telemetry/internal/geoprobe/icmp_pinger.go Lines 239 to 241 in 05ba5e3
This is on purpose. Failing later is worse than failing earlier. It should just fail out so we fix the permissions.
The pinger keys probes by host only, so two entries for the same host with different offset ports would silently drop the second. Align the parser to match by deduplicating on host instead of host:port:0.
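A small sketch of host-only deduplication as described above. It assumes entries look like "host:port"; the PR's actual address format and parser may differ, and dedupeByHost is a hypothetical name.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// dedupeByHost keeps the first entry seen for each host and drops later
// entries for the same host, even if their ports differ.
func dedupeByHost(addrs []string) []string {
	seen := make(map[string]struct{}, len(addrs))
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		a = strings.TrimSpace(a)
		host := a
		// Entries without a port are treated as a bare host.
		if h, _, err := net.SplitHostPort(a); err == nil {
			host = h
		}
		if _, dup := seen[host]; dup {
			continue
		}
		seen[host] = struct{}{}
		out = append(out, a)
	}
	return out
}

func main() {
	in := []string{"192.0.2.1:9000", "192.0.2.1:9001", "192.0.2.2:9000"}
	fmt.Println(dedupeByHost(in))
}
```

Deduplicating in the parser makes the drop visible at configuration time instead of leaving it implicit in the pinger's map.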
Check icmpTargets == nil alongside targets and inboundKeys when deciding whether a discovery scan was skipped.
Matches the TWAMP pinger's decideRTT pattern. Without the clamp, clock skew or anomalous kernel timestamps could produce a negative duration that wraps to ~MaxUint64 when cast to uint64, corrupting offset data.
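To make the wraparound concrete, here is a minimal sketch of the clamp. clampRTT is a hypothetical helper, not the PR's code or the TWAMP pinger's decideRTT.

```go
package main

import (
	"fmt"
	"time"
)

// clampRTT converts a measured duration to unsigned nanoseconds,
// treating a negative value (clock skew, odd timestamps) as zero.
func clampRTT(d time.Duration) uint64 {
	if d < 0 {
		return 0
	}
	return uint64(d)
}

func main() {
	skewed := -5 * time.Millisecond

	// Without the clamp, the negative int64 wraps to a value near MaxUint64.
	fmt.Println("unclamped:", uint64(skewed))

	// With the clamp, the bogus sample becomes 0.
	fmt.Println("clamped:  ", clampRTT(skewed))
}
```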
@nikw9944: Addressed 2, 3, 4. 1 was intentional.
Summary of Changes
- Add an ICMP pinger (ICMPPinger) to geoprobe-agent that measures outbound ICMP targets using raw sockets with kernel timestamps (SO_TIMESTAMPNS), interleaved batch send/receive, and epoll-based I/O
- Discover OutboundIcmp targets from onchain data alongside TWAMP targets
- Refactor the measurement loop into a measurementLoop struct with a shared reconcileTargets helper that deduplicates the add/remove/measure-new-targets logic for both TWAMP and ICMP channels
Diff Breakdown
Most of the new code is the ICMP pinger and its tests; the measurement loop refactor is net-neutral.
Key files
- controlplane/telemetry/internal/geoprobe/icmp_pinger.go: new ICMPPinger with batch send/receive, MeasureOne, MeasureAll, probe lifecycle
- controlplane/telemetry/internal/geoprobe/icmp_pinger_test.go: mock-based unit tests + CAP_NET_RAW integration tests for the pinger
- controlplane/telemetry/cmd/geoprobe-agent/main.go: refactor to measurementLoop struct, wire ICMP pinger + targets, add --additional-icmp-targets flag
- controlplane/telemetry/internal/geoprobe/icmp_conn.go: raw ICMP socket with epoll, kernel timestamp parsing, IPv4 header stripping
- controlplane/telemetry/internal/geoprobe/icmp_conn_test.go: round-trip and edge-case tests for the raw socket layer
- controlplane/telemetry/internal/geoprobe/target_discovery.go: discover OutboundIcmp targets from onchain data, merge with CLI targets
- controlplane/telemetry/internal/geoprobe/address.go: add ValidateICMP(), ParseICMPProbeAddresses(), IPv4-only enforcement
- controlplane/telemetry/internal/metrics/geolocation_metrics.go: add ICMP target count gauge and measurement cycle duration histogram
Testing Verification
- Mock-based unit tests for ICMPPinger cover add/remove, duplicate handling, MeasureOne (success, timeout, mismatched seq/ID), MeasureAll (multi-target, empty, context cancellation), and Close
- Integration tests (gated on CAP_NET_RAW) verify real ICMP echo round-trips to localhost
- icmpConn tests cover raw socket round-trip, deadline expiry, IPv6 rejection, and IPv4 header stripping
- ValidateICMP and ParseICMPProbeAddresses are tested, including IPv6 rejection, deduplication, and invalid input