implement span for device_id.connected #108

Merged
jay-418 merged 27 commits into main from jay/conn-stats
Mar 19, 2026
Conversation

@jay-418
Contributor

@jay-418 jay-418 commented Feb 13, 2026

purpose

Provide a mechanism for recording a device_id's first connection to a route, as an indicator of poor network conditions.

mechanism

Emit a span with device_id (and some attrs) when a proxy is assigned by the API during a config request. Then also emit a span when a device_id first connects to a proxy.

In SigNoz, we can calculate (using some kinda gnarly ClickHouse queries as discussed in https://github.com/getlantern/engineering/issues/2987):

  • the percentage of api config requests that never result in a proxy connection
  • the latency between config request and initial proxy connection

refactor + breaking changes

Eliminate custom OTEL configuration in favor of the standard environment variables.

| before | after | notes |
| --- | --- | --- |
| --telemetry-endpoint flag | OTEL_EXPORTER_OTLP_ENDPOINT | Required to enable telemetry |
| CUSTOM_OTLP_ENDPOINT | OTEL_EXPORTER_OTLP_ENDPOINT | Same var covers both cases |
| Auto-detected HTTP via non-443 port | OTEL_EXPORTER_OTLP_INSECURE=true | Only needed for plain HTTP endpoints |
| Hardcoded lantern-box service name | OTEL_SERVICE_NAME | Optional override, defaults to lantern-box |
| - | OTEL_RESOURCE_ATTRIBUTES | Optional extra attributes beyond .ini file |
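As an illustration of the semantics in the table (not the code in this PR, where these variables are presumably consumed by the otel package or the OpenTelemetry SDK), the sketch below resolves the settings by hand. The telemetryConfig type and loadTelemetryConfig function are hypothetical names. Only the variable names, the lantern-box default, and the rule that an endpoint is required to enable telemetry come from the description above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// telemetryConfig (hypothetical) holds the settings described in the table.
type telemetryConfig struct {
	Enabled     bool
	Endpoint    string
	Insecure    bool
	ServiceName string
	Attributes  map[string]string
}

// loadTelemetryConfig reads the standard variables through getenv, so a
// test can pass a fake lookup instead of the process environment.
func loadTelemetryConfig(getenv func(string) string) telemetryConfig {
	cfg := telemetryConfig{
		Endpoint:    getenv("OTEL_EXPORTER_OTLP_ENDPOINT"),
		ServiceName: getenv("OTEL_SERVICE_NAME"),
		Attributes:  map[string]string{},
	}
	// Telemetry stays off unless an endpoint is configured.
	cfg.Enabled = cfg.Endpoint != ""
	if cfg.ServiceName == "" {
		cfg.ServiceName = "lantern-box"
	}
	// Unset or unparsable values leave Insecure false.
	cfg.Insecure, _ = strconv.ParseBool(getenv("OTEL_EXPORTER_OTLP_INSECURE"))
	// Extra resource attributes arrive as comma-separated key=value pairs.
	for _, pair := range strings.Split(getenv("OTEL_RESOURCE_ATTRIBUTES"), ",") {
		key, value, ok := strings.Cut(pair, "=")
		if !ok || strings.TrimSpace(key) == "" {
			continue
		}
		cfg.Attributes[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return cfg
}

func main() {
	cfg := loadTelemetryConfig(os.Getenv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the sketch testable without mutating the process environment.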

docker test dependency

The new "e2e" tests use Docker and testcontainers to simulate all parts: client, singbox, otel, and upstream. Bypass them with the -short flag:
go test ./... -short

semconv

Uses standard KVs for otel, with updates in getlantern/semconv@121098b

@jay-418 jay-418 self-assigned this Feb 24, 2026
@jay-418 jay-418 marked this pull request as ready for review February 28, 2026 02:40
Copilot AI review requested due to automatic review settings February 28, 2026 02:41
Contributor

Copilot AI left a comment

Pull request overview

Adds OpenTelemetry tracing to correlate a device’s proxy assignment with its first observed proxy connection, enabling connectivity success-rate and time-to-connect analysis in SigNoz.

Changes:

  • Emit a device_id.connected span when CLIENTINFO is received on routed TCP/packet connections.
  • Add OTLP/HTTP trace exporter initialization alongside existing metrics exporter initialization.
  • Wire clientcontext manager into the run path by default and add a local ping script / Makefile flag plumbing.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tracker/clientcontext/manager.go | Emits device_id.connected span with client attributes on successful CLIENTINFO parsing. |
| otel/otel.go | Adds InitGlobalTracerProvider using OTLP/HTTP exporter and SDK tracer provider. |
| cmd/main.go | Initializes global meter + tracer providers during CLI pre-run when proxy info is available. |
| cmd/cmd_run.go | Always registers the clientcontext manager (previously conditional) to enable tracing. |
| cmd/Makefile | Passes --telemetry-endpoint through make run. |
| scripts/ping.sh | Adds a script intended to exercise CLIENTINFO and produce the new span. |
| go.mod / go.sum | Adds OTLP trace HTTP exporter dependencies. |
| .gitignore | Ignores cmd/cmd. |

Comment thread cmd/Makefile Outdated
Comment thread scripts/ping.sh Outdated
Comment thread cmd/cmd_run.go Outdated
Comment thread tracker/clientcontext/manager.go Outdated
Comment thread tracker/clientcontext/manager.go Outdated
Comment thread tracker/clientcontext/manager.go Outdated
Comment thread cmd/main.go Outdated
@jay-418 jay-418 requested review from Crosse and garmr-ulfr February 28, 2026 03:59
@jay-418
Contributor Author

jay-418 commented Mar 5, 2026

Hey @garmr-ulfr, do you mind looking at this? The CLIENTINFO stuff is a little foreign to me, but running ping.sh with a running lantern-box does send the right trace with attrs to the local collector.

For context, this is my first draft adding spans to lantern-box for the purpose of better success/failure tracking after config assignment:

It's been suggested this entire approach could be supplanted by using the strategy laid out in:

Collaborator

@garmr-ulfr garmr-ulfr left a comment

Besides ping.sh, LGTM. You implemented it correctly, but your test script is wrong.


@Crosse Crosse left a comment

My only concern is that I believe we need to drop a bunch of our OTel configuration complexity and just use the built-in environment variables.

Comment thread otel/otel.go Outdated
@jay-418
Contributor Author

jay-418 commented Mar 9, 2026

> My only concern is that I believe we need to drop a bunch of our OTel configuration complexity and just use the built-in environment variables.

@Crosse sorry I missed it the first time you said this. Will address today.

@Crosse

Crosse commented Mar 9, 2026

@jay-418 You didn't miss anything—I never hit Send on the review last week.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 15 changed files in this pull request and generated 7 comments.


Comment thread otel/otel_test.go
Comment thread otel/otel_test.go Outdated
Comment thread cmd/cmd_run.go Outdated
Comment thread tracker/clientcontext/manager.go Outdated
Comment thread tracker/clientcontext/tracker_test.go Outdated
Comment thread test/e2e/telemetry_test.go
Comment thread test/e2e/telemetry_test.go Outdated
@garmr-ulfr
Collaborator

@jay-418 I'm not sure I've mentioned this yet, but just FYI in case you weren't aware, lantern-box must be usable by anyone, so any changes shouldn't restrict it to Lantern. Not that you have, just a heads up 😄

@jay-418
Contributor Author

jay-418 commented Mar 11, 2026

> @jay-418 I'm not sure I've mentioned this yet, but just FYI in case you weren't aware, lantern-box must be usable by anyone, so any changes shouldn't restrict it to Lantern. Not that you have, just a heads up 😄

Sure thing @garmr-ulfr. I just completed some heavier-handed changes, but they make everything more generic by using the standard OTEL conventions for env vars.

@jay-418 jay-418 requested review from Crosse and garmr-ulfr March 11, 2026 01:48
Comment thread cmd/cmd_run.go Outdated
Comment thread cmd/main.go Outdated
Comment thread tracker/clientcontext/manager.go Outdated
Comment thread test/e2e/telemetry_test.go Outdated
@jay-418 jay-418 requested a review from garmr-ulfr March 14, 2026 03:44
Collaborator

@garmr-ulfr garmr-ulfr left a comment

LGTM! This will also require an update in lantern-cloud to use the --proxy-info flag once this is merged and deployed.

@jay-418 jay-418 merged commit ebc50fc into main Mar 19, 2026
3 checks passed
@jay-418 jay-418 deleted the jay/conn-stats branch March 19, 2026 13:45