Integrate OpenTelemetry from adit-radis-shared#191
Integrate OpenTelemetry from adit-radis-shared#191samuelvkwong wants to merge 6 commits intomainfrom
Conversation
Use the shared telemetry module from adit-radis-shared to set up OpenTelemetry traces, metrics, and logs. Telemetry is initialized in manage.py and asgi.py before Django loads, and the OTel logging handler is added conditionally in settings. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the
📝 WalkthroughWalkthroughOpenTelemetry tracing and logging are initialized early: Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as manage.py (CLI)
participant ASGI as radis/asgi.py (Server)
participant App as Django App
participant OTEL as otel-collector
participant OO as OpenObserve
CLI->>CLI: setup_opentelemetry() (init SDK & exporters)
CLI->>App: initialize_debugger() / run command
ASGI->>ASGI: setup_opentelemetry() (init SDK & exporters)
ASGI->>App: get_asgi_application()
App->>OTEL: export traces/logs/metrics (OTLP)
OTEL->>OO: otlphttp export (OPENOBSERVE_ENDPOINT + auth)
OO-->>OTEL: ingest ack
OTEL-->>App: health/status (optional)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @samuelvkwong, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces robust observability capabilities by integrating OpenTelemetry into the application. It sets up the necessary infrastructure to collect telemetry data by initializing the tracing system at an early stage of application startup and configures logging to forward relevant data when telemetry is active. These changes are supported by updating the shared dependency module and incorporating new OpenTelemetry libraries, paving the way for better monitoring and debugging of the application's performance and behavior. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly integrates OpenTelemetry for tracing and logging by initializing it early in manage.py and radis/asgi.py. The logging configuration is also properly updated. My only concern is with the dependency on a git branch in pyproject.toml, which can lead to non-reproducible builds. I've suggested pinning to a specific commit hash to improve stability. Otherwise, the changes are solid.
| requires-python = ">=3.12,<4.0" | ||
| dependencies = [ | ||
| "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@0.19.1", | ||
| "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve", |
There was a problem hiding this comment.
Depending on a git branch (openobserve) can lead to unpredictable and non-reproducible builds, as the branch can be updated or deleted. It's much safer to pin to a specific commit hash or a tag.
Since the associated PR in adit-radis-shared is not yet merged, I recommend pinning to the specific commit hash from that branch for now. Once the shared library PR is merged and a new version is released, this should be updated to point to the new version tag.
| "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve", | |
| "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@fd2c783a389d2ea9c275b22a794a99f0fa2ba382", |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@radis/asgi.py`:
- Around line 18-22: Remove the unused inline flake8 suppression comments by
deleting the trailing "# noqa: E402" on the import of setup_opentelemetry and
the import of get_asgi_application in radis/asgi.py; keep the calls/imports
(setup_opentelemetry() and get_asgi_application) unchanged, and only remove the
E402 suppressions (or enable E402 in config if you truly need to suppress it).
🧹 Nitpick comments (1)
pyproject.toml (1)
10-10: Prefer pinningadit-radis-sharedto a commit/tag for reproducible builds.A moving branch ref can introduce unreviewed changes into builds. Consider pinning to a commit SHA or a version tag once the dependency is stable.
♻️ Example pinning approach
- "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve", + "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@<commit-sha>",
| from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402 | ||
|
|
||
| setup_opentelemetry() | ||
|
|
||
| from django.core.asgi import get_asgi_application # noqa: E402 |
There was a problem hiding this comment.
Remove unused # noqa: E402 directives (RUF100).
Ruff reports these as unused; drop them (or enable E402 if you truly need suppression).
🧹 Suggested cleanup
-from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402
+from adit_radis_shared.telemetry import setup_opentelemetry
@@
-from django.core.asgi import get_asgi_application # noqa: E402
+from django.core.asgi import get_asgi_application📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402 | |
| setup_opentelemetry() | |
| from django.core.asgi import get_asgi_application # noqa: E402 | |
| from adit_radis_shared.telemetry import setup_opentelemetry | |
| setup_opentelemetry() | |
| from django.core.asgi import get_asgi_application |
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 18-18: Unused noqa directive (non-enabled: E402)
Remove unused noqa directive
(RUF100)
[warning] 22-22: Unused noqa directive (non-enabled: E402)
Remove unused noqa directive
(RUF100)
🤖 Prompt for AI Agents
In `@radis/asgi.py` around lines 18 - 22, Remove the unused inline flake8
suppression comments by deleting the trailing "# noqa: E402" on the import of
setup_opentelemetry and the import of get_asgi_application in radis/asgi.py;
keep the calls/imports (setup_opentelemetry() and get_asgi_application)
unchanged, and only remove the E402 suppressions (or enable E402 in config if
you truly need to suppress it).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
example.env (1)
124-125:⚠️ Potential issue | 🟡 MinorFix typo in proxy note.
Line 124: “Malke” → “Make”.
Suggested change
-# Malke sure to use .local in NO_PROXY as otherwise the communication with +# Make sure to use .local in NO_PROXY as otherwise the communication with
🤖 Fix all issues with AI agents
In `@otel-collector-config.yaml`:
- Around line 36-41: The config currently sets otlphttp/openobserve
tls.insecure: true which disables TLS verification; change it so tls.insecure is
driven by an environment variable (e.g., OPENOBSERVE_TLS_INSECURE) with a secure
default of false, update the otlphttp/openobserve block to read the env var for
tls.insecure (falling back to false) and document that production should not set
it to true; ensure references to endpoint and headers remain unchanged and
validate the env parsing follows your config templating conventions.
🧹 Nitpick comments (2)
example.env (1)
112-120: Add a dev‑only warning for the default OpenObserve credentials.Line 112‑119: the defaults are handy for local use but can be unintentionally reused outside dev. A short warning keeps expectations clear and reminds that
OPENOBSERVE_AUTH_HEADERmust match the root creds.Suggested change
-# OpenObserve +# OpenObserve (dev-only defaults — change for production) ZO_ROOT_USER_EMAIL="admin@localhost.com" ZO_ROOT_USER_PASSWORD="admin" OPENOBSERVE_DEV_PORT=5080 # OpenTelemetry Configuration OPENOBSERVE_ENDPOINT=http://openobserve.local:5080/api/default +# OPENOBSERVE_AUTH_HEADER should match ZO_ROOT_USER_* (Basic base64(user:password)) OPENOBSERVE_AUTH_HEADER="Basic YWRtaW5AbG9jYWxob3N0LmNvbTphZG1pbg=="docker-compose.base.yml (1)
63-71: Pin the OpenObserve image to a specific version instead of usinglatest.Using
latestmakes deployments non-reproducible and risks unintended breaking changes. OpenObserve's documentation recommends pinning to a fixed release version (e.g.,v0.50.2) in production environments. Use an environment variable for the version tag and document it inexample.envto maintain reproducibility across environments.Suggested change
- openobserve: - image: public.ecr.aws/zinclabs/openobserve:latest + openobserve: + image: public.ecr.aws/zinclabs/openobserve:${OPENOBSERVE_IMAGE_TAG:?}
| otlphttp/openobserve: | ||
| endpoint: ${env:OPENOBSERVE_ENDPOINT} | ||
| headers: | ||
| Authorization: ${env:OPENOBSERVE_AUTH_HEADER} | ||
| tls: | ||
| insecure: true |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
OpenTelemetry Collector otlphttp exporter tls.insecure configuration environment variables
💡 Result:
OpenTelemetry Collector doesn’t have a special “environment variable name” that directly maps to exporters.otlphttp.tls.insecure. Instead, you set it in Collector YAML and (optionally) substitute from an environment variable.
1) Use env-var substitution inside config.yaml
Collector configs support ${env:VAR} expansion. [1]
exporters:
otlphttp:
endpoint: ${env:OTLPHTTP_ENDPOINT}
tls:
insecure: ${env:OTLPHTTP_TLS_INSECURE}Example shell:
export OTLPHTTP_ENDPOINT="https://otel-gateway.internal:4318"
export OTLPHTTP_TLS_INSECURE="true"
otelcol --config=/etc/otelcol/config.yamltls.insecure: true is a standard Collector TLS client setting (“disable TLS verification”). [1][2]
2) (Alternative) Put the whole config in an env var and load it
The Collector can read config from an environment variable via --config=env:.... [1]
Important note (common confusion)
OTEL_EXPORTER_OTLP_INSECURE is an SDK OTLP exporter environment variable, and the spec notes “insecure” applies to OTLP/gRPC behavior; OTLP/HTTP uses the URL scheme (http vs https) for transport security decisions. So don’t expect OTEL_EXPORTER_OTLP_INSECURE to configure the Collector’s otlphttp exporter TLS. [3]
Sources: Collector config + env expansion + TLS options [1], Collector TLS config package docs [2], OTLP exporter spec (insecure env var semantics) [3].
🌐 Web query:
OpenTelemetry Collector configuration environment variable substitution syntax
💡 Result:
OpenTelemetry Collector config supports environment variable substitution using ${...}:
-
Basic
${env:VAR_NAME}(Collector docs example) [1]${VAR_NAME}is also considered valid by the OTel configuration data model spec (theenv:prefix is optional) [2]
-
Default value (bash-style)
${env:VAR_NAME:-default}(ifVAR_NAMEis unset/empty) [1][2]
-
Escaping a literal dollar sign
- Use
$$to render a literal$(no substitution) [1]
- Use
Notes:
- Substitution applies to scalar values (not YAML mapping keys) [2]
- If a variable is unset and no default is provided, it’s replaced with an empty value [2]
Example
processors:
attributes/example:
actions:
- key: ${env:DB_KEY:-mydefault}
action: ${env:OPERATION:-}
exporters:
prometheus:
namespace: $$DataVisualization[1]
Sources: OpenTelemetry Collector configuration docs [1], OTel configuration data model spec (env substitution ABNF and rules) [2].
🌐 Web query:
OpenTelemetry Collector production TLS settings security recommendations
💡 Result:
Production TLS security recommendations for the OpenTelemetry Collector
-
Enable TLS everywhere (receiver + exporter)
- Configure receiver-side TLS with
cert_file+key_fileto encrypt inbound telemetry. [1] - Configure exporter-side TLS with
ca_file(and optionally clientcert_file+key_file). [1]
- Configure receiver-side TLS with
-
Prefer mTLS for collector ingress (stronger authentication)
- On receivers, require client certs with
client_ca_file(this enforces “RequireAndVerifyClientCert”). [1][2]
- On receivers, require client certs with
-
Do not use insecure TLS flags in production
- Keep
insecure: false(don’t disable TLS). [1][2] - Keep
insecure_skip_verify: false(don’t skip cert verification). [1][2]
- Keep
-
Enforce modern TLS versions
- Set
min_versionto"1.2"at minimum, and prefer"1.3"where possible. TLS 1.0/1.1 are deprecated and should be avoided. [2][3]
- Set
-
Use safe cipher suites (mainly relevant to TLS 1.2)
- If you must pin suites, use
cipher_suites; otherwise the Collector uses a safe default list. [2][3]
- If you must pin suites, use
-
Plan certificate rotation
- Use
reload_intervalso the Collector reloads certificates without a restart (helps with automated rotation). [1][2]
- Use
-
Protect private keys and config secrets
- Treat TLS private keys and config as sensitive; store them securely (encrypted storage/secret store) and consider environment-variable expansion for sensitive values. [4]
Minimal hardened examples
OTLP receiver (gRPC) with mTLS
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
tls:
cert_file: /etc/otel/server.crt
key_file: /etc/otel/server.key
client_ca_file: /etc/otel/clients_ca.crt
min_version: "1.2"
reload_interval: 1hBased on the Collector TLS/mTLS options. [1][2]
OTLP exporter (gRPC) with server verification (and optional client cert)
exporters:
otlp:
endpoint: upstream-collector:4317
tls:
ca_file: /etc/otel/upstream_ca.crt
min_version: "1.2"
# optional for mTLS:
# cert_file: /etc/otel/client.crt
# key_file: /etc/otel/client.keyClient-side TLS configuration. [1][2]
Sources: OpenTelemetry Collector TLS configuration docs [1], Collector configtls reference (all TLS knobs, defaults, and version guidance) [2], Splunk OTel Collector TLS settings (same upstream options, includes TLS version/cipher guidance) [3], OpenTelemetry Collector config security best practices [4].
Avoid shipping tls.insecure: true as the default.
Lines 40–41: This disables TLS verification and is unsafe for production. The OpenTelemetry Collector recommends keeping insecure: false for all production deployments. Gate this setting by environment variable with a secure default:
Suggested change
otlphttp/openobserve:
endpoint: ${env:OPENOBSERVE_ENDPOINT}
headers:
Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
tls:
- insecure: true
+ insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| otlphttp/openobserve: | |
| endpoint: ${env:OPENOBSERVE_ENDPOINT} | |
| headers: | |
| Authorization: ${env:OPENOBSERVE_AUTH_HEADER} | |
| tls: | |
| insecure: true | |
| otlphttp/openobserve: | |
| endpoint: ${env:OPENOBSERVE_ENDPOINT} | |
| headers: | |
| Authorization: ${env:OPENOBSERVE_AUTH_HEADER} | |
| tls: | |
| insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false} |
🤖 Prompt for AI Agents
In `@otel-collector-config.yaml` around lines 36 - 41, The config currently sets
otlphttp/openobserve tls.insecure: true which disables TLS verification; change
it so tls.insecure is driven by an environment variable (e.g.,
OPENOBSERVE_TLS_INSECURE) with a secure default of false, update the
otlphttp/openobserve block to read the env var for tls.insecure (falling back to
false) and document that production should not set it to true; ensure references
to endpoint and headers remain unchanged and validate the env parsing follows
your config templating conventions.
PR Review: OpenTelemetry IntegrationReviewed this PR - overall solid implementation. Details below: Strengths
Issues1. Security: Hardcoded Credentials (example.env:113-119)
2. Missing Health Checks (docker-compose.base.yml:63-84)
3. Missing Dependencies (docker-compose.base.yml:42-55)
4. Default OTEL_EXPORTER_OTLP_ENDPOINT (docker-compose.base.yml:34)
5. No Resource Limits (docker-compose.base.yml:63-84)
6. Collector Config (otel-collector-config.yaml:11-14)
7. Missing Documentation
Testing
Dependencies
Must Fix Before Merge
Should Fix
ApprovalAlmost ready after Must Fix items addressed. Core implementation follows OpenTelemetry best practices. Great work! |
Code Review: OpenTelemetry IntegrationThanks for implementing OpenTelemetry observability in RADIS! This is a solid foundation for monitoring and tracing. Here's my detailed review: ✅ Strengths
🔍 Issues & Concerns1. Dependency on unreleased branch (Critical)The PR depends on adit-radis-shared@openobserve branch instead of a released version. This creates risks:
Recommendation: Wait for adit-radis-shared#152 to be merged and released, then update to a tagged version 2. Security: Hardcoded credentials in example.env (High)The base64-encoded auth header contains default credentials (admin@localhost.com:admin). While this is in example.env, developers might copy these to production. Recommendations:
3. Always-on services increase resource usage (Medium)OpenObserve and otel-collector services are always started in base config, even when telemetry isn't needed. This adds significant memory overhead for development. Recommendation: Move these services to a Docker profile so they only start when needed. 4. Missing health check configuration (Low)The otel-collector has a health check endpoint at :13133 but no Docker health check is configured. 📊 Performance Considerations
🧪 Test CoverageMissing: No tests for telemetry integration. Consider adding unit tests for graceful degradation when OTEL_EXPORTER_OTLP_ENDPOINT is unset and integration tests verifying telemetry initializes without errors. 📝 Documentation Gaps
🎯 Overall AssessmentThis is a well-implemented feature with good architectural decisions. The main blocker is the dependency on an unreleased branch. Once that's resolved and documentation is added, this will be ready to merge. The integration follows OpenTelemetry best practices and the collector configuration is production-ready. Great work! |
Summary
manage.pyandradis/asgi.pybefore Django loadsradis/settings/base.pywhen telemetry is activeopenobservebranch (includes telemetry module)Dependencies
Test plan
uv syncresolves all dependencies correctlyOTEL_EXPORTER_OTLP_ENDPOINTis not setOTEL_EXPORTER_OTLP_ENDPOINTis set🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.