Integrate OpenTelemetry from adit-radis-shared#191

Open
samuelvkwong wants to merge 6 commits into main from openobserve

Conversation

@samuelvkwong
Collaborator

@samuelvkwong samuelvkwong commented Jan 31, 2026

Summary

  • Integrate shared OpenTelemetry telemetry module from adit-radis-shared
  • Initialize telemetry in manage.py and radis/asgi.py before Django loads
  • Conditionally add OTel logging handler in radis/settings/base.py when telemetry is active
  • Update adit-radis-shared dependency to openobserve branch (includes telemetry module)

Dependencies

Test plan

  • Verify uv sync resolves all dependencies correctly
  • Verify RADIS starts without telemetry when OTEL_EXPORTER_OTLP_ENDPOINT is not set
  • Verify telemetry initializes when OTEL_EXPORTER_OTLP_ENDPOINT is set
  • Run existing tests to confirm no regressions

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Enabled OpenTelemetry tracing and logging across the app for improved monitoring.
    • Added local observability services and a collector so traces/metrics/logs can be captured during development.
    • Added environment variables to configure observability endpoints and credentials.
  • Chores

    • Updated shared dependency reference to integrate telemetry support.

✏️ Tip: You can customize this high-level summary in your review settings.

samuelvkwong and others added 2 commits January 31, 2026 00:21
Use the shared telemetry module from adit-radis-shared to set up
OpenTelemetry traces, metrics, and logs. Telemetry is initialized
in manage.py and asgi.py before Django loads, and the OTel logging
handler is added conditionally in settings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Jan 31, 2026

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

OpenTelemetry tracing and logging are initialized early: setup_opentelemetry() is called from manage.py and radis/asgi.py, a conditional OpenTelemetry logging handler is added in Django settings, the adit-radis-shared dependency is moved from the 0.19.1 tag to the openobserve branch, and Docker Compose services for an OTel Collector and OpenObserve (with collector config) are added.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Application entry points: manage.py, radis/asgi.py | Import and call setup_opentelemetry() before Django initialization to start the telemetry SDK for CLI and ASGI server startup. |
| Django settings: radis/settings/base.py | Add is_telemetry_active() check and call add_otel_logging_handler(LOGGING) when enabled to attach an OpenTelemetry logging handler. |
| Dependency update: pyproject.toml | Update the adit-radis-shared git ref from the 0.19.1 tag to the openobserve branch. |
| Local dev & ops compose: docker-compose.base.yml, docker-compose.dev.yml, example.env | Add openobserve and otel-collector services, OTLP-related env vars and example credentials/ports, and a new volume for OpenObserve data. |
| OTel Collector config: otel-collector-config.yaml | Add complete OTLP collector config: OTLP receiver (gRPC/HTTP), processors (batch, resource, logs filter), otlphttp exporter to OpenObserve (auth via env), health_check extension, and pipelines for logs/traces/metrics. |
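
The settings change listed above can be pictured with a minimal sketch. The real is_telemetry_active() and add_otel_logging_handler() live in adit-radis-shared and are not shown in this PR, so add_otel_logging_handler below is a hypothetical stand-in: the environment-variable check and the logging.NullHandler placeholder are assumptions for illustration, not the shared module's actual behavior.

```python
import logging
import logging.config
import os


def add_otel_logging_handler(logging_config):
    """Hypothetical stand-in: register an extra handler on the root logger."""
    # NullHandler is only a placeholder for whatever handler the shared module installs.
    logging_config.setdefault("handlers", {})["otel"] = {"class": "logging.NullHandler"}
    root = logging_config.setdefault("root", {})
    root.setdefault("handlers", []).append("otel")
    return logging_config


def build_logging(telemetry_active):
    """Build a Django-style LOGGING dict, adding the extra handler only when enabled."""
    config = {
        "version": 1,
        "disable_existing_loggers": False,
        "handlers": {"console": {"class": "logging.StreamHandler"}},
        "root": {"handlers": ["console"], "level": "INFO"},
    }
    if telemetry_active:
        add_otel_logging_handler(config)
    return config


# Assumption: telemetry counts as active when an OTLP endpoint is configured.
LOGGING = build_logging(bool(os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT")))
logging.config.dictConfig(LOGGING)  # validates the resulting configuration
```

Keeping the handler registration behind one boolean means the base logging setup is identical with and without telemetry, which is what makes the "starts without telemetry" case in the test plan cheap to verify.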

Sequence Diagram(s)

sequenceDiagram
  participant CLI as manage.py (CLI)
  participant ASGI as radis/asgi.py (Server)
  participant App as Django App
  participant OTEL as otel-collector
  participant OO as OpenObserve

  CLI->>CLI: setup_opentelemetry() (init SDK & exporters)
  CLI->>App: initialize_debugger() / run command

  ASGI->>ASGI: setup_opentelemetry() (init SDK & exporters)
  ASGI->>App: get_asgi_application()

  App->>OTEL: export traces/logs/metrics (OTLP)
  OTEL->>OO: otlphttp export (OPENOBSERVE_ENDPOINT + auth)
  OO-->>OTEL: ingest ack
  OTEL-->>App: health/status (optional)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰
I hopped in at startup, whiskers bright,
Wired traces and logs from morning to night,
Collectors hum, endpoints glow,
OpenObserve gathers the breadcrumbs I throw,
A rabbit's small observability delight. 🎩✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Integrate OpenTelemetry from adit-radis-shared' directly and accurately summarizes the primary change: integrating OpenTelemetry capabilities from the shared dependency across manage.py, asgi.py, settings, and docker-compose files. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@gemini-code-assist

Summary of Changes

Hello @samuelvkwong, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces robust observability capabilities by integrating OpenTelemetry into the application. It sets up the necessary infrastructure to collect telemetry data by initializing the tracing system at an early stage of application startup and configures logging to forward relevant data when telemetry is active. These changes are supported by updating the shared dependency module and incorporating new OpenTelemetry libraries, paving the way for better monitoring and debugging of the application's performance and behavior.

Highlights

  • OpenTelemetry Integration: The project now integrates OpenTelemetry for enhanced observability, allowing for the collection of traces, metrics, and logs.
  • Early Telemetry Initialization: OpenTelemetry is initialized early in the application's lifecycle, specifically in manage.py and radis/asgi.py, to ensure comprehensive tracing of all requests before Django fully loads.
  • Conditional Logging Handler: A new OpenTelemetry logging handler is conditionally added to the logging configuration in radis/settings/base.py, activating only when telemetry is enabled.
  • Dependency Updates: The adit-radis-shared dependency has been updated to reference the openobserve branch, and several new OpenTelemetry-related packages have been added to uv.lock to support the new telemetry features.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly integrates OpenTelemetry for tracing and logging by initializing it early in manage.py and radis/asgi.py. The logging configuration is also properly updated. My only concern is with the dependency on a git branch in pyproject.toml, which can lead to non-reproducible builds. I've suggested pinning to a specific commit hash to improve stability. Otherwise, the changes are solid.

 requires-python = ">=3.12,<4.0"
 dependencies = [
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@0.19.1",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",

high

Depending on a git branch (openobserve) can lead to unpredictable and non-reproducible builds, as the branch can be updated or deleted. It's much safer to pin to a specific commit hash or a tag.

Since the associated PR in adit-radis-shared is not yet merged, I recommend pinning to the specific commit hash from that branch for now. Once the shared library PR is merged and a new version is released, this should be updated to point to the new version tag.
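
As a rough illustration of the pinning advice, the sketch below checks whether a Git dependency string points at a full commit SHA instead of a movable ref. The helper names and the assumed "name @ git+URL@ref" shape are hypothetical and chosen for this example; they are not part of uv, pip, or this repository.

```python
import re

# A full Git SHA-1 commit hash is 40 lowercase hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")
# Assumed shape of a direct Git reference: "name @ git+URL@ref".
GIT_REF = re.compile(r"@\s*git\+\S+?@(?P<ref>[^\s#]+)\s*$")


def git_ref(requirement):
    """Return the ref that follows the repository URL, or None if there is none."""
    match = GIT_REF.search(requirement)
    return match.group("ref") if match else None


def is_pinned_to_commit(requirement):
    """True only when the requirement points at a full 40-character commit SHA."""
    ref = git_ref(requirement)
    return ref is not None and FULL_SHA.fullmatch(ref) is not None


BASE = "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git"
print(is_pinned_to_commit(f"{BASE}@openobserve"))
print(is_pinned_to_commit(f"{BASE}@fd2c783a389d2ea9c275b22a794a99f0fa2ba382"))
```

The check is deliberately strict: a version tag such as 0.19.1 is also reported as not pinned to a commit, even though the review treats release tags as acceptable. A check like this could run in CI to flag branch refs before merge.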

Suggested change
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@fd2c783a389d2ea9c275b22a794a99f0fa2ba382",


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@radis/asgi.py`:
- Around line 18-22: Remove the unused inline flake8 suppression comments by
deleting the trailing "# noqa: E402" on the import of setup_opentelemetry and
the import of get_asgi_application in radis/asgi.py; keep the calls/imports
(setup_opentelemetry() and get_asgi_application) unchanged, and only remove the
E402 suppressions (or enable E402 in config if you truly need to suppress it).
🧹 Nitpick comments (1)
pyproject.toml (1)

10-10: Prefer pinning adit-radis-shared to a commit/tag for reproducible builds.

A moving branch ref can introduce unreviewed changes into builds. Consider pinning to a commit SHA or a version tag once the dependency is stable.

♻️ Example pinning approach
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@<commit-sha>",

Comment on lines +18 to +22
from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402

setup_opentelemetry()

from django.core.asgi import get_asgi_application # noqa: E402

⚠️ Potential issue | 🟡 Minor

Remove unused # noqa: E402 directives (RUF100).

Ruff reports these as unused; drop them (or enable E402 if you truly need suppression).

🧹 Suggested cleanup
-from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
+from adit_radis_shared.telemetry import setup_opentelemetry
@@
-from django.core.asgi import get_asgi_application  # noqa: E402
+from django.core.asgi import get_asgi_application
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
+from adit_radis_shared.telemetry import setup_opentelemetry

 setup_opentelemetry()

-from django.core.asgi import get_asgi_application  # noqa: E402
+from django.core.asgi import get_asgi_application
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 18-18: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


[warning] 22-22: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
example.env (1)

124-125: ⚠️ Potential issue | 🟡 Minor

Fix typo in proxy note.

Line 124: “Malke” → “Make”.

Suggested change
-# Malke sure to use .local in NO_PROXY as otherwise the communication with
+# Make sure to use .local in NO_PROXY as otherwise the communication with
🤖 Fix all issues with AI agents
In `@otel-collector-config.yaml`:
- Around line 36-41: The config currently sets otlphttp/openobserve
tls.insecure: true, which disables TLS for the exporter connection; change it so tls.insecure is
driven by an environment variable (e.g., OPENOBSERVE_TLS_INSECURE) with a secure
default of false, update the otlphttp/openobserve block to read the env var for
tls.insecure (falling back to false) and document that production should not set
it to true; ensure references to endpoint and headers remain unchanged and
validate the env parsing follows your config templating conventions.
🧹 Nitpick comments (2)
example.env (1)

112-120: Add a dev‑only warning for the default OpenObserve credentials.

Line 112‑119: the defaults are handy for local use but can be unintentionally reused outside dev. A short warning keeps expectations clear and reminds that OPENOBSERVE_AUTH_HEADER must match the root creds.

Suggested change
-# OpenObserve
+# OpenObserve (dev-only defaults — change for production)
 ZO_ROOT_USER_EMAIL="admin@localhost.com"
 ZO_ROOT_USER_PASSWORD="admin"
 OPENOBSERVE_DEV_PORT=5080

 # OpenTelemetry Configuration
 OPENOBSERVE_ENDPOINT=http://openobserve.local:5080/api/default
+# OPENOBSERVE_AUTH_HEADER should match ZO_ROOT_USER_* (Basic base64(user:password))
 OPENOBSERVE_AUTH_HEADER="Basic YWRtaW5AbG9jYWxob3N0LmNvbTphZG1pbg=="
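
The relationship noted in the suggested comment, that the header is Basic followed by base64(user:password), can be checked with a short sketch. The function name is hypothetical; the credentials are the development defaults from example.env.

```python
import base64


def basic_auth_header(user, password):
    """Build an HTTP Basic Authorization header value from a user and password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# The development defaults reproduce the header value shown in example.env.
header = basic_auth_header("admin@localhost.com", "admin")
print(header)  # Basic YWRtaW5AbG9jYWxob3N0LmNvbTphZG1pbg==
```

Because base64 is an encoding and not encryption, anyone who can read the header can recover the password, which is why the header has to be regenerated whenever ZO_ROOT_USER_EMAIL or ZO_ROOT_USER_PASSWORD changes.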
docker-compose.base.yml (1)

63-71: Pin the OpenObserve image to a specific version instead of using latest.

Using latest makes deployments non-reproducible and risks unintended breaking changes. OpenObserve's documentation recommends pinning to a fixed release version (e.g., v0.50.2) in production environments. Use an environment variable for the version tag and document it in example.env to maintain reproducibility across environments.

Suggested change
-  openobserve:
-    image: public.ecr.aws/zinclabs/openobserve:latest
+  openobserve:
+    image: public.ecr.aws/zinclabs/openobserve:${OPENOBSERVE_IMAGE_TAG:?}

Comment on lines +36 to +41
  otlphttp/openobserve:
    endpoint: ${env:OPENOBSERVE_ENDPOINT}
    headers:
      Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
    tls:
      insecure: true

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

OpenTelemetry Collector otlphttp exporter tls.insecure configuration environment variables

💡 Result:

OpenTelemetry Collector doesn’t have a special “environment variable name” that directly maps to exporters.otlphttp.tls.insecure. Instead, you set it in Collector YAML and (optionally) substitute from an environment variable.

1) Use env-var substitution inside config.yaml

Collector configs support ${env:VAR} expansion. [1]

exporters:
  otlphttp:
    endpoint: ${env:OTLPHTTP_ENDPOINT}
    tls:
      insecure: ${env:OTLPHTTP_TLS_INSECURE}

Example shell:

export OTLPHTTP_ENDPOINT="https://otel-gateway.internal:4318"
export OTLPHTTP_TLS_INSECURE="true"
otelcol --config=/etc/otelcol/config.yaml

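tls.insecure: true is a standard Collector TLS client setting that disables client transport security (TLS) for the connection; skipping certificate verification is the separate insecure_skip_verify setting. [1][2]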

2) (Alternative) Put the whole config in an env var and load it

The Collector can read config from an environment variable via --config=env:.... [1]

Important note (common confusion)

OTEL_EXPORTER_OTLP_INSECURE is an SDK OTLP exporter environment variable, and the spec notes “insecure” applies to OTLP/gRPC behavior; OTLP/HTTP uses the URL scheme (http vs https) for transport security decisions. So don’t expect OTEL_EXPORTER_OTLP_INSECURE to configure the Collector’s otlphttp exporter TLS. [3]

Sources: Collector config + env expansion + TLS options [1], Collector TLS config package docs [2], OTLP exporter spec (insecure env var semantics) [3].


🌐 Web query:

OpenTelemetry Collector configuration environment variable substitution syntax

💡 Result:

OpenTelemetry Collector config supports environment variable substitution using ${...}:

  • Basic

    • ${env:VAR_NAME} (Collector docs example) [1]
    • ${VAR_NAME} is also considered valid by the OTel configuration data model spec (the env: prefix is optional) [2]
  • Default value (bash-style)

    • ${env:VAR_NAME:-default} (if VAR_NAME is unset/empty) [1][2]
  • Escaping a literal dollar sign

    • Use $$ to render a literal $ (no substitution) [1]

Notes:

  • Substitution applies to scalar values (not YAML mapping keys) [2]
  • If a variable is unset and no default is provided, it’s replaced with an empty value [2]

Example

processors:
  attributes/example:
    actions:
      - key: ${env:DB_KEY:-mydefault}
        action: ${env:OPERATION:-}
exporters:
  prometheus:
    namespace: $$DataVisualization

[1]

Sources: OpenTelemetry Collector configuration docs [1], OTel configuration data model spec (env substitution ABNF and rules) [2].
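
To make the substitution rules summarized above concrete, here is a small Python sketch that mimics them: ${env:VAR}, the optional env: prefix, the :-default fallback for unset or empty variables, and $$ as a literal dollar sign. It is an illustration of the rules as quoted here, not the Collector's own resolver, and the expand function is a hypothetical name.

```python
import os
import re

# Matches "$$" (escaped dollar) or "${VAR}", "${env:VAR}", "${env:VAR:-default}".
_PATTERN = re.compile(
    r"\$\$|\$\{(?:env:)?(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand(value, environ=None):
    """Expand environment references in a scalar config value."""
    environ = os.environ if environ is None else environ

    def replace(match):
        if match.group(0) == "$$":
            return "$"  # "$$" renders a literal dollar sign
        current = environ.get(match.group("name"), "")
        if current == "" and match.group("default") is not None:
            return match.group("default")  # unset or empty: use the default
        return current  # unset without a default becomes an empty value

    return _PATTERN.sub(replace, value)


print(expand("${env:OPENOBSERVE_TLS_INSECURE:-false}", {}))
print(expand("${env:OPENOBSERVE_TLS_INSECURE:-false}", {"OPENOBSERVE_TLS_INSECURE": "true"}))
```

With no variable set, the first call falls back to false, which is the secure-by-default behavior the review asks for.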


🌐 Web query:

OpenTelemetry Collector production TLS settings security recommendations

💡 Result:

Production TLS security recommendations for the OpenTelemetry Collector

  1. Enable TLS everywhere (receiver + exporter)

    • Configure receiver-side TLS with cert_file + key_file to encrypt inbound telemetry. [1]
    • Configure exporter-side TLS with ca_file (and optionally client cert_file + key_file). [1]
  2. Prefer mTLS for collector ingress (stronger authentication)

    • On receivers, require client certs with client_ca_file (this enforces “RequireAndVerifyClientCert”). [1][2]
  3. Do not use insecure TLS flags in production

    • Keep insecure: false (don’t disable TLS). [1][2]
    • Keep insecure_skip_verify: false (don’t skip cert verification). [1][2]
  4. Enforce modern TLS versions

    • Set min_version to "1.2" at minimum, and prefer "1.3" where possible. TLS 1.0/1.1 are deprecated and should be avoided. [2][3]
  5. Use safe cipher suites (mainly relevant to TLS 1.2)

    • If you must pin suites, use cipher_suites; otherwise the Collector uses a safe default list. [2][3]
  6. Plan certificate rotation

    • Use reload_interval so the Collector reloads certificates without a restart (helps with automated rotation). [1][2]
  7. Protect private keys and config secrets

    • Treat TLS private keys and config as sensitive; store them securely (encrypted storage/secret store) and consider environment-variable expansion for sensitive values. [4]

Minimal hardened examples

OTLP receiver (gRPC) with mTLS

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /etc/otel/server.crt
          key_file: /etc/otel/server.key
          client_ca_file: /etc/otel/clients_ca.crt
          min_version: "1.2"
          reload_interval: 1h

Based on the Collector TLS/mTLS options. [1][2]

OTLP exporter (gRPC) with server verification (and optional client cert)

exporters:
  otlp:
    endpoint: upstream-collector:4317
    tls:
      ca_file: /etc/otel/upstream_ca.crt
      min_version: "1.2"
      # optional for mTLS:
      # cert_file: /etc/otel/client.crt
      # key_file: /etc/otel/client.key

Client-side TLS configuration. [1][2]

Sources: OpenTelemetry Collector TLS configuration docs [1], Collector configtls reference (all TLS knobs, defaults, and version guidance) [2], Splunk OTel Collector TLS settings (same upstream options, includes TLS version/cipher guidance) [3], OpenTelemetry Collector config security best practices [4].


Avoid shipping tls.insecure: true as the default.

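Lines 40–41: insecure: true disables client transport security (TLS) for the exporter connection, which is unsafe for production; skipping certificate verification is the separate insecure_skip_verify setting. The OpenTelemetry Collector guidance quoted above recommends keeping insecure: false for all production deployments. Gate this setting by an environment variable with a secure default: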

Suggested change
   otlphttp/openobserve:
     endpoint: ${env:OPENOBSERVE_ENDPOINT}
     headers:
       Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
     tls:
-      insecure: true
+      insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
   otlphttp/openobserve:
     endpoint: ${env:OPENOBSERVE_ENDPOINT}
     headers:
       Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
     tls:
-      insecure: true
+      insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}

@claude

claude bot commented Feb 1, 2026

PR Review: OpenTelemetry Integration

Reviewed this PR - overall solid implementation. Details below:

Strengths

  1. Proper initialization timing (setup_opentelemetry before Django loads)
  2. Conditional activation via is_telemetry_active()
  3. Good infrastructure design (OpenObserve + OTel collector)
  4. Smart filtering of health/static logs

Issues

1. Security: Hardcoded Credentials (example.env:113-119)

  • Base64 admin creds in OPENOBSERVE_AUTH_HEADER
  • Add warning comments for production

2. Missing Health Checks (docker-compose.base.yml:63-84)

  • otel-collector depends on openobserve without health checks
  • Add service_healthy condition

3. Missing Dependencies (docker-compose.base.yml:42-55)

  • App services should depend on otel-collector
  • Add to web, default_worker, llm_worker

4. Default OTEL_EXPORTER_OTLP_ENDPOINT (docker-compose.base.yml:34)

  • Default may confuse production deployments
  • Consider empty default, set only in dev

5. No Resource Limits (docker-compose.base.yml:63-84)

  • Add memory/CPU limits guidance

6. Collector Config (otel-collector-config.yaml:11-14)

  • Add memory_limiter processor for high log volumes

7. Missing Documentation

  • Add observability section to CLAUDE.md
  • Document OpenObserve UI access and telemetry toggle

Testing

  • No tests for telemetry initialization or graceful degradation

Dependencies

  • pyproject.toml uses unmerged @openobserve branch
  • Update after adit-radis-shared#152 merges

Must Fix Before Merge

  1. Security warning in example.env
  2. Documentation in CLAUDE.md
  3. Note dependency update needed

Should Fix

  1. Health check dependencies
  2. memory_limiter processor
  3. App service dependencies

Approval

Almost ready after Must Fix items addressed. Core implementation follows OpenTelemetry best practices. Great work!

@claude

claude bot commented Feb 1, 2026

Code Review: OpenTelemetry Integration

Thanks for implementing OpenTelemetry observability in RADIS! This is a solid foundation for monitoring and tracing. Here's my detailed review:

✅ Strengths

  1. Proper initialization order: Correctly initializing telemetry before Django loads in both manage.py and asgi.py ensures all requests are instrumented
  2. Conditional activation: The is_telemetry_active() check allows graceful degradation when telemetry isn't configured
  3. Smart filtering: Excluding health checks and static files from logs reduces noise
  4. Comprehensive instrumentation: Includes Django, psycopg, and WSGI instrumentation
  5. Good separation of concerns: Telemetry logic is properly encapsulated in the shared library

🔍 Issues & Concerns

1. Dependency on unreleased branch (Critical)

The PR depends on adit-radis-shared@openobserve branch instead of a released version. This creates risks:

  • Unstable dependency that could change or disappear
  • CI/CD pipelines may fail if branch is force-pushed or deleted
  • Difficult for other developers to reproduce environments

Recommendation: Wait for adit-radis-shared#152 to be merged and released, then update to a tagged version

2. Security: Hardcoded credentials in example.env (High)

The base64-encoded auth header contains default credentials (admin@localhost.com:admin). While this is in example.env, developers might copy these to production.

Recommendations:

  • Add prominent comment warning these are development-only credentials
  • Consider using a placeholder or document password rotation best practices

3. Always-on services increase resource usage (Medium)

OpenObserve and otel-collector services are always started in base config, even when telemetry isn't needed. This adds significant memory overhead for development.

Recommendation: Move these services to a Docker profile so they only start when needed.

4. Missing health check configuration (Low)

The otel-collector has a health check endpoint at :13133 but no Docker health check is configured.

📊 Performance Considerations

  1. Batch processing is well-configured: 5s timeout and 1000 batch size are reasonable defaults
  2. Log filtering reduces overhead: Excluding health/static endpoints is good
  3. Consider adding trace sampling for high-traffic production environments
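
The sampling idea in the last item can be sketched as a deterministic, ratio-based decision on the trace ID. This is a self-contained illustration of the general technique, written for this note; it is not code from this PR or from the shared telemetry module, and the function name is hypothetical.

```python
TRACE_ID_LIMIT = 1 << 64  # decisions use the low 64 bits of a 128-bit trace ID


def should_sample(trace_id, ratio):
    """Keep a trace when the low 64 bits of its ID fall below ratio * 2**64."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = round(ratio * TRACE_ID_LIMIT)
    return (trace_id & (TRACE_ID_LIMIT - 1)) < bound


# The same trace ID always gets the same decision, so every service that
# sees the trace agrees on whether to keep it.
example_id = 0x0123456789ABCDEF0123456789ABCDEF
print(should_sample(example_id, 0.10), should_sample(example_id, 1.0))
```

Deciding from the trace ID instead of a random draw per span keeps traces complete: either all spans of a trace are exported or none are.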

🧪 Test Coverage

Missing: No tests for telemetry integration. Consider adding unit tests for graceful degradation when OTEL_EXPORTER_OTLP_ENDPOINT is unset and integration tests verifying telemetry initializes without errors.
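
A minimal sketch of what such a graceful-degradation test might look like follows. Both is_telemetry_active() and setup_opentelemetry() are hypothetical stand-ins here, since the real functions live in adit-radis-shared and their return values are not shown in this PR; the assumption is only that an unset OTEL_EXPORTER_OTLP_ENDPOINT means telemetry stays off.

```python
import os
import unittest
from unittest import mock


def is_telemetry_active():
    """Hypothetical stand-in: active only when an OTLP endpoint is configured."""
    return bool(os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT"))


def setup_opentelemetry():
    """Hypothetical stand-in: do nothing and report False when not configured."""
    if not is_telemetry_active():
        return False
    # A real implementation would configure exporters here.
    return True


class TelemetryToggleTests(unittest.TestCase):
    def test_disabled_when_endpoint_unset(self):
        # clear=True removes any endpoint inherited from the host environment.
        with mock.patch.dict(os.environ, {}, clear=True):
            self.assertFalse(setup_opentelemetry())

    def test_enabled_when_endpoint_set(self):
        env = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317"}
        with mock.patch.dict(os.environ, env, clear=True):
            self.assertTrue(setup_opentelemetry())
```

Saved in a test module, these cases can be run with python -m unittest; patching os.environ per test keeps the two scenarios from leaking into each other.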

📝 Documentation Gaps

  1. CLAUDE.md should document the new telemetry feature (how to enable/disable, access OpenObserve UI, troubleshooting)
  2. New environment variables missing from CLAUDE.md
  3. README should mention observability as a feature

🎯 Overall Assessment

This is a well-implemented feature with good architectural decisions. The main blocker is the dependency on an unreleased branch. Once that's resolved and documentation is added, this will be ready to merge.

The integration follows OpenTelemetry best practices and the collector configuration is production-ready. Great work!
