Add feature flag span enrichment system tests#6828

Draft
sameerank wants to merge 5 commits into main from sameerank/FFL-2201/feature-flag-apm-enrichment-tests

@sameerank
Contributor

@sameerank sameerank commented Apr 29, 2026

Motivation

Add system tests for feature flag event enrichment (Feature ID 551) that validate APM span chunk tags:

  • ffe_flags_enc: base64(delta varint encoded serial IDs) for all evaluated flags
  • ffe_subjects_enc: subject hash → serial IDs mapping (when doLog=true)
  • ffe_runtime_defaults: flag name → default value mapping (when flag not found in UFC)

Example span tags:

{
  "ffe_runtime_defaults": {
    "flag-not-found": "my-default-value"
  },
  "ffe_flags_enc": "ZAgUAg==",
  "ffe_subjects_enc": {
    "4208f8d8017ce252df51f51e8c5b558a": "ZA==",
    "8e526f09f0531278f8bd60f224adf60a": "bAg="
  }
}

In this example: "ZAgUAg==" decodes to serial IDs [100, 108, 128, 130]
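
To make the encoding concrete, here is a minimal sketch of the decode and encode steps. It assumes the varints are unsigned LEB128 (7 data bits per byte, high bit as continuation flag) and that IDs are sorted ascending before the deltas are taken; both assumptions reproduce the example above. The function names are illustrative and are not necessarily the ones in tests/parametric/test_ffe/utils.py.

```python
import base64


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128 varint (assumed format)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_serial_ids(serial_ids: list[int]) -> str:
    """Sort and dedupe the IDs, delta-encode them as varints, then base64 the bytes."""
    out = bytearray()
    previous = 0
    for serial_id in sorted(set(serial_ids)):
        out += encode_varint(serial_id - previous)
        previous = serial_id
    return base64.b64encode(bytes(out)).decode("ascii")


def decode_serial_ids(encoded: str) -> list[int]:
    """Reverse of encode_serial_ids: base64 -> varint deltas -> absolute IDs."""
    serial_ids = []
    current = 0  # running sum of deltas
    value = 0    # varint being assembled
    shift = 0
    for byte in base64.b64decode(encoded):
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation bit set, keep reading
        else:
            current += value
            serial_ids.append(current)
            value = 0
            shift = 0
    return serial_ids


print(decode_serial_ids("ZAgUAg=="))  # [100, 108, 128, 130]
```

"ZAgUAg==" is the base64 form of bytes [100, 8, 20, 2]; the running sum of those deltas gives [100, 108, 128, 130].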

JIRA: https://datadoghq.atlassian.net/browse/FFL-2201

Changes

New test files:

  • tests/parametric/test_ffe/test_span_enrichment.py - Span enrichment tests (17 tests)
  • tests/parametric/test_ffe/span-enrichment-flags.json - UFC fixture with serialId in splits
  • tests/parametric/test_ffe/utils.py - Delta varint encoding/decoding utilities + SHA256 hashing

Manifest updates:

  • Added missing_feature entries to language manifests (tests written before tracer implementation)

Why parametric only (no E2E)?

  • Span enrichment is internal tracer behavior, not weblog behavior
  • Parametric tests provide precise control for edge cases (limits, child span propagation)

⚠️ Important: Span Context for Span Enrichment Tests

The ffe/evaluate endpoint now accepts an optional span_id parameter.

This is required for span enrichment tests because:

  1. The parametric server's /trace/span/start creates spans but does NOT activate them in the tracer scope
  2. When /ffe/evaluate runs as a separate HTTP request, tracer.scope().active() returns null
  3. The span_id parameter tells the server to activate the specified span during flag evaluation

Usage pattern:

with test_library.dd_start_span(name="test-span", service="test-service") as span:
    test_library.ffe_evaluate(
        flag="my-flag",
        variation_type="BOOLEAN",
        default_value=False,
        targeting_key="user-123",
        attributes={},
        span_id=span.span_id,  # <-- Required for span enrichment
    )

🧪 Local Testing

# Point system-tests to your local dd-trace-js
echo "../dd-trace-js" > binaries/nodejs-load-from-local

# Run the tests
TEST_LIBRARY=nodejs ./run.sh PARAMETRIC tests/parametric/test_ffe/test_span_enrichment.py

# Cleanup when done
rm binaries/nodejs-load-from-local

Related PRs

Reviewer checklist

  • Anything but tests/ or manifests/ is modified? I have the approval from R&P team
  • A docker base image is modified?
    • the relevant build-XXX-image label is present
  • A scenario is added, removed or renamed?

@github-actions
Contributor

github-actions Bot commented Apr 29, 2026

CODEOWNERS have been resolved as:

tests/parametric/test_ffe/span-enrichment-flags.json                    @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
tests/parametric/test_ffe/test_span_enrichment.py                       @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
tests/parametric/test_ffe/utils.py                                      @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
manifests/cpp.yml                                                       @DataDog/dd-trace-cpp
manifests/dotnet.yml                                                    @DataDog/apm-dotnet @DataDog/asm-dotnet
manifests/golang.yml                                                    @DataDog/dd-trace-go-guild
manifests/java.yml                                                      @DataDog/asm-java @DataDog/apm-java
manifests/nodejs.yml                                                    @DataDog/dd-trace-js
manifests/php.yml                                                       @DataDog/apm-php @DataDog/asm-php
manifests/python.yml                                                    @DataDog/apm-python @DataDog/asm-python
manifests/ruby.yml                                                      @DataDog/ruby-guild @DataDog/asm-ruby
manifests/rust.yml                                                      @DataDog/apm-rust
tests/parametric/test_ffe/test_dynamic_evaluation.py                    @DataDog/feature-flagging-and-experimentation-sdk @DataDog/system-tests-core
utils/_features.py                                                      @DataDog/system-tests-core
utils/build/docker/nodejs/parametric/server.js                          @DataDog/dd-trace-js @DataDog/system-tests-core
utils/docker_fixtures/_test_clients/_test_client_parametric.py          @DataDog/system-tests-core

@sameerank sameerank force-pushed the sameerank/FFL-2201/feature-flag-apm-enrichment-tests branch 6 times, most recently from dfe6168 to 537d098 Compare April 29, 2026 16:09
Contributor Author

@sameerank sameerank Apr 29, 2026

File name enforced by CI rule

if file in ("utils.py", "conftest.py", "__init__.py"):

Comment on lines +6 to +7
3. Max serial ID limit (128)
4. Max subjects limit (25)
Contributor Author

I picked these numbers arbitrarily as placeholders so I could write the tests. What limits should we actually use?

@sameerank sameerank force-pushed the sameerank/FFL-2201/feature-flag-apm-enrichment-tests branch from 537d098 to 7370a3b Compare April 29, 2026 17:39
@datadog-datadog-prod-us1-2

datadog-datadog-prod-us1-2 Bot commented Apr 29, 2026

Tests

⚠️ Warnings

🧪 11 Tests failed

tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Child_Span_Propagation.test_child_span_flag_evaluation_propagates_to_root[library_env0, parametric-nodejs] from system_tests_suite
AssertionError: ffe_flags_enc not found in root span meta: ['_dd.tags.process', '_dd.p.tid', '_dd.p.dm', 'service', 'runtime-id', '_dd.rc.client_id', '_dd.integration', '_dd.base_service', 'language']
assert 'ffe_flags_enc' in {'_dd.base_service': 'nodejs', '_dd.integration': 'opentracing', '_dd.p.dm': '-0', '_dd.p.tid': '6a06c4b800000000', ...}

self = <tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Child_Span_Propagation object at 0x7fed6ecf9cd0>
test_agent = <utils.docker_fixtures._test_agent.TestAgentAPI object at 0x7fed6c174cb0>
test_library = <utils.docker_fixtures._test_clients._test_client_parametric.ParametricTestClientApi object at 0x7fed6c7e5c40>

    @parametrize("library_env", [{**DEFAULT_ENVVARS}])
    def test_child_span_flag_evaluation_propagates_to_root(
        self, test_agent: TestAgentAPI, test_library: APMLibrary
...
tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Default_Fallback.test_ffe_runtime_defaults_value_truncated_at_64_chars[library_env0, parametric-nodejs] from system_tests_suite
AssertionError: ffe_runtime_defaults not found in span meta: ['_dd.tags.process', '_dd.p.tid', '_dd.p.dm', 'service', 'runtime-id', '_dd.rc.client_id', '_dd.integration', '_dd.base_service', 'language']
assert 'ffe_runtime_defaults' in {'_dd.base_service': 'nodejs', '_dd.integration': 'opentracing', '_dd.p.dm': '-0', '_dd.p.tid': '6a06c4b900000000', ...}

self = <tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Default_Fallback object at 0x7f6d685e0f80>
test_agent = <utils.docker_fixtures._test_agent.TestAgentAPI object at 0x7f6d33b61d90>
test_library = <utils.docker_fixtures._test_clients._test_client_parametric.ParametricTestClientApi object at 0x7f6d33430860>

    @parametrize("library_env", [{**DEFAULT_ENVVARS}])
    def test_ffe_runtime_defaults_value_truncated_at_64_chars(
        self, test_agent: TestAgentAPI, test_library: APMLibrary
...
tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Default_Fallback.test_flag_not_found_adds_ffe_runtime_defaults_tag[library_env0, parametric-nodejs] from system_tests_suite
AssertionError: ffe_runtime_defaults not found in span meta: ['_dd.tags.process', '_dd.p.tid', '_dd.p.dm', 'service', 'runtime-id', '_dd.rc.client_id', '_dd.integration', '_dd.base_service', 'language']
assert 'ffe_runtime_defaults' in {'_dd.base_service': 'nodejs', '_dd.integration': 'opentracing', '_dd.p.dm': '-0', '_dd.p.tid': '6a06c4b800000000', ...}

self = <tests.parametric.test_ffe.test_span_enrichment.Test_Span_Enrichment_Default_Fallback object at 0x7ff13a0aa630>
test_agent = <utils.docker_fixtures._test_agent.TestAgentAPI object at 0x7ff139448c50>
test_library = <utils.docker_fixtures._test_clients._test_client_parametric.ParametricTestClientApi object at 0x7ff13973f560>

    @parametrize("library_env", [{**DEFAULT_ENVVARS}])
    def test_flag_not_found_adds_ffe_runtime_defaults_tag(
        self, test_agent: TestAgentAPI, test_library: APMLibrary
...
ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: bd54cb0

@sameerank sameerank force-pushed the sameerank/FFL-2201/feature-flag-apm-enrichment-tests branch 3 times, most recently from e63833a to 12da3a9 Compare May 5, 2026 00:06
Add parametric tests for feature flag APM span enrichment:

- Test_Span_Enrichment_Serial_IDs: verify multiple flag evaluations combine
  serial IDs correctly in feature_flags_encoded using delta varint encoding
- Test_Span_Enrichment_Child_Span_Propagation: verify flag evaluations in
  child spans propagate to root span
- Test_Span_Enrichment_Max_Serial_IDs: verify 128 serial ID limit enforcement
- Test_Span_Enrichment_Max_Subjects: verify 25 subject limit enforcement
- Test_Span_Enrichment_Default_Fallback: verify feature_flags tag with
  coded-default prefix when flag not found in UFC, including 64 char truncation
- Test_Span_Enrichment_Subjects: verify feature_flag_subjects_encoded behavior
  based on doLog flag, including SHA256 hashing of targeting keys
- Test_Span_Enrichment_Delta_Varint: unit tests for encoding utilities

Add utilities for delta varint encoding/decoding and SHA256 targeting key
hashing. Mark tests as missing_feature in all library manifests.
@sameerank sameerank force-pushed the sameerank/FFL-2201/feature-flag-apm-enrichment-tests branch from 12da3a9 to 2fa1e11 Compare May 5, 2026 00:09
sameerank added 4 commits May 7, 2026 15:34
- Add span_id parameter to ffe_evaluate endpoint to activate span
  during flag evaluation so SpanEnrichmentHook can find root span
- Update test_span_enrichment.py to pass span_id from dd_start_span
- Enable span enrichment tests for nodejs 6.0.0 in manifest
- Remove trailing space from coded-default prefix ("coded-default:" not "coded-default: ")
- Update max subjects limit from 25 to 10 per RFC
- Add test for multiple subjects tracked separately with SHA256 hashed keys
- Add test for single subject with multiple flags combining serial IDs
- Update module docstring with comprehensive test coverage list
Update span enrichment tests to use the renamed tag name and remove
the coded-default: prefix logic to match the RFC specification changes.

Changes:
- Rename ffe_defaults -> ffe_runtime_defaults throughout
- Remove CODED_DEFAULT_PREFIX constant and related assertions
- Update docstrings and comments to reflect new terminology
- Values are now stored directly without prefix