ARO-26057: add ADX Grafana datasource provisioning #4878

Open: swiencki wants to merge 12 commits into main from adx-grafana-datasource-upstream

swiencki (Collaborator) commented Apr 14, 2026

ARO-26057

What

Adds ADX/Kusto Grafana datasource provisioning through the geography rollout using the typed GrafanaDatasources action. The step forwards ADX desired state to grafanactl modify datasource reconcile, grants the Grafana managed identity Kusto Viewer access to ServiceLogs, and removes the prior rollout-time go run shell path.

Why

This follows the existing typed/prebuilt-binary pattern for Grafana datasource rollout while enabling SRE dashboards to query Kusto data from Grafana.

Testing

  • `GOWORK=/tmp/aro-hcp-adx-local.work go test ./pkg/pipeline`, run from `tooling/templatize`
  • git diff --check
  • Local validation against the paired ARO-Tools change

Special notes for your reviewer

Depends on Azure/ARO-Tools#228. Downstream sdp-pipelines rollout packaging is https://dev.azure.com/msazure/AzureRedHatOpenShift/_git/sdp-pipelines/pullrequest/15393796.

Required merge order: ARO-Tools first, then this ARO-HCP PR with the ARO-Tools pin bump, then sdp-pipelines with the ARO-Tools/ARO-HCP pin bumps and `make -C hcp/ update` run twice.

The Kusto role-assignment rename fixes a name collision between ingest and viewer grants. After first deployment, old grant-${guid(...)} Kusto role assignments may remain as harmless duplicates; ops can do a one-time cleanup if desired.
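
The collision described above can be illustrated with a small sketch. It assumes, hypothetically, that the old name was derived from inputs that did not distinguish the role, so an ingest grant and a viewer grant could resolve to the same name; the real `guid(...)` inputs are not shown in this PR. `stable_id`, `old_grant_name`, and `new_grant_name` are illustrative helpers, and `cksum` is only a stand-in for Bicep's `guid()`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for Bicep's guid(): a deterministic hash of its inputs.
stable_id() {
  printf '%s' "$*" | cksum | cut -d' ' -f1
}

# Assumed old scheme: the role ($3) is deliberately ignored, so two grants
# that differ only by role end up with the same name.
old_grant_name() {
  local principal="$1" database="$2"
  printf 'grant-%s\n' "$(stable_id "$principal" "$database")"
}

# Assumed new scheme: the role is part of the name and of the hashed inputs,
# so each grant gets its own deterministic name.
new_grant_name() {
  local principal="$1" database="$2" role="$3"
  printf 'grant-%s-%s\n' "$role" "$(stable_id "$principal" "$database" "$role")"
}
```

Because both schemes are deterministic, redeploying yields the same names each time; only the new scheme keeps the two roles apart, which is also why grants created under the old names can linger after the rename.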

Copilot AI review requested due to automatic review settings April 14, 2026 21:56
@openshift-ci openshift-ci Bot requested review from geoberle and janboll April 14, 2026 21:56
Copilot AI (Contributor) left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds ARO-HCP support to provision and clean up Azure Data Explorer (Kusto) data sources in the shared Managed Grafana workspace, integrated into the existing geography rollout flow.

Changes:

  • Extend Kusto IaC to optionally grant Grafana’s managed identity DB-level Viewer access and output the cluster URI.
  • Add geography pipeline steps to fetch global Grafana outputs and provision/update ADX Grafana datasources via Azure CLI.
  • Introduce configuration flags/schema + docs, and update the Kusto delete cleanup to remove the matching Grafana datasource.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| docs/monitoring.md | Documents ADX/Kusto datasource provisioning gates, naming, validation, and teardown behavior. |
| dev-infrastructure/templates/kusto.bicep | Threads optional Grafana resource ID into the Kusto module and exposes kustoUri output. |
| dev-infrastructure/modules/logs/kusto/main.bicep | Grants Grafana MI Viewer on ServiceLogs when provided and surfaces cluster URI output. |
| dev-infrastructure/modules/logs/kusto/cluster.bicep | Exposes Kusto cluster uri as an output. |
| dev-infrastructure/geography-pipeline.yaml | Adds global output step and a shell step to create/update ADX Grafana datasources. |
| dev-infrastructure/configurations/kusto.tmpl.bicepparam | Adds grafanaResourceId parameter placeholder for pipeline substitution. |
| dev-infrastructure/cleanup/delete.kusto.instance.sh | Extends cleanup to also delete the corresponding Grafana datasource. |
| dev-infrastructure/cleanup/delete.kusto.instance.pipeline.yaml | Wires Grafana resource ID from global outputs into the cleanup script run. |
| config/rendered/dev/swft/uksouth.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/rendered/dev/prow/westus3.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/rendered/dev/pers/westus3.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/rendered/dev/perf/westus3.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/rendered/dev/dev/westus3.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/rendered/dev/cspr/westus3.yaml | Adds default monitoring flags for ADX datasource provisioning (disabled). |
| config/config.yaml | Introduces new monitoring defaults for ADX provisioning flags. |
| config/config.schema.json | Defines schema/validation for the new monitoring configuration knobs. |


Comment thread dev-infrastructure/geography-pipeline.yaml Outdated
Comment thread dev-infrastructure/geography-pipeline.yaml Outdated
Comment thread dev-infrastructure/configurations/kusto.tmpl.bicepparam
Comment thread dev-infrastructure/cleanup/delete.kusto.instance.sh Outdated
@swiencki swiencki changed the title from "ARO-22279: add ADX Grafana datasource provisioning" to "ARO-26057: add ADX Grafana datasource provisioning" on Apr 15, 2026
janboll (Collaborator) commented Apr 16, 2026

/hold
Please use https://github.com/Azure/ARO-Tools/blob/main/tools/grafanactl/cmd/modify/cmd.go#L32 for modifications to Grafana. Ideally we should update Grafana ONCE in the pipeline, because running an update on the instance blocks it for several minutes.

swiencki (Collaborator, Author) commented

Thanks for the feedback, I'll move this to grafanactl in Azure/ARO-Tools.

On the "update once" concern: the existing modify datasource reconcile runs globally because UpdateGrafanaIntegrations is an ARM-level mutation that blocks the Grafana instance for several minutes. ADX datasources go through the Grafana REST API directly (POST/PUT /api/datasources), which is a lightweight config write with no ARM-level instance lock, so per-geography invocation is safe here. Is this good with you?
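
The create-or-update flow described above can be sketched as follows. This is a minimal illustration, not the grafanactl implementation: `datasource_request` is a hypothetical helper, and it assumes the caller has already looked up the datasource by name and passes the existing UID (or an empty string when none exists). The paths are the standard Grafana HTTP API datasource endpoints.

```shell
# Decide which Grafana HTTP API call reconciles one datasource.
# $1: UID of the existing datasource, or "" when none exists yet.
datasource_request() {
  local existing_uid="$1"
  if [ -z "$existing_uid" ]; then
    # Nothing to update: create the datasource.
    printf 'POST /api/datasources\n'
  else
    # Already present: update it in place by UID.
    printf 'PUT /api/datasources/uid/%s\n' "$existing_uid"
  fi
}
```

Either branch is a single request against the Grafana instance itself, which is the basis for the claim that no ARM-level update of the instance is involved.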

Copilot AI review requested due to automatic review settings April 28, 2026 18:45
@swiencki swiencki force-pushed the adx-grafana-datasource-upstream branch from 8d14303 to a80a076 Compare April 28, 2026 18:45
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.



Comment thread dev-infrastructure/geography-pipeline.yaml
Copilot AI review requested due to automatic review settings April 28, 2026 19:51
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated no new comments.



swiencki (Collaborator, Author) commented

/unhold

Updated to use grafanactl.

Comment on lines +112 to +117
go run ../tooling/grafanactl/main.go modify datasource reconcile-adx \
--grafana-resource-id "${GRAFANA_RESOURCE_ID}" \
--cluster-url "${ADX_CLUSTER_URL}" \
--default-database "${ADX_DATABASE}" \
--datasource-name "${DATASOURCE_NAME}" \
--data-consistency strongconsistency
Collaborator


This will not work in EV2, since we need a prebuilt binary there.

Collaborator Author


Agreed. This path has been reworked to use the typed GrafanaDatasources action instead of rollout-time go run; the downstream sdp-pipelines PR packages the prebuilt grafanactl binary and emits the modify datasource reconcile --adx-* command. Current ARO-HCP CI is still expected to fail until Azure/ARO-Tools#228 lands and this PR bumps the ARO-Tools pin.
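
The constraint raised in this thread can be sketched as a guard that resolves a prebuilt binary and never falls back to `go run`. `resolve_tool` is a hypothetical helper and not part of the PR; how the rollout environment stages the binary is assumed here to be a directory on `PATH`.

```shell
# Print the path of a prebuilt tool, or fail with a clear message.
# There is deliberately no `go run` fallback: the rollout environment is
# assumed to provide only prebuilt binaries.
resolve_tool() {
  local name="$1" path
  if path="$(command -v "$name")"; then
    printf '%s\n' "$path"
  else
    printf 'error: %s not found on PATH; package a prebuilt binary\n' "$name" >&2
    return 1
  fi
}
```

A step could call `resolve_tool grafanactl` before invoking the tool, so a missing binary surfaces as one explicit error rather than a failed compile at rollout time.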

Copilot AI review requested due to automatic review settings April 30, 2026 21:02
@swiencki swiencki force-pushed the adx-grafana-datasource-upstream branch from 5d069ae to c091e9c Compare April 30, 2026 21:02
openshift-ci Bot commented Apr 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: swiencki
Once this PR has been reviewed and has the lgtm label, please assign geoberle for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.



Comment thread tooling/templatize/pkg/pipeline/grafana_datasources_test.go
swiencki added a commit that referenced this pull request May 8, 2026
…peline

Per PR #4878 review. Drops the new GrafanaDatasources call from
geography-pipeline.yaml and extends the existing add-grafana-datasource
in region-pipeline.yaml with the ADX block. clusterUrl sources from
regional kusto-lookup output. AzureMonitor reconcile preserved via the
ARO-Tools default-true. Triple-AND gate keeps the three feature flags
agreeing by construction. No Go/Bicep/config/test changes.
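
The triple-AND gate mentioned in this commit message can be sketched as below. The flag values are passed positionally because the commit does not name the three flags; `all_enabled` and `adx_step_action` are hypothetical helpers. The point is only that the ADX block is emitted when all three flags are true, so a single disabled flag turns the whole step off.

```shell
# Return success only when every flag passed in is exactly "true".
all_enabled() {
  local flag
  for flag in "$@"; do
    [ "$flag" = "true" ] || return 1
  done
}

# Hypothetical wrapper: decide what the datasource step does for one region,
# given the three feature-flag values.
adx_step_action() {
  if all_enabled "$@"; then
    echo "reconcile ADX datasource"
  else
    echo "skip ADX datasource"
  fi
}
```

For example, `adx_step_action true true false` takes the skip branch, while `adx_step_action true true true` reconciles.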
Copilot AI review requested due to automatic review settings May 8, 2026 15:56
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Comment thread docs/monitoring.md
Comment thread docs/monitoring.md
@swiencki swiencki force-pushed the adx-grafana-datasource-upstream branch from ef9d6e8 to 656403e Compare May 8, 2026 16:09
openshift-ci Bot commented May 8, 2026

@swiencki: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/cspr | 8d14303 | link | true | /test cspr |
| ci/prow/verify | 656403e | link | true | /test verify |
| ci/prow/config-change-detection | 656403e | link | true | /test config-change-detection |
| ci/prow/lint | 656403e | link | true | /test lint |
| ci/prow/e2e-parallel | 656403e | link | true | /test e2e-parallel |
| ci/prow/image-updater-images | 656403e | link | true | /test image-updater-images |
| ci/prow/e2e-images | 656403e | link | true | /test e2e-images |
| ci/prow/test-unit | 656403e | link | true | /test test-unit |
| ci/prow/images | 656403e | link | true | /test images |


