feature: add metrics support for NamingServer #7882

contrueCT · 2025-12-20T12:57:51Z

I have read the CONTRIBUTING.md guidelines.
I have registered the PR changes.

Ⅰ. Describe what this PR did

Add metrics for NamingServer:

seata_namingserver_cluster_node_count (Gauge)
seata_namingserver_watcher_count (Gauge)
seata_namingserver_cluster_change_push_total (Counter)
JVM and HTTP-related metrics implemented via Micrometer and Spring Boot.

Ⅱ. Does this pull request fix one issue?

fixes #7852

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

codecov · 2025-12-20T13:40:35Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.62%. Comparing base (6160c7d) to head (1307fdc).

Additional details and impacted files

@@             Coverage Diff              @@
##                2.x    #7882      +/-   ##
============================================
+ Coverage     71.58%   71.62%   +0.03%     
  Complexity      883      883              
============================================
  Files          1294     1294              
  Lines         49554    49554              
  Branches       5884     5884              
============================================
+ Hits          35475    35493      +18     
+ Misses        11155    11142      -13     
+ Partials       2924     2919       -5

see 12 files with indirect coverage changes

funky-eyes

Additionally, I suggest adding monitoring metrics for the response time (avg, p50, p99, p999) and QPS of each interface. You can implement this in the filter.

...ngserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerMetricsManager.java

namingserver/src/main/java/org/apache/seata/namingserver/manager/NamingManager.java

funky-eyes · 2025-12-23T01:14:28Z

...rver/src/main/java/org/apache/seata/namingserver/metrics/PrometheusNamingMetricsManager.java

It seems we're missing some common tags, such as the host of the current Naming Server. For example, this would make it easier to query metrics from specific Naming Server instances.

Copilot

Pull request overview

This PR adds comprehensive Prometheus metrics support to the NamingServer module. The implementation enables monitoring of critical server operations through Micrometer and Spring Boot Actuator.

Key Changes:

Introduces three core metrics: cluster node count (Gauge), watcher count (Gauge), and cluster change push notifications (Counter)
Adds configurable metrics support with a no-op implementation for when metrics are disabled
Integrates metrics collection into existing server operations for real-time monitoring

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `namingserver/pom.xml` | Adds Spring Boot Actuator and Micrometer Prometheus registry dependencies |
| `namingserver/src/main/resources/application.yml` | Configures metrics endpoint exposure and percentile distribution settings |
| `namingserver/src/test/resources/application.yml` | Configures test environment metrics settings |
| `namingserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerMetricsManager.java` | Defines the metrics manager interface with metric names and tag constants |
| `namingserver/src/main/java/org/apache/seata/namingserver/metrics/PrometheusNamingMetricsManager.java` | Implements Prometheus metrics collection using MultiGauge and Counter |
| `namingserver/src/main/java/org/apache/seata/namingserver/metrics/NoOpNamingMetricsManager.java` | Provides no-op implementation when metrics are disabled |
| `namingserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerTagsContributor.java` | Adds custom tags to HTTP request metrics for business dimensions |
| `namingserver/src/main/java/org/apache/seata/namingserver/manager/NamingManager.java` | Integrates cluster node count metrics refresh on instance registration/unregistration |
| `namingserver/src/main/java/org/apache/seata/namingserver/manager/ClusterWatcherManager.java` | Integrates watcher count metrics and cluster change push counter |
| `namingserver/src/test/java/org/apache/seata/namingserver/NamingServerMetricsManagerTest.java` | Adds comprehensive unit tests for metrics manager functionality |
| `namingserver/src/test/java/org/apache/seata/namingserver/ClusterWatcherManagerTest.java` | Updates tests to inject NoOpNamingMetricsManager |
| `changes/en-us/2.x.md` | Documents the feature addition in English changelog |
| `changes/zh-cn/2.x.md` | Documents the feature addition in Chinese changelog |
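
The split between a metrics interface, a real implementation, and a no-op implementation can be illustrated with a minimal JDK-only sketch. All names here (`PushMetrics`, `NoOpPushMetrics`, `InMemoryPushMetrics`, `create`) are hypothetical and are not taken from the PR; the in-memory class only stands in for a registry-backed implementation such as the Prometheus one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical names throughout; this is not the PR's actual interface.
class MetricsToggleSketch {

    /** What a caller such as a watcher manager would depend on. */
    interface PushMetrics {
        void incrementPush(String vgroup);

        long pushCount(String vgroup);
    }

    /** Used when metrics are disabled: every call does nothing. */
    static final class NoOpPushMetrics implements PushMetrics {
        @Override
        public void incrementPush(String vgroup) {
            // intentionally empty
        }

        @Override
        public long pushCount(String vgroup) {
            return 0L;
        }
    }

    /** Stand-in for a registry-backed implementation: one counter per vgroup. */
    static final class InMemoryPushMetrics implements PushMetrics {
        private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

        @Override
        public void incrementPush(String vgroup) {
            counters.computeIfAbsent(vgroup, k -> new LongAdder()).increment();
        }

        @Override
        public long pushCount(String vgroup) {
            LongAdder adder = counters.get(vgroup);
            return adder == null ? 0L : adder.sum();
        }
    }

    /** Chooses the implementation once, so call sites never check a flag. */
    static PushMetrics create(boolean metricsEnabled) {
        return metricsEnabled ? new InMemoryPushMetrics() : new NoOpPushMetrics();
    }

    public static void main(String[] args) {
        PushMetrics enabled = create(true);
        enabled.incrementPush("default_tx_group");
        enabled.incrementPush("default_tx_group");

        PushMetrics disabled = create(false);
        disabled.incrementPush("default_tx_group");

        System.out.println(enabled.pushCount("default_tx_group"));  // prints 2
        System.out.println(disabled.pushCount("default_tx_group")); // prints 0
    }
}
```

With this shape, tests can inject the no-op variant without starting any metrics infrastructure, which is the role the table assigns to NoOpNamingMetricsManager in ClusterWatcherManagerTest.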

Comments suppressed due to low confidence (1)

namingserver/src/main/java/org/apache/seata/namingserver/manager/ClusterWatcherManager.java:78

The scheduled task re-registers watchers within its execution which calls registryWatcher, and registryWatcher immediately triggers refreshWatcherCountMetrics. This creates a potential for frequent metrics refreshes every second as the scheduled task runs. This could cause unnecessary overhead when combined with the refresh on line 102.

        scheduledThreadPoolExecutor.scheduleAtFixedRate(
                () -> {
                    for (String group : WATCHERS.keySet()) {
                        Optional.ofNullable(WATCHERS.remove(group))
                                .ifPresent(watchers -> watchers.parallelStream().forEach(watcher -> {
                                    if (System.currentTimeMillis() >= watcher.getTimeout()) {
                                        notify(watcher, HttpStatus.NOT_MODIFIED.value());
                                    }
                                    if (!watcher.isDone()) {
                                        // Re-register
                                        registryWatcher(watcher);

Copilot · 2025-12-23T09:33:55Z

...gserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerTagsContributor.java

+        // Add namespace tag
+        String namespace = context.getCarrier().getParameter(TAG_NAMESPACE);
+        keyValues = keyValues.and(KeyValue.of(TAG_NAMESPACE, namespace != null ? namespace : UNKNOWN));
+
+        // Add cluster tag
+        String cluster = context.getCarrier().getParameter(TAG_CLUSTER);
+        if (cluster == null) {
+            cluster = context.getCarrier().getParameter("clusterName");
+        }
+        keyValues = keyValues.and(KeyValue.of(TAG_CLUSTER, cluster != null ? cluster : UNKNOWN));
+
+        // Add vgroup tag
+        String vgroup = context.getCarrier().getParameter(TAG_VGROUP);
+        if (vgroup == null) {
+            vgroup = context.getCarrier().getParameter("group");
+        }
+        keyValues = keyValues.and(KeyValue.of(TAG_VGROUP, vgroup != null ? vgroup : UNKNOWN));


Adding tags with value "unknown" for every HTTP request will create high cardinality metrics. When these parameters are not present in requests, it would be better to either not add the tag at all, or use a constant shared "unknown" value. This could lead to performance issues and excessive memory usage in the metrics registry as each unique combination of tag values creates a separate time series. Consider only adding tags when they have meaningful values.

The "unknown" value is a fixed string constant, not a dynamic value that changes per request. This won't cause cardinality explosion. The namespace, cluster, and vgroup parameters are also low-cardinality by nature in naming server scenarios (limited to dozens at most). Using "unknown" as a fallback is a standard practice that helps operators identify requests with missing parameters.
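
The cardinality argument in the reply can be checked with a small JDK-only sketch. It assumes, as a simplification, that one time series exists per distinct (namespace, cluster, vgroup) combination; `tagValue` and `seriesCount` are hypothetical helpers, not code from the PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper names; a simplified model of tag cardinality.
class TagCardinalitySketch {
    static final String UNKNOWN = "unknown";

    /** Every missing parameter collapses to the same constant tag value. */
    static String tagValue(String raw) {
        return raw != null ? raw : UNKNOWN;
    }

    /** Counts distinct (namespace, cluster, vgroup) combinations across requests. */
    static int seriesCount(String[][] requests) {
        Set<List<String>> series = new HashSet<>();
        for (String[] r : requests) {
            series.add(Arrays.asList(tagValue(r[0]), tagValue(r[1]), tagValue(r[2])));
        }
        return series.size();
    }

    public static void main(String[] args) {
        String[][] requests = {
            {"public", "default", "tx_group_a"},
            {null, null, null},
            {null, null, null},
            {"public", "default", "tx_group_a"},
            {"public", null, "tx_group_b"},
        };
        // Repeated requests with missing parameters share one series.
        System.out.println(seriesCount(requests)); // prints 3
    }
}
```

Under this model the fallback adds at most one extra value per tag, so the series count is bounded by the number of real tag values rather than by request volume.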

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

contrueCT · 2026-01-05T05:44:11Z

> It seems we're missing some common tags, such as the host of the current Naming Server. For example, this would make it easier to query metrics from specific Naming Server instances.

> Additionally, I suggest adding monitoring metrics for the response time (avg, p50, p99, p999) and QPS of each interface. You can implement this in the filter.

Both are implemented. Response time metrics (P50/P90/P95/P99/P999) come from Spring Boot Actuator's http.server.requests, with custom business tags added via NamingServerTagsContributor. A common `application` tag is configured in application.yml.

funky-eyes · 2026-01-07T06:09:02Z

...gserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerTagsContributor.java

+
+    @Override
+    public KeyValues getLowCardinalityKeyValues(ServerRequestObservationContext context) {
+        KeyValues keyValues = super.getLowCardinalityKeyValues(context);


Is this only applicable to requests that include a namespace and vgroup? If so, how should the tags be added in the PrometheusNamingMetricsManager class? For example, when a vgroup under a specific namespace's cluster performs a watch operation, or when fetching cluster information.

funky-eyes · 2026-01-07T06:10:57Z

namingserver/src/main/java/org/apache/seata/namingserver/manager/ClusterWatcherManager.java

+                watchers.parallelStream().forEach(this::notify);
+                // Increment cluster change push counter
+                if (!watchers.isEmpty()) {
+                    metricsManager.incrementClusterChangePushCount(event.getGroup());


Here, only the "group" is included. Is it possible to also include the namespace and cluster information as tags?
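
One possible shape for this suggestion, sketched with the JDK only: the counter is keyed by all three dimensions instead of the group alone. The class and method names are hypothetical, and a real implementation would register a tagged counter in the metrics registry rather than keep a map.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: a push counter keyed by (namespace, cluster, vgroup).
class TaggedPushCounterSketch {
    private final Map<List<String>, LongAdder> counters = new ConcurrentHashMap<>();

    /** Increments the counter for one tag combination, creating it on first use. */
    void increment(String namespace, String cluster, String vgroup) {
        List<String> key = Arrays.asList(namespace, cluster, vgroup);
        counters.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    /** Returns the current count for one tag combination, or 0 if never pushed. */
    long count(String namespace, String cluster, String vgroup) {
        LongAdder adder = counters.get(Arrays.asList(namespace, cluster, vgroup));
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        TaggedPushCounterSketch counter = new TaggedPushCounterSketch();
        counter.increment("public", "cluster_a", "tx_group");
        counter.increment("public", "cluster_a", "tx_group");
        counter.increment("public", "cluster_b", "tx_group");

        // The same vgroup is now counted separately per cluster.
        System.out.println(counter.count("public", "cluster_a", "tx_group")); // prints 2
        System.out.println(counter.count("public", "cluster_b", "tx_group")); // prints 1
    }
}
```

This only works if the change event or the watcher carries the namespace and cluster at the point where the counter is incremented, which the snippet above does not show.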

funky-eyes · 2026-01-07T06:13:05Z

namingserver/src/main/resources/application.yml

+      percentiles:
+        http.server.requests: 0.5, 0.9, 0.95, 0.99, 0.999
+      percentiles-histogram:
+        http.server.requests: true


Suggested change

-      percentiles:
-        http.server.requests: 0.5, 0.9, 0.95, 0.99, 0.999
-      percentiles-histogram:
-        http.server.requests: true
+      percentiles:
+        http.server.requests: 0.5, 0.99, 0.999

I believe that for now, these three percentiles should suffice, and we should not enable percentiles-histogram, as it would result in excessive data being generated in Prometheus.
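
For readers unfamiliar with what p50, p99, and p999 mean, here is a minimal nearest-rank sketch in plain Java. It is an illustration only: Micrometer computes its published percentiles with its own approximation, not with this method, and `percentile` is a hypothetical helper.

```java
import java.util.Arrays;

// Hypothetical helper; illustrates the nearest-rank definition of a percentile.
class PercentileSketch {

    /** Returns the smallest sample such that at least fraction q of samples are at or below it. */
    static long percentile(long[] samples, double q) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Response times in milliseconds, with one slow outlier.
        long[] latenciesMs = {12, 15, 11, 14, 13, 16, 12, 900, 15, 14};
        for (double q : new double[] {0.5, 0.99, 0.999}) {
            System.out.println("p" + q + " = " + percentile(latenciesMs, q) + " ms");
        }
    }
}
```

The median stays near the typical latency while the high percentiles surface the outlier, which is why the reviewer keeps 0.5 alongside 0.99 and 0.999.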

contrueCT and others added 3 commits December 20, 2025 19:38

feat: add metrics support for NamingServer

b12a808

changes logs

ad274b1

Merge branch '2.x' into add-metrics-support-for-NamingServer

81e2bcd

funky-eyes changed the title ~~feat: add metrics support for NamingServer~~ feature: add metrics support for NamingServer Dec 22, 2025

funky-eyes reviewed Dec 22, 2025

...ngserver/src/main/java/org/apache/seata/namingserver/metrics/NamingServerMetricsManager.java

namingserver/src/main/java/org/apache/seata/namingserver/manager/NamingManager.java

funky-eyes and others added 8 commits December 22, 2025 10:43

Merge branch '2.x' into add-metrics-support-for-NamingServer

227dc9d

changes logs

6b00afb

Merge remote-tracking branch 'origin/add-metrics-support-for-NamingSe…

1f3f03b

…rver' into add-metrics-support-for-NamingServer

feat: abstract NamingServerMetricsManager interface and implement NoO…

789a10f

…pNamingMetricsManager and PrometheusNamingMetricsManager

feat: add metrics support for NamingServer with custom tags for HTTP …

a181f46

…requests

Merge branch '2.x' into add-metrics-support-for-NamingServer

e09964f

feat: add metrics support for NamingServer with custom tags for HTTP …

4999aa7

…requests

feat: add metrics support for NamingServer with custom tags for HTTP …

d93af2f

…requests

funky-eyes reviewed Dec 23, 2025

funky-eyes requested a review from Copilot December 23, 2025 09:25

Copilot started reviewing on behalf of funky-eyes December 23, 2025 09:26

Copilot AI reviewed Dec 23, 2025

funky-eyes and others added 5 commits December 27, 2025 20:52

Merge branch '2.x' into add-metrics-support-for-NamingServer

2dacccb

Update namingserver/src/main/resources/application.yml

5b59fcc

Co-authored-by: Copilot <[email protected]>

fix: correct application name syntax in metrics configuration

e0d5eb4

bugfix: Replace @SpringBootTest with @ExtendWith(MockitoExtension.cla…

140bc53

…ss) to prevent loading console module's WebSecurityConfig which causes ClassNotFoundException for WebSecurityConfigurerAdapter on Spring Security 6

Merge branch '2.x' into add-metrics-support-for-NamingServer

6d329a8

funky-eyes requested a review from Copilot January 5, 2026 03:34

Copilot started reviewing on behalf of funky-eyes January 5, 2026 03:34

Copilot AI reviewed Jan 5, 2026

docs:add JavaDoc for NamingServerMetricsManager.java and PrometheusNa…

1307fdc

…mingMetricsManager.java

funky-eyes reviewed Jan 7, 2026

2 participants