fix: reduce listening state log spam #7905

Open
samrusani wants to merge 7 commits into tari-project:development from samrusani:fix/reduce-listening-up-to-date-log-spam-6808
Conversation

@samrusani

@samrusani samrusani commented Jun 27, 2026

Contributor

Description

Reduces repeated listening-state chain metadata debug logs by aggregating peer metadata updates in the listening loop.

The listening state now collects peer chain metadata and emits a single DEBUG summary on a timer, and when leaving the listening state. Per-peer status details are still available at TRACE, so the useful signal remains without spamming debug logs. Sync decisions are unchanged.

Motivation and Context

Fixes #6808.

The previous debug log was emitted from determine_sync_mode for every peer metadata update while the node was up to date, which could create many identical debug entries.

How Has This Been Tested?

  • git diff --check
  • cargo test --locked --ignore-rust-version -p tari_core base_node::state_machine_service::states::listening --lib
  • cargo check --locked --ignore-rust-version -p tari_core

What process can a PR reviewer use to test or verify this change?

Run the focused tari_core unit tests above and inspect base_layer/core/src/base_node/state_machine_service/states/listening.rs to confirm peer chain metadata is aggregated into one timed DEBUG summary and cleared after each log.

Breaking Changes

  • None
  • Requires data directory on base node to be deleted
  • Requires hard fork
  • Other - Please specify

@samrusani samrusani requested a review from a team as a code owner June 27, 2026 19:16

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces deduplication for the chain status logs in the Listening state of the base node state machine, ensuring that status updates are logged at the Debug level only when there is a change, and at the Trace level otherwise. The reviewer identified a critical issue where tracking both local and network metadata in the cached state would cause log spam when connected to multiple peers with slightly different heights. To resolve this, the reviewer suggested tracking only the local chain tip metadata and provided code suggestions to update the state struct, helper methods, and unit tests accordingly.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread base_layer/core/src/base_node/state_machine_service/states/listening.rs Outdated
Comment on lines +167 to +175
fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata, network: &ChainMetadata) -> bool {
    let log_state = ChainStatusLogState::new(local, network);
    if self.last_chain_status_log == Some(log_state) {
        return false;
    }

    self.last_chain_status_log = Some(log_state);
    true
}

Contributor

high

Update should_log_chain_status_at_debug to only accept and track the local metadata.

Suggested change
- fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata, network: &ChainMetadata) -> bool {
-     let log_state = ChainStatusLogState::new(local, network);
-     if self.last_chain_status_log == Some(log_state) {
-         return false;
-     }
-     self.last_chain_status_log = Some(log_state);
-     true
- }
+ fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata) -> bool {
+     let log_state = ChainStatusLogState::new(local);
+     if self.last_chain_status_log == Some(log_state) {
+         return false;
+     }
+     self.last_chain_status_log = Some(log_state);
+     true
+ }
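
To illustrate the effect of the suggestion, here is a minimal standalone sketch of deduplicating on the local tip only. `LocalTipLogState`, its fields, and the `Listening` struct shown here are simplified stand-ins assumed for illustration, not the actual tari_core types.

```rust
// Hypothetical stand-in for the local chain tip fields worth comparing.
#[derive(Clone, Debug, PartialEq, Eq)]
struct LocalTipLogState {
    height: u64,
    best_block_hash: [u8; 4], // shortened hash, for illustration only
}

// Simplified stand-in for the listening state; only the cached log state is modelled.
#[derive(Default)]
struct Listening {
    last_chain_status_log: Option<LocalTipLogState>,
}

impl Listening {
    /// Returns true only when the local tip differs from the one last logged at DEBUG.
    fn should_log_chain_status_at_debug(&mut self, local: &LocalTipLogState) -> bool {
        if self.last_chain_status_log.as_ref() == Some(local) {
            return false;
        }
        self.last_chain_status_log = Some(local.clone());
        true
    }
}

fn main() {
    let mut state = Listening::default();
    let tip = LocalTipLogState { height: 100, best_block_hash: [1, 2, 3, 4] };

    // First sighting of this tip is logged; repeats from other peers are not.
    assert!(state.should_log_chain_status_at_debug(&tip));
    assert!(!state.should_log_chain_status_at_debug(&tip));

    // A new local tip is logged again.
    let next = LocalTipLogState { height: 101, best_block_hash: [5, 6, 7, 8] };
    assert!(state.should_log_chain_status_at_debug(&next));
}
```

Because peer metadata is not part of the cached value, peers reporting slightly different heights no longer flip the comparison on every update.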

Comment thread base_layer/core/src/base_node/state_machine_service/states/listening.rs Outdated

@SWvheerden SWvheerden left a comment

Collaborator

This is a good attempt at solving this, but I think we can do better; Gemini has some good concerns.

I think we should rather do this:
Change it to time-based logging.
In the main loop:

 loop {
            let metadata_event = shared.metadata_event_stream.recv().await;

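we can change this to

loop {
    tokio::select! {
        // assuming `timer` is a tokio interval
        _ = timer.tick() => { /* log the collected results, then clear the list */ },
        metadata_event = shared.metadata_event_stream.recv() => { /* record the result */ },
    }
}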

As long as we are in sync with the network, we can create a log that looks like this:
We are in sync with the network (height, diff), we have lagging peers {(node id, height, diff)..}
or, if we are in sync with all peers:
We are in sync with the network with peers {peer id..}
or, if we are ahead of all peers:
We are ahead (height, diff) with lagging peers {(node id, height, diff)..}
If we are behind:
We are behind the network (height, diff) with peers: {(node id, height, diff)..}

Each time we log, we process all results received, log the result, and then clear the list.
If we leave the listening state, we do a log pass as well.
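
The collect, log, and clear cycle described above can be sketched without any dependencies. This is only an illustration under stated assumptions: it swaps `tokio::select!` for `std::sync::mpsc::Receiver::recv_timeout` so it runs on the standard library alone, and `PeerUpdate`, `listen`, `flush`, and the summary wording are hypothetical names, not the PR's code.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

// Hypothetical event: (peer node id, peer tip height).
type PeerUpdate = (String, u64);

/// Emits one summary for everything collected so far, then clears the collection.
fn flush(pending: &mut HashMap<String, u64>, local_height: u64, out: &mut Vec<String>) {
    if pending.is_empty() {
        return;
    }
    let lagging = pending.values().filter(|h| **h < local_height).count();
    out.push(format!(
        "local height {local_height}: {} peers reported, {lagging} lagging",
        pending.len()
    ));
    pending.clear();
}

/// Collects peer updates, summarising once per `interval` and once more on exit.
fn listen(rx: Receiver<PeerUpdate>, local_height: u64, interval: Duration) -> Vec<String> {
    let mut pending = HashMap::new();
    let mut summaries = Vec::new();
    let mut next_log = Instant::now() + interval;
    loop {
        let now = Instant::now();
        if now >= next_log {
            // Timer branch: log what was collected and start a new window.
            flush(&mut pending, local_height, &mut summaries);
            next_log = now + interval;
        }
        match rx.recv_timeout(next_log.saturating_duration_since(now)) {
            // Event branch: a later update from the same peer replaces the earlier one.
            Ok((node_id, height)) => {
                pending.insert(node_id, height);
            },
            Err(RecvTimeoutError::Timeout) => continue,
            // Leaving the loop: do one last log pass.
            Err(RecvTimeoutError::Disconnected) => {
                flush(&mut pending, local_height, &mut summaries);
                return summaries;
            },
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    for (id, height) in [("peer-a", 100), ("peer-b", 90), ("peer-b", 95)] {
        tx.send((id.to_string(), height)).unwrap();
    }
    drop(tx); // closing the channel stands in for leaving the listening state

    let summaries = listen(rx, 100, Duration::from_secs(60));
    assert_eq!(summaries, vec!["local height 100: 2 peers reported, 1 lagging"]);
}
```

Three updates from two peers collapse into a single summary line, which is the reduction in log volume the time-based approach is after.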

@samrusani samrusani force-pushed the fix/reduce-listening-up-to-date-log-spam-6808 branch 2 times, most recently from b57544b to 08b3c6d Compare June 29, 2026 11:36

@SWvheerden SWvheerden left a comment

Collaborator

This is looking good; just a few small changes, I think.

fn record(&mut self, local: &ChainMetadata, peer_metadata: &PeerChainMetadata) {
    self.local_tip = Some(ChainStatusTipLog::new(local));
    let peer_log = ChainStatusPeerLog::new(peer_metadata);
    if let Some(existing) = self.peers.iter_mut().find(|peer| peer.node_id == peer_log.node_id) {

Collaborator

Use a HashMap here; then you don't have to iterate over a Vec.
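
A minimal sketch of what that could look like, assuming the node id can serve as a `String` key. The type and method names echo the diff, but the field layout and the `record` signature here are simplified assumptions.

```rust
use std::collections::HashMap;

// Field layout is assumed; only the type name comes from the diff.
#[derive(Clone, Debug, PartialEq)]
struct ChainStatusPeerLog {
    node_id: String,
    height: u64,
}

#[derive(Default)]
struct ChainStatusLog {
    // Keyed by node id, so each peer has at most one entry.
    peers: HashMap<String, ChainStatusPeerLog>,
}

impl ChainStatusLog {
    /// Inserts or replaces the entry for this peer without scanning earlier entries.
    fn record(&mut self, peer_log: ChainStatusPeerLog) {
        self.peers.insert(peer_log.node_id.clone(), peer_log);
    }
}

fn main() {
    let mut log = ChainStatusLog::default();
    log.record(ChainStatusPeerLog { node_id: "peer-a".into(), height: 10 });
    log.record(ChainStatusPeerLog { node_id: "peer-a".into(), height: 12 });

    // The second update replaced the first instead of adding a duplicate.
    assert_eq!(log.peers.len(), 1);
    assert_eq!(log.peers["peer-a"].height, 12);
}
```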

}

Some(format!(
"We are in sync with the network ({local_tip}), we have lagging peers {{{}}}",

Collaborator

Add the number of in-sync peers here. We don't have to display their tips, just how many we have.
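
For example, the summary could count in-sync peers while still listing only the lagging ones. The function name, the `(node id, height)` tuple shape, and the exact message wording below are hypothetical.

```rust
/// Builds a mixed in-sync/lagging summary from (node id, height) pairs.
fn summary(local_height: u64, peers: &[(&str, u64)]) -> String {
    // In-sync peers are only counted, not listed.
    let in_sync = peers.iter().filter(|(_, h)| *h == local_height).count();

    // Lagging peers are listed with their height and how far behind they are.
    let mut lagging: Vec<String> = peers
        .iter()
        .filter(|(_, h)| *h < local_height)
        .map(|(id, h)| format!("({id}, {h}, -{})", local_height - h))
        .collect();
    lagging.sort(); // stable output order for the log line

    format!(
        "We are in sync with the network ({local_height}), {in_sync} peers in sync, we have lagging peers {{{}}}",
        lagging.join(", ")
    )
}

fn main() {
    let line = summary(100, &[("a", 100), ("b", 98), ("c", 100)]);
    assert_eq!(
        line,
        "We are in sync with the network (100), 2 peers in sync, we have lagging peers {(b, 98, -2)}"
    );
}
```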

@samrusani samrusani force-pushed the fix/reduce-listening-up-to-date-log-spam-6808 branch from 08b3c6d to bf7556c Compare June 30, 2026 15:36
@samrusani

Contributor Author

Thanks, I pushed follow-ups for the remaining review items.

Changes made:

  • Replaced the peer status collection with a HashMap keyed by peer node id.
  • Added the in-sync peer count to the mixed in-sync/lagging summary instead of listing in-sync peer tips.
  • Moved metadata event processing into handle_metadata_event and tightened the select branch so the loop delegates to the helper.

Validation run locally:

  • cargo test --locked --ignore-rust-version -p tari_core base_node::state_machine_service::states::listening --lib
  • cargo check --locked --ignore-rust-version -p tari_core
  • git diff --check

The visible GitHub checks are passing and the branch is up to date with development.

@samrusani

Contributor Author

Pushed follow-up commit 19460237f for the latest review feedback.

Changes:

  • Switched the initial-sync peer collection to a keyed map so repeated peer metadata updates replace by peer id without scanning a vec.
  • Added the in-sync peer count to the ahead-with-lagging-peers summary.
  • Extracted peer metadata handling out of the tokio::select! branch to keep the listening loop flatter.

Validation:

  • cargo +1.93.0 test --manifest-path /Users/samirusani/Desktop/Codex/Tari/tari/Cargo.toml -p tari_core test_chain_status_log --lib -> 4 passed
  • cargo +1.93.0 check --manifest-path /Users/samirusani/Desktop/Codex/Tari/tari/Cargo.toml -p tari_core -> passed
  • git diff --check -> passed

@SWvheerden

Collaborator

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request refactors the base node's listening state in listening.rs by extracting event handling logic into helper methods and introducing a ListeningLoopState struct. It also adds a periodic logging mechanism (ChainStatusLog) to aggregate and log peer chain status every 30 seconds, reduces log verbosity by changing several log levels to trace!, and adds comprehensive unit tests. There are no review comments, so no feedback is provided.

SWvheerden
SWvheerden previously approved these changes Jul 1, 2026
@SWvheerden

Collaborator

@samrusani just check CI.
Make sure cargo ci-clippy passes and run cargo +nightly fmt --all.

@samrusani

Contributor Author

Pushed commit b2bfe4f51 to address the CI clippy failure.

What changed:

  • Split handle_peer_chain_metadata into smaller helpers so the listening-state metadata path stays below the clippy::too_many_lines limit.
  • Kept the chain-status behavior unchanged; this is a refactor for the CI lint failure.

Validation:

  • cargo +nightly fmt --all -> passed
  • cargo +1.93.0 ci-clippy -> passed
  • cargo +1.93.0 test --manifest-path /Users/samirusani/Desktop/Codex/Tari/tari/Cargo.toml -p tari_core test_chain_status_log --lib -> 4 passed
  • cargo +1.93.0 check --manifest-path /Users/samirusani/Desktop/Codex/Tari/tari/Cargo.toml -p tari_core -> passed
  • git diff --check -> passed

The signed-commit check is green on the new head; GitGuardian is still pending at the moment.

Development

Successfully merging this pull request may close these issues.

improve log

2 participants