fix: reduce listening state log spam#7905
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces deduplication for the chain status logs in the Listening state of the base node state machine, ensuring that status updates are logged at the Debug level only when there is a change, and at the Trace level otherwise. The reviewer identified a critical issue where tracking both local and network metadata in the cached state would cause log spam when connected to multiple peers with slightly different heights. To resolve this, the reviewer suggested tracking only the local chain tip metadata and provided code suggestions to update the state struct, helper methods, and unit tests accordingly.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata, network: &ChainMetadata) -> bool { | ||
| let log_state = ChainStatusLogState::new(local, network); | ||
| if self.last_chain_status_log == Some(log_state) { | ||
| return false; | ||
| } | ||
|
|
||
| self.last_chain_status_log = Some(log_state); | ||
| true | ||
| } |
There was a problem hiding this comment.
Update should_log_chain_status_at_debug to only accept and track the local metadata.
| fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata, network: &ChainMetadata) -> bool { | |
| let log_state = ChainStatusLogState::new(local, network); | |
| if self.last_chain_status_log == Some(log_state) { | |
| return false; | |
| } | |
| self.last_chain_status_log = Some(log_state); | |
| true | |
| } | |
| fn should_log_chain_status_at_debug(&mut self, local: &ChainMetadata) -> bool { | |
| let log_state = ChainStatusLogState::new(local); | |
| if self.last_chain_status_log == Some(log_state) { | |
| return false; | |
| } | |
| self.last_chain_status_log = Some(log_state); | |
| true | |
| } |
SWvheerden
left a comment
There was a problem hiding this comment.
this is a good attempt at solving this, but I think we can do better gemini has some good concerns.
I think we should do this rather:
Change it to be more time based logging.
On the main loop:
loop {
let metadata_event = shared.metadata_event_stream.recv().await;
this we can change to
loop{
tokio::select!(
timer =>
shared.metadata_event_stream.recv() =>
}
As long as we are in sync with the network we can create a log that looks like this:
We are in sync with the network (height, diff), we have lagging peers {(node id, height, diff)..}
or if are in sync with all
We are in sync with the network with peers {peer id..}
or
if all are ahead
We are ahead(height, diff) with lagging peers {(node id, height, diff)..}
If we are behind
`We are in behind the network (height, diff) with peers: {(node id, height, diff)..}
each time we log, we process all results received log the result, then we clear the list
if we go out of listing state, we do a log process as well.
b57544b to
08b3c6d
Compare
SWvheerden
left a comment
There was a problem hiding this comment.
This is looking good, just a few small changes I think
| fn record(&mut self, local: &ChainMetadata, peer_metadata: &PeerChainMetadata) { | ||
| self.local_tip = Some(ChainStatusTipLog::new(local)); | ||
| let peer_log = ChainStatusPeerLog::new(peer_metadata); | ||
| if let Some(existing) = self.peers.iter_mut().find(|peer| peer.node_id == peer_log.node_id) { |
There was a problem hiding this comment.
use a hashmap here, then you dont have to iterate over a vec
| } | ||
|
|
||
| Some(format!( | ||
| "We are in sync with the network ({local_tip}), we have lagging peers {{{}}}", |
There was a problem hiding this comment.
add the number of in sync peers here, we dont have to diplay their tips, but just the number we have
08b3c6d to
bf7556c
Compare
|
Thanks, I pushed follow-ups for the remaining review items. Changes made:
Validation run locally:
The visible GitHub checks are passing and the branch is up to date with |
|
Pushed follow-up commit Changes:
Validation:
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request refactors the base node's listening state in listening.rs by extracting event handling logic into helper methods and introducing a ListeningLoopState struct. It also adds a periodic logging mechanism (ChainStatusLog) to aggregate and log peer chain status every 30 seconds, reduces log verbosity by changing several log levels to trace!, and adds comprehensive unit tests. There are no review comments, so no feedback is provided.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
|
@samrusani just check ci |
|
Pushed commit What changed:
Validation:
The signed-commit check is green on the new head; GitGuardian is still pending at the moment. |
Description
Reduces repeated listening-state chain metadata debug logs by aggregating peer metadata updates in the listening loop.
The listening state now collects peer chain metadata and emits a single
DEBUGsummary on a timer, and when leaving the listening state. Per-peer status details are still available atTRACE, so the useful signal remains without spamming debug logs. Sync decisions are unchanged.Motivation and Context
Fixes #6808.
The previous debug log was emitted from
determine_sync_modefor every peer metadata update while the node was up to date, which could create many identical debug entries.How Has This Been Tested?
git diff --checkcargo test --locked --ignore-rust-version -p tari_core base_node::state_machine_service::states::listening --libcargo check --locked --ignore-rust-version -p tari_coreWhat process can a PR reviewer use to test or verify this change?
Run the focused
tari_coreunit tests above and inspectbase_layer/core/src/base_node/state_machine_service/states/listening.rsto confirm peer chain metadata is aggregated into one timedDEBUGsummary and cleared after each log.Breaking Changes