feat: add recency indicator to most-flaky-tests report #5129

Open
dhiller wants to merge 3 commits into kubevirt:main from dhiller:feat/flaky-report-recency-indicator

Conversation

@dhiller

@dhiller dhiller commented Jun 10, 2026

Contributor

What this PR does / why we need it:

Adds visual distinction between actively flaky tests and stale entries in the most-flaky-tests report, so reviewers can focus on tests that are currently failing.

Which issue(s) this PR fixes:

Depends on #5125

Special notes for your reviewer:

  • Unit tests verify that the HasRecentFailures flag is set correctly for recent and stale candidates
  • All quarantine tests pass and go vet is clean

🤖 Assisted by Claude Code

The auto-quarantine bot previously checked only the overall failure
percentage across 14 days, without verifying whether the failures were
recent. As a result, tests that had recovered days ago were still
quarantined based on stale failure data.

Add a recency + consistency filter with three configurable parameters:
- --max-failure-age (default: 72h): only failures within this window
  count as recent
- --min-recent-failures (default: 2): minimum number of recent
  failures required per lane
- --min-failure-interval (default: 24h): minimum time span between
  recent failures to reject same-PR bursts

Ref: kubevirt/kubevirt#18066

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
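
As a rough illustration of the recency and consistency filter described in this commit message, here is a minimal Go sketch. The function and helper names (`hasRecentConsistentFailures`, `isActiveWithDefaults`, `hoursAgo`) are hypothetical and not taken from the PR; only the three defaults (72h, 2, 24h) come from the text above. It assumes the failures of one lane are available as plain timestamps.

```go
package main

import (
	"fmt"
	"time"
)

// Defaults as named in the commit message.
const (
	defaultMaxFailureAge      = 72 * time.Hour
	defaultMinRecentFailures  = 2
	defaultMinFailureInterval = 24 * time.Hour
)

// hasRecentConsistentFailures is a hypothetical helper, not the PR's code.
// It reports whether at least minCount failures are no older than maxAge
// (relative to now) and whether the earliest and latest of those recent
// failures are at least minInterval apart.
func hasRecentConsistentFailures(failures []time.Time, now time.Time,
	maxAge time.Duration, minCount int, minInterval time.Duration) bool {
	var earliest, latest time.Time
	recent := 0
	for _, f := range failures {
		if f.After(now) || now.Sub(f) > maxAge {
			continue // in the future or outside the recency window
		}
		if recent == 0 || f.Before(earliest) {
			earliest = f
		}
		if recent == 0 || f.After(latest) {
			latest = f
		}
		recent++
	}
	// Explicit empty check so minCount <= 0 cannot let an empty set pass.
	if recent == 0 || recent < minCount {
		return false
	}
	return latest.Sub(earliest) >= minInterval
}

// isActiveWithDefaults applies the filter with the default parameters.
func isActiveWithDefaults(now time.Time, failures ...time.Time) bool {
	return hasRecentConsistentFailures(failures, now,
		defaultMaxFailureAge, defaultMinRecentFailures, defaultMinFailureInterval)
}

// hoursAgo returns the instant h hours before now.
func hoursAgo(now time.Time, h int) time.Time {
	return now.Add(-time.Duration(h) * time.Hour)
}

// exampleNow is a fixed reference time so the example is deterministic.
var exampleNow = time.Date(2026, time.June, 10, 12, 0, 0, 0, time.UTC)

func main() {
	now := exampleNow
	// Two failures 28h apart, both within 72h: counts as an active flake.
	fmt.Println(isActiveWithDefaults(now, hoursAgo(now, 2), hoursAgo(now, 30))) // true
	// Two failures 1h apart: rejected as a likely same-PR burst.
	fmt.Println(isActiveWithDefaults(now, hoursAgo(now, 2), hoursAgo(now, 3))) // false
	// Failures older than 72h: the test recovered, so the data is stale.
	fmt.Println(isActiveWithDefaults(now, hoursAgo(now, 100), hoursAgo(now, 200))) // false
}
```

Measuring the span between the earliest and latest recent failure is one possible reading of "minimum time span between recent failures"; the PR's own implementation may define the interval differently.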
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2026
@kubevirt-bot

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/L labels Jun 10, 2026
@kubevirt-bot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tiraboschi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a mechanism to filter test impacts based on the recency, count, and interval of build failures, adding new configuration options (max-failure-age, min-recent-failures, and min-failure-interval) to the auto-quarantine and flaky test reporting tools. The HTML report has also been updated to visually distinguish active and stale failures. The review feedback highlights a potential index out of range panic in HasMinRecentFailures when recentIntervals is empty, suggests adding validation to prevent negative values for the new configuration parameters, and recommends moving the instantiation of recentFailureFilter outside of the loop to avoid redundant allocations.


Comment thread pkg/searchci/searchci.go Outdated
Comment on lines +173 to +175
if len(recentIntervals) < minCount {
	return false
}

high

If minCount is configured to 0 (or a negative value) and recentIntervals is empty, len(recentIntervals) < minCount evaluates to false. The code then proceeds to access recentIntervals[0] on line 179, which will cause an index out of range panic. Adding an explicit check for len(recentIntervals) == 0 prevents this panic.

Suggested change
-if len(recentIntervals) < minCount {
-	return false
-}
+if len(recentIntervals) == 0 || len(recentIntervals) < minCount {
+	return false
+}
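
To make the failure mode in this comment concrete, the following self-contained Go sketch reproduces the shape of the check in isolation. The function names (`firstUnguarded`, `firstGuarded`, `panics`) are hypothetical, and plain ints stand in for the interval values; it is not the code from `pkg/searchci/searchci.go`.

```go
package main

import "fmt"

// firstUnguarded mirrors the reviewed check: the length comparison alone does
// not protect the index access when minCount is 0 or negative.
func firstUnguarded(recent []int, minCount int) (int, bool) {
	if len(recent) < minCount {
		return 0, false
	}
	return recent[0], true // index out of range if recent is empty
}

// firstGuarded adds the explicit empty check from the suggested change.
func firstGuarded(recent []int, minCount int) (int, bool) {
	if len(recent) == 0 || len(recent) < minCount {
		return 0, false
	}
	return recent[0], true
}

// panics reports whether calling f triggers a runtime panic.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	// With minCount == 0 and no recent entries, 0 < 0 is false, so the
	// unguarded version falls through to recent[0] and panics.
	fmt.Println(panics(func() { firstUnguarded(nil, 0) })) // true

	// The guarded version returns early instead.
	_, ok := firstGuarded(nil, 0)
	fmt.Println(ok) // false
}
```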

Comment on lines +39 to +41
maxFailureAge time.Duration
minRecentFailures int
minFailureInterval time.Duration

medium

Please add validation for these new numeric configuration parameters to ensure they are not negative, preventing invalid configurations.

You can define a Validate method on autoQuarantineOptions and call it in the command runners (e.g., AutoQuarantine and MostFlakyTestsReport):

func (a autoQuarantineOptions) Validate() error {
	if a.maxFailureAge < 0 {
		return fmt.Errorf("max-failure-age cannot be negative")
	}
	if a.minRecentFailures < 0 {
		return fmt.Errorf("min-recent-failures cannot be negative")
	}
	if a.minFailureInterval < 0 {
		return fmt.Errorf("min-failure-interval cannot be negative")
	}
	return nil
}
References
  1. When validating numeric configuration parameters such as timeouts or durations, ensure that negative values are explicitly handled, for example by returning an error, to prevent invalid configurations.

-// Filter impacts by the matching lane regex and required lane status
+// Filter impacts by the matching lane regex, required lane status,
+// and recency of failures
+recentFailureFilter := searchci.HasMinRecentFailures(quarantineOpts.maxFailureAge, quarantineOpts.minRecentFailures, quarantineOpts.minFailureInterval)

medium

recentFailureFilter is independent of the loop variables and can be instantiated once outside the for _, topXTest := range topXTests loop (before line 177) to avoid redundant allocations on every iteration.
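
The hoisting suggested here can be sketched in isolation as follows. The filter type, the factory `newMinFailuresFilter`, and the `constructed` counter are hypothetical stand-ins used only to make the effect observable; the real `searchci.HasMinRecentFailures` takes different parameters.

```go
package main

import "fmt"

// impactFilter is a hypothetical predicate over a per-test failure count.
type impactFilter func(failureCount int) bool

// constructed counts how often the factory runs, for illustration only.
var constructed int

// newMinFailuresFilter builds a filter that depends only on its
// configuration, not on any loop variable.
func newMinFailuresFilter(minCount int) impactFilter {
	constructed++
	return func(n int) bool { return n >= minCount }
}

// countMatches builds the filter once, before the loop, and reuses it for
// every element instead of rebuilding it on each iteration.
func countMatches(failureCounts []int, minCount int) int {
	keep := newMinFailuresFilter(minCount)
	matches := 0
	for _, c := range failureCounts {
		if keep(c) {
			matches++
		}
	}
	return matches
}

func main() {
	fmt.Println(countMatches([]int{0, 1, 2, 3}, 2)) // 2
	fmt.Println(constructed)                         // 1
}
```

Because the closure captures only configuration values, building it once does not change behavior; it only avoids one allocation per iteration.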

dhiller added 2 commits June 10, 2026 13:10
…ative minCount

When minCount is 0 or negative, `len(recentIntervals) < minCount`
evaluates to false even when recentIntervals is empty, causing an
out-of-bounds panic at `recentIntervals[0]`. Add an explicit
`len(recentIntervals) == 0` guard.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
Tests with recent consistent failures (active flakes) are now visually
distinguished from tests with only stale failures in the report. Active
flakes get a red left border and "active" badge, while stale entries
are dimmed with a gray border and "stale" badge.

Uses the same HasMinRecentFailures filter from the auto-quarantine
recency check, with identical configurable parameters exposed as
report command flags.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Hiller <dhiller@redhat.com>
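
One way the active/stale distinction described in this commit message could be rendered is sketched below with Go's `html/template`. The `reportRow` type, the template markup, and the `active`/`stale` class names are assumptions made for illustration; only the `HasRecentFailures` flag name and the "active"/"stale" badge wording appear in this PR's text. The border and dimming styles would live in CSS that is not shown.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// reportRow is a hypothetical view model for one test in the report.
type reportRow struct {
	Name              string
	HasRecentFailures bool
}

// rowTemplate picks the CSS class and the badge text from the same flag.
var rowTemplate = template.Must(template.New("row").Parse(
	`<tr class="{{if .HasRecentFailures}}active{{else}}stale{{end}}">` +
		`<td>{{.Name}}</td>` +
		`<td>{{if .HasRecentFailures}}active{{else}}stale{{end}}</td></tr>`))

// renderRow renders a single table row for the report.
func renderRow(row reportRow) (string, error) {
	var sb strings.Builder
	if err := rowTemplate.Execute(&sb, row); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, row := range []reportRow{
		{Name: "test-a", HasRecentFailures: true},
		{Name: "test-b", HasRecentFailures: false},
	} {
		html, err := renderRow(row)
		if err != nil {
			fmt.Println("render error:", err)
			return
		}
		fmt.Println(html)
	}
}
```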
@dhiller dhiller force-pushed the feat/flaky-report-recency-indicator branch from efe8c44 to 1e565f8 Compare June 10, 2026 11:10
@dhiller dhiller marked this pull request as ready for review June 10, 2026 11:28
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 10, 2026
@kubevirt-bot kubevirt-bot requested review from Whitedyl and enp0s3 June 10, 2026 11:28
Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/L

2 participants