feat: add recency indicator to most-flaky-tests report#5129
Conversation
The auto-quarantine bot previously only checked overall failure percentage across 14 days without verifying whether failures are recent. This caused tests that recovered days ago to still be quarantined based on stale failure data. Add a recency + consistency filter with three configurable parameters: - --max-failure-age (default: 72h): only failures within this window count as recent - --min-recent-failures (default: 2): minimum number of recent failures required per lane - --min-failure-interval (default: 24h): minimum time span between recent failures to reject same-PR bursts Ref: kubevirt/kubevirt#18066 Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Daniel Hiller <dhiller@redhat.com>
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to filter test impacts based on the recency, count, and interval of build failures, adding new configuration options (max-failure-age, min-recent-failures, and min-failure-interval) to the auto-quarantine and flaky test reporting tools. The HTML report has also been updated to visually distinguish active and stale failures. The review feedback highlights a potential index out of range panic in HasMinRecentFailures when recentIntervals is empty, suggests adding validation to prevent negative values for the new configuration parameters, and recommends moving the instantiation of recentFailureFilter outside of the loop to avoid redundant allocations.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if len(recentIntervals) < minCount { | ||
| return false | ||
| } |
There was a problem hiding this comment.
If minCount is configured to 0 (or a negative value) and recentIntervals is empty, len(recentIntervals) < minCount evaluates to false. The code then proceeds to access recentIntervals[0] on line 179, which will cause an index out of range panic. Adding an explicit check for len(recentIntervals) == 0 prevents this panic.
| if len(recentIntervals) < minCount { | |
| return false | |
| } | |
| if len(recentIntervals) == 0 || len(recentIntervals) < minCount { | |
| return false | |
| } |
| maxFailureAge time.Duration | ||
| minRecentFailures int | ||
| minFailureInterval time.Duration |
There was a problem hiding this comment.
Please add validation for these new numeric configuration parameters to ensure they are not negative, preventing invalid configurations.
You can define a Validate method on autoQuarantineOptions and call it in the command runners (e.g., AutoQuarantine and MostFlakyTestsReport):
func (a autoQuarantineOptions) Validate() error {
if a.maxFailureAge < 0 {
return fmt.Errorf("max-failure-age cannot be negative")
}
if a.minRecentFailures < 0 {
return fmt.Errorf("min-recent-failures cannot be negative")
}
if a.minFailureInterval < 0 {
return fmt.Errorf("min-failure-interval cannot be negative")
}
return nil
}References
- When validating numeric configuration parameters such as timeouts or durations, ensure that negative values are explicitly handled, for example by returning an error, to prevent invalid configurations.
| // Filter impacts by the matching lane regex and required lane status | ||
| // Filter impacts by the matching lane regex, required lane status, | ||
| // and recency of failures | ||
| recentFailureFilter := searchci.HasMinRecentFailures(quarantineOpts.maxFailureAge, quarantineOpts.minRecentFailures, quarantineOpts.minFailureInterval) |
…ative minCount When minCount is 0 or negative, `len(recentIntervals) < minCount` evaluates to false even when recentIntervals is empty, causing an out-of-bounds panic at `recentIntervals[0]`. Add an explicit `len(recentIntervals) == 0` guard. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Daniel Hiller <dhiller@redhat.com>
Tests with recent consistent failures (active flakes) are now visually distinguished from tests with only stale failures in the report. Active flakes get a red left border and "active" badge, while stale entries are dimmed with a gray border and "stale" badge. Uses the same HasMinRecentFailures filter from the auto-quarantine recency check, with identical configurable parameters exposed as report command flags. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Daniel Hiller <dhiller@redhat.com>
efe8c44 to
1e565f8
Compare
What this PR does / why we need it:
Adds visual distinction between actively flaky tests and stale entries in the most-flaky-tests report, so reviewers can focus on tests that are currently failing.
HasMinRecentFailuresfilter from fix: require recent consistent failures for auto-quarantine #5125 with the same configurable flagsWhich issue(s) this PR fixes:
Depends on #5125
Special notes for your reviewer:
HasRecentFailuresflag is set correctly for recent and stale candidatesgo vetclean🤖 Assisted by Claude Code