Fix address-reviews duplicate detection and add real-data evals #558

Open
bryan-cox wants to merge 2 commits into openshift-eng:main from bryan-cox:fix/address-reviews-dedup-and-evals

Conversation

@bryan-cox bryan-cox commented Jun 18, 2026

Contributor

Summary

  • Add github-actions and github-actions[bot] to BOT_SIGNATURES in check_replied.py so the skill's dedup check recognizes replies posted under the GHA identity
  • Add CI override instruction in address-reviews.md so the skill respects "Do NOT push" directives from the periodic job harness instead of attempting to push (which conflicts with the harness's separate push phase)
  • Add 14 eval test cases using real PR data from openshift/hypershift PR #8535
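
The CI override in the second bullet is a prose instruction to the agent, not code. As a rough illustration of the condition it describes, the sketch below checks a prompt for the two trigger phrases named in the rule. The names `push_disabled_by_ci` and `push_steps` are hypothetical, and the case-insensitive substring match is an assumption; the skill itself relies on the model reading the prompt.

```python
# Hypothetical helper: the two phrases come from the CI Override rule text.
CI_OVERRIDE_PHRASES = (
    "do not push",
    "pushing will be handled automatically",
)


def push_disabled_by_ci(prompt: str) -> bool:
    """Return True when the prompt contains one of the override phrases."""
    # Lowercase and collapse whitespace so line wrapping does not hide a match.
    normalized = " ".join(prompt.lower().split())
    return any(phrase in normalized for phrase in CI_OVERRIDE_PHRASES)


def push_steps(prompt: str) -> list:
    """Steps 4b (push) and 4c (verify push) run only without an override."""
    return [] if push_disabled_by_ci(prompt) else ["4b", "4c"]
```

A plain substring match is deliberately naive: a prompt such as "do not push back on reviewers" would also trigger it, which is one reason the rule is left to the model's reading of the prompt.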

Context

The periodic review agent was producing 102+ duplicate replies because:

  1. github-actions wasn't recognized as a bot in check_replied.py, so the skill-level dedup check didn't catch existing replies posted under that identity
  2. The skill's push step conflicted with the periodic job's "Do NOT push" instruction, leading to "changed but didn't push" reports

The eval suite validates these fixes using actual review thread data from the duplicate reply storm on PR #8535, covering duplicate prevention, comment categorization, prioritization, bot filtering, reply format, CI push override, and response rules. All 14 tests pass at 100%.
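
A minimal sketch of the dedup idea behind the first cause, assuming replies are dicts carrying an author login. Only the two entries this PR adds are listed; the real `BOT_SIGNATURES` in check_replied.py and the real signature of `is_bot_reply()` are not shown in this thread, so the reply shape and the helper `thread_already_answered` are assumptions for illustration.

```python
# Only the two entries added by this PR; the real collection in
# check_replied.py may hold other signatures that are not shown here.
BOT_SIGNATURES = ("github-actions", "github-actions[bot]")


def is_bot_reply(reply: dict) -> bool:
    """Assumed reply shape: {"author": {"login": "..."}, "body": "..."}."""
    login = (reply.get("author") or {}).get("login", "")
    return login.lower() in BOT_SIGNATURES


def thread_already_answered(replies: list) -> bool:
    """A thread needs no new reply once any bot-authored reply exists."""
    return any(is_bot_reply(reply) for reply in replies)
```

With the GHA identities missing from the list, `thread_already_answered` would return False for threads the periodic job had already replied to, which matches the duplicate-reply behavior described above.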

Companion PR: openshift/release for the periodic job's BOT_ACCOUNTS fix.

Test plan

  • make eval-plugins EVAL_PLUGIN=utils — 14/14 passing (100%)
  • make lint — passes
  • Monitor periodic job runs after both PRs merge to confirm duplicate replies stop

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added a CI override rule for address-reviews that avoids manual push and push verification when pushing is managed automatically.
  • Bug Fixes
    • Improved detection of already-processed bot replies by recognizing GitHub Actions-authored comments.
  • Tests
    • Added/updated extensive address-reviews evaluation cases and tightened expected reply/decision behavior checks.
  • Chores
    • Bumped the utils plugin version to 0.0.12.
    • Added utils budget tracking for evaluations.

@openshift-ci openshift-ci Bot requested review from brandisher and stleerh June 18, 2026 18:52
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2026

coderabbitai Bot commented Jun 18, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Extends the address-reviews skill with a CI push-override rule that skips manual pushing when the prompt disables it, and expands bot detection to include GitHub Actions accounts. Introduces a comprehensive 14-case eval suite with Python judges validating categorization, reply format, and decision flags, plus an LLM quality rubric. Increments the plugin version to 0.0.12 and adds a 14.00 eval budget entry.

Changes

address-reviews skill behavior and evaluation suite

| Layer / File(s) | Summary |
|---|---|
| Skill behavior: CI override and bot detection<br>plugins/utils/commands/address-reviews.md, plugins/utils/scripts/check_replied.py | address-reviews.md adds a CI Override rule at Step 4 that skips push and push-verification when the prompt says "Do NOT push" or indicates CI handles pushing automatically. check_replied.py adds "github-actions" and "github-actions[bot]" to BOT_SIGNATURES so is_bot_reply() treats those accounts as already-replied bots. |
| Eval framework: configuration, wiring, and judges<br>plugins/utils/evals/eval-address-reviews.yaml | New 280-line eval config wires the utils:address-reviews skill in case mode with Claude Code runner and claude-opus-4-6 judge. Defines dataset/annotation schemas, analysis.json output contract, and Python judges for category, should_reply, should_change_code, should_push prediction validation; adds reply_format_correct judge for prefix/footer/length constraints; includes analysis_quality LLM rubric (1–5) with per-judge pass thresholds. |
| Eval cases 001–002: duplicate bot reply detection<br>plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/..., case-002-duplicate-extreme/... | Two cases verifying no additional reply when bots have already replied. Case 001 covers normal duplication (GitHub Actions + bot replies on outdated thread); case 002 covers extreme (28 bot replies on resolved/outdated thread). Each case includes input prompt, annotations with expected flags, and reference analysis. |
| Eval cases 003–008: comment categorization<br>plugins/utils/evals/cases/address-reviews/case-003-*/... through case-008-*/... | Six cases covering all primary categorization paths: QUESTION (×2 different phrasings), CHANGE_REQUEST (×2 different contexts), SUGGESTION, and ACTION_INSTRUCTION. Each case specifies input prompt, expected/acceptable categories, evaluation expectations, and grounded reference analysis. |
| Eval cases 009–014: prioritization, filtering, format, and response types<br>plugins/utils/evals/cases/address-reviews/case-009-*/... through case-014-*/... | Six behavioral cases: mixed-comment A→B→C priority ordering, Coderabbit AI comment kept (actionable bug), reply format enforcement (Done. prefix + footer + 3-sentence max), CI push-override validation (expected_should_push: false), question-only response (no code change), and imperative-instruction response (code change required). |
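
As a rough picture of what the Python judges for category, should_reply, should_change_code, and should_push do, the sketch below compares one predicted field from analysis.json against its annotation. The `expected_<field>` key convention is inferred from `expected_should_push`; the function name `compare_field` and the skip-when-unannotated behavior are assumptions, not the contents of eval-address-reviews.yaml.

```python
import json


def compare_field(analysis: dict, annotations: dict, field: str):
    """Compare analysis[field] with annotations['expected_<field>'].

    Returns (passed, message). A case with no expectation for the field passes.
    """
    key = f"expected_{field}"
    if key not in annotations:
        return True, f"{field}: no expectation, skipped"
    if field not in analysis:
        return False, f"{field}: missing from analysis.json"
    predicted, expected = analysis[field], annotations[key]
    return predicted == expected, f"{field}: predicted={predicted!r} expected={expected!r}"


# Example: the CI push-override case expects should_push to be false.
analysis = json.loads('{"should_push": false}')
annotations = {"expected_should_push": False}
print(compare_field(analysis, annotations, "should_push"))
```

Returning the compared values in the message keeps a failing case readable in the eval report without opening the case files.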

Plugin version bump and eval budget

| Layer / File(s) | Summary |
|---|---|
| Version 0.0.12 and utils eval budget<br>plugins/utils/.claude-plugin/plugin.json, .claude-plugin/marketplace.json, docs/index.html, evals/budget.yaml | Plugin version incremented from 0.0.11 to 0.0.12 in manifest and marketplace files. evals/budget.yaml adds budgets.utils section with allowed: 14.00 and current: 14.00 budget limits. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • openshift-eng/ai-helpers#301: Introduces bot-detection and duplicate-prevention logic in check_replied.py that this PR extends by adding GitHub Actions account recognition.
  • openshift-eng/ai-helpers#354: Modifies the address-reviews push and verification control flow at Step 4, which this PR augments with the CI override rule.
  • openshift-eng/ai-helpers#464: Establishes the eval budgeting model in evals/budget.yaml that this PR extends by adding the new budgets.utils entry.

Suggested labels

lgtm

Suggested reviewers

  • bentito
  • stleerh

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 errors)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| No Assumed Git Remote Names | ❌ Error | The PR's address-reviews.md contains 4 hardcoded 'origin' git remote names in commands (lines 170, 172, 182, 269) without using the discovered BASE_REMOTE variable, violating the no-assumed-remote-... | Replace hardcoded 'origin' with '$BASE_REMOTE' variable: update lines 170, 172, 182, 269 to use the already-discovered BASE_REMOTE variable instead of hardcoding 'origin'. |
| Git Push Safety Rules | ❌ Error | PR violates Git Push Safety Rules: address-reviews.md line 264 uses prohibited force push variant git push --force-with-lease, and performs autonomous push without asking user permission first (o... | Replace git push --force-with-lease with standard git push and add explicit user confirmation step before pushing, per safety rule requiring user permission before any push. |

✅ Passed checks (8 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly reflects the two main changes: fixing duplicate detection in address-reviews and adding evaluation cases with real data. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| No Real People Names In Style References | ✅ Passed | No real people names are used as style references in the PR. Author fields in eval test cases are metadata from actual PR #8535 comments being tested, not style guides. |
| No Untrusted Mcp Servers | ✅ Passed | PR contains no MCP server installations or untrusted package sources. Changes add GitHub-standard account identifiers to bot detection and 14 YAML test cases only. |
| Ai-Helpers Overlap Detection | ✅ Passed | PR modifies two ai-helpers files (address-reviews.md and check-release-health.md). Semantic analysis shows address-reviews is distinct from code-review and pre-commit-review commands (respond to co... |

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/utils/commands/address-reviews.md (1)

259-269: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Push flow still lacks required safety gates.

Line 259 adds CI opt-out, but the default path still allows unconditional force-push behavior and origin-assumed verification. This command must require explicit user approval before any push, reject main/master targets, reject force variants, and discover remotes before push/verification.

<details>
<summary>Suggested patch</summary>

````diff
-**CI Override**: If the user prompt explicitly says "Do NOT push" or "pushing will be handled automatically", skip Steps 4b and 4c entirely. The CI harness manages pushing separately.
+**CI Override**: If the user prompt explicitly says "Do NOT push" or "pushing will be handled automatically", skip Steps 4b and 4c entirely. The CI harness manages pushing separately.
+
+**Mandatory Push Gate**: If CI override is not active, ask for explicit user approval before pushing:
+`Replies are posted and commits are ready. Push now? (yes/no)`
+If the user does not explicitly approve, skip Steps 4b and 4c.

 After all replies are posted, push all committed changes in a single push:

 ```bash
-git push --force-with-lease
+CURRENT_BRANCH=$(git branch --show-current)
+case "$CURRENT_BRANCH" in
+  main|master|*/main|*/master)
+    echo "Refusing push to protected branch: $CURRENT_BRANCH" >&2
+    exit 1
+    ;;
+esac
+TARGET_REMOTE=$(git branch -vv | awk '/^\*/ {print $3}' | cut -d/ -f1)
+if [ -z "$TARGET_REMOTE" ]; then
+  TARGET_REMOTE=$(git remote -v | awk 'NR==1 {print $1}')
+fi
+git push "$TARGET_REMOTE" "$CURRENT_BRANCH"
 ```

 4c. Verify push

-- Run git log -1 --format='%H' locally and git ls-remote origin <branch> to confirm the remote has your commit
+- Run git log -1 --format='%H' locally and git ls-remote "$TARGET_REMOTE" "$CURRENT_BRANCH" to confirm the remote has your commit
````

</details>

As per coding guidelines: commands/skills must ask before pushing, must never force-push, must never push main/master, and must discover remotes instead of assuming names.
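
The four gates listed above (ask first, no force variants, no main/master, discovered remote) can be sketched as one function that either returns a plain push command or refuses. This is an illustrative sketch, not the skill's implementation: the helper names are hypothetical, and reading the tracking remote from `git config --get branch.<name>.remote` is one assumed way to discover it without depending on the column layout of `git branch -vv`.

```python
import subprocess
from typing import Optional

PROTECTED = {"main", "master"}


def is_protected(branch: str) -> bool:
    """Match main/master and names ending in /main or /master."""
    return branch.rsplit("/", 1)[-1] in PROTECTED


def tracking_remote(branch: str) -> Optional[str]:
    """Read the remote the branch tracks from git config; None if unset."""
    result = subprocess.run(
        ["git", "config", "--get", f"branch.{branch}.remote"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip() or None


def build_push_command(branch: str, remote: Optional[str], approved: bool) -> list:
    """Return the argv for a plain push, or raise if a gate fails."""
    if not approved:
        raise PermissionError("user did not approve the push")
    if not branch or is_protected(branch):
        raise ValueError(f"refusing to push branch {branch!r}")
    if not remote:
        raise ValueError("no remote discovered for this branch")
    # No --force or --force-with-lease: a plain push only.
    return ["git", "push", remote, branch]
```

A caller would pass `tracking_remote(branch)` as `remote`; keeping the git lookup separate from the gate logic lets the gates be checked without a repository.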

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @plugins/utils/commands/address-reviews.md around lines 259 - 269, The push
flow in the address-reviews.md file (specifically in section 4b where git push
--force-with-lease is used) lacks required safety gates. Replace the force-push
command with a safer implementation that: first rejects pushes to protected
branches (main, master, and their variants) by checking the current branch name
and exiting if it matches these patterns, dynamically discovers the target
remote by querying the branch tracking info or falling back to the first remote
from git remote output instead of assuming "origin", performs a regular push to
the discovered remote and current branch without force flags, and updates the
verification step in section 4c to use the dynamically discovered TARGET_REMOTE
and CURRENT_BRANCH variables instead of hardcoded "origin" and branch name in
both the git log and git ls-remote commands.


</details>

_Source: Coding guidelines_

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/openshift/commands/api-review.md`:
- Line 10: Three fenced code blocks in the file lack language specification: the
code block containing `/openshift:api-review [pr_url]` at line 10 (Step 1
prefix), the code block containing the example URL at line 212, and the code
block at line 218. Add `bash` language specification to each of these three code
blocks by changing the opening triple backticks from ``` to ```bash to comply
with the markdownlint enforcement in the repository.
- Around line 80-84: The code block that adds a git remote blindly assumes the
"upstream" name is available without checking if it already exists. Before
executing the git remote add upstream command, check whether a remote named
"upstream" already exists using git remote. If it already exists, either use a
different remote name, prompt the user to specify an alternative, or dynamically
discover an available remote name. Only proceed with adding the remote if the
name is confirmed to be available or use an alternative that does not conflict
with existing remotes.
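
One way to read the second finding is as a small decision over the output of `git remote -v`: reuse a remote that already points at the wanted URL, otherwise take the preferred name if it is free, otherwise pick a non-conflicting one. The sketch below assumes that reading; `parse_remotes`, `choose_remote`, and the numeric-suffix fallback are hypothetical and are not taken from api-review.md.

```python
def parse_remotes(remote_v_output: str) -> dict:
    """Map remote name to URL from `git remote -v` output.

    Each remote appears on a fetch line and a push line; the later line wins.
    """
    remotes = {}
    for line in remote_v_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            remotes[parts[0]] = parts[1]
    return remotes


def choose_remote(remotes: dict, url: str, preferred: str = "upstream"):
    """Return (name, needs_add) for the remote that should point at url."""
    for name, existing_url in remotes.items():
        if existing_url == url:
            return name, False  # already configured, reuse it
    if preferred not in remotes:
        return preferred, True
    # Preferred name is taken by a different URL: fall back to a suffix.
    suffix = 2
    while f"{preferred}-{suffix}" in remotes:
        suffix += 1
    return f"{preferred}-{suffix}", True
```

Only when `needs_add` is true would the caller run `git remote add` with the returned name.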

---

Outside diff comments:
In `@plugins/utils/commands/address-reviews.md`:
- Around line 259-269: The push flow in the address-reviews.md file
(specifically in section 4b where git push --force-with-lease is used) lacks
required safety gates. Replace the force-push command with a safer
implementation that: first rejects pushes to protected branches (main, master,
and their variants) by checking the current branch name and exiting if it
matches these patterns, dynamically discovers the target remote by querying the
branch tracking info or falling back to the first remote from git remote output
instead of assuming "origin", performs a regular push to the discovered remote
and current branch without force flags, and updates the verification step in
section 4c to use the dynamically discovered TARGET_REMOTE and CURRENT_BRANCH
variables instead of hardcoded "origin" and branch name in both the git log and
git ls-remote commands.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9a91e115-2a60-4552-a5de-af2613f50319

📥 Commits

Reviewing files that changed from the base of the PR and between 1015dd8 and 1e928be.

📒 Files selected for processing (4)
  • plugins/openshift/commands/api-review.md
  • plugins/utils/commands/address-reviews.md
  • plugins/utils/evals/address-reviews.yaml
  • plugins/utils/scripts/check_replied.py

Comment thread plugins/openshift/commands/api-review.md Outdated
Comment thread plugins/openshift/commands/api-review.md Outdated
@bryan-cox bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch 3 times, most recently from c6ea632 to 18c0496 on June 18, 2026 19:07
@bryan-cox

Contributor Author

/test ?

@bryan-cox

Contributor Author

/test eval-address-reviews


Author: JoelSpeed
Comment: "We are moving away from crd-schema-checker towards the upstream crdify project, and in fact, in o/api we have disabled the majority of checks in crd-schema-checker because they are implemented in crdify already. Could you weigh in on whether to switch this PR to crdify?"
Type: issue_comment (general PR comment)

Collaborator

Are you intentionally adding promptfoo coverage here rather than using the eval harness?

@bryan-cox bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch from 18c0496 to 9738f1b on June 22, 2026 12:55

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/utils/evals/eval-address-reviews.yaml`:
- Around line 155-201: The review points out that schema fields
expected_should_filter and expected_priority_order are defined but lack
corresponding evaluation checks, unlike should_change_code_correct and
should_push_correct. Create two new evaluation checks following the same pattern
as the existing ones: add a check named should_filter_correct that compares the
predicted should_filter value from the analysis.json file against the
expected_should_filter annotation, and add a check named priority_order_correct
that compares the predicted priority_order value from the analysis.json file
against the expected_priority_order annotation. Both checks should reuse the
same JSON parsing and file retrieval logic as the existing checks and return
appropriate pass/fail messages with the compared values.
- Around line 99-103: The valid_output_json function's required fields list is
incomplete and does not match the full output contract defined in the file.
Update the required list to include all required fields specified in the output
contract definition (lines 19-26 and 68-75), which should include: category,
should_reply, should_change_code, rationale, should_push, should_filter,
priority_order, and reply_draft. This ensures the validation properly enforces
the complete output schema and prevents malformed analysis.json files from
passing validation.
- Around line 203-233: The reply_format_correct check is using incorrect
annotation key names that don't match the actual annotations used. Instead of
checking for expected_starts_with and expected_contains_footer, update the check
to look for expected_reply_contains and expected_reply_max_sentences keys.
Additionally, add validation logic to count the number of sentences in the
reply_draft and compare it against the expected_reply_max_sentences constraint,
returning an error if the sentence count exceeds the maximum allowed, and
include this validation result in the final error messages along with the other
format checks.
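
The third finding can be sketched as follows: a format check that reads the `expected_reply_contains` and `expected_reply_max_sentences` annotations and counts sentences in `reply_draft`. This is a hedged illustration only. It assumes `expected_reply_contains` is a list of required substrings, uses a naive punctuation-based sentence split, and counts over the whole draft, so a real check may need to strip the footer first.

```python
import re


def count_sentences(text: str) -> int:
    """Naive count: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([part for part in parts if part])


def check_reply_format(reply_draft: str, annotations: dict) -> list:
    """Return error messages; an empty list means the format check passes."""
    errors = []
    # Assumed to be a list of substrings that must appear in the reply.
    for needle in annotations.get("expected_reply_contains", []):
        if needle not in reply_draft:
            errors.append(f"reply is missing {needle!r}")
    limit = annotations.get("expected_reply_max_sentences")
    if limit is not None:
        found = count_sentences(reply_draft)
        if found > limit:
            errors.append(f"reply has {found} sentences, limit is {limit}")
    return errors
```

Abbreviations such as "e.g. foo" inflate the naive count, so the limit is best treated as approximate unless a stricter tokenizer is used.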

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 37039e58-4cea-4adb-9e7b-d699c1d4917e

📥 Commits

Reviewing files that changed from the base of the PR and between 18c0496 and 9738f1b.

📒 Files selected for processing (35)
  • .claude-plugin/marketplace.json
  • docs/index.html
  • evals/budget.yaml
  • plugins/utils/.claude-plugin/plugin.json
  • plugins/utils/commands/address-reviews.md
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/input.yaml
  • plugins/utils/evals/eval-address-reviews.yaml
  • plugins/utils/scripts/check_replied.py
✅ Files skipped from review due to trivial changes (25)
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/input.yaml
  • plugins/utils/.claude-plugin/plugin.json
  • plugins/utils/commands/address-reviews.md
  • .claude-plugin/marketplace.json
  • docs/index.html
🚧 Files skipped from review as they are similar to previous changes (2)
  • evals/budget.yaml
  • plugins/utils/scripts/check_replied.py

Comment thread plugins/utils/evals/eval-address-reviews.yaml Outdated
Comment thread plugins/utils/evals/eval-address-reviews.yaml
Comment thread plugins/utils/evals/eval-address-reviews.yaml
@bryan-cox bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch from 9738f1b to 7214004 on June 22, 2026 13:30

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/reference-analysis.json`:
- Line 2: The category field in reference-analysis.json is incorrectly set to
"CHANGE_REQUEST" when it should be "SUGGESTION" to align with the
case-007-categorize-suggestion test objective. Update the "category" value on
line 2 and line 9 from "CHANGE_REQUEST" to "SUGGESTION" so the test case
properly validates suggestion categorization instead of conflicting with a
different category type.
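
Because the finding mentions the category appearing in two places in the file, a consistency check could walk the reference JSON and report every `category` value that differs from the one the case is meant to exercise. The sketch below is an assumption-laden illustration: the layout of reference-analysis.json is not shown in this thread, and `category_values` and `category_mismatches` are hypothetical names.

```python
import json


def category_values(node, path="$"):
    """Yield (path, value) for every "category" key at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "category":
                yield child, value
            else:
                yield from category_values(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from category_values(item, f"{path}[{index}]")


def category_mismatches(reference_json: str, expected: str) -> list:
    """List every category entry that differs from the expected value."""
    reference = json.loads(reference_json)
    return [(path, value) for path, value in category_values(reference) if value != expected]
```

Run against each case directory with the expected category from its annotations, an empty result would indicate that the reference analysis and the case objective agree.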

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c1d37ae8-fd3e-4ac5-a31e-320c84edfe43

📥 Commits

Reviewing files that changed from the base of the PR and between 9738f1b and 7214004.

📒 Files selected for processing (49)
  • .claude-plugin/marketplace.json
  • docs/index.html
  • evals/budget.yaml
  • plugins/utils/.claude-plugin/plugin.json
  • plugins/utils/commands/address-reviews.md
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/reference-analysis.json
  • plugins/utils/evals/eval-address-reviews.yaml
  • plugins/utils/scripts/check_replied.py
✅ Files skipped from review due to trivial changes (37)
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/input.yaml
  • plugins/utils/.claude-plugin/plugin.json
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-003-categorize-question/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-001-duplicate-bot-reply/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-012-ci-push-override/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-014-response-imperative-change/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/reference-analysis.json
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-013-response-question-no-change/input.yaml
  • plugins/utils/commands/address-reviews.md
  • plugins/utils/evals/cases/address-reviews/case-009-prioritize-mixed/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-010-filter-coderabbit-kept/annotations.yaml
  • docs/index.html
  • plugins/utils/evals/cases/address-reviews/case-002-duplicate-extreme/input.yaml
  • .claude-plugin/marketplace.json
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-007-categorize-suggestion/annotations.yaml
🚧 Files skipped from review as they are similar to previous changes (8)
  • plugins/utils/evals/cases/address-reviews/case-004-categorize-question-2/annotations.yaml
  • evals/budget.yaml
  • plugins/utils/evals/cases/address-reviews/case-005-categorize-change-request/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-006-categorize-dead-code/input.yaml
  • plugins/utils/evals/cases/address-reviews/case-011-reply-format/annotations.yaml
  • plugins/utils/evals/cases/address-reviews/case-008-categorize-action-instruction/input.yaml
  • plugins/utils/scripts/check_replied.py
  • plugins/utils/evals/eval-address-reviews.yaml

@bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch 8 times, most recently from 141add0 to b626d33, on June 22, 2026 14:06
@bryan-cox

Contributor Author

/test eval-address-reviews

@bryan-cox

Contributor Author

/test eval-address-reviews

@openshift-ci Bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Jun 23, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch from b626d33 to c750e30 on June 23, 2026 09:58
@openshift-ci Bot removed the needs-rebase label on Jun 23, 2026
@openshift-ci

openshift-ci Bot commented Jun 23, 2026


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch from c750e30 to e90f2bb on June 23, 2026 10:01
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bryan-cox force-pushed the fix/address-reviews-dedup-and-evals branch from e90f2bb to d26010c on June 23, 2026 10:03
@bryan-cox

Contributor Author

/test eval-address-reviews

@openshift-ci

openshift-ci Bot commented Jun 24, 2026


PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci Bot added the needs-rebase label on Jun 24, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.
