WIP: feat(webui): AI assistant, etcd metric charts, and UX enhancements #213
mtulio wants to merge 22 commits into
Added controller, collector, and must-gather-monitoring image references to the configVersion ConfigMap so all plugin images are visible in the cluster configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…t.html

The openExternalTab() and appAddress() functions split the URL on '/opct-report.html' to derive the base URL, but the report is saved and served as 'index.html'. This caused the Metrics tab to navigate to an invalid path. Use a regex to strip the last path segment instead, which works regardless of the page filename.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Embed etcd Prometheus metric charts inline on the etcd page using Chart.js, replacing the previous Plotly-based approach for lighter rendering (~200KB vs ~3.5MB).

Layout:
- Split-pane with draggable divider (70% tables / 30% charts default)
- Independent scroll on each panel
- v-if/v-else ensures non-etcd pages use full-width layout unchanged

Chart features:
- Drag-to-select time range zoom on X axis (chartjs-plugin-zoom)
- Reset zoom button per chart
- Expand button opens chart in fullscreen overlay with zoom support
- Responsive sizing, auto-resize on panel drag

Charts displayed:
- etcd fsync WAL/DB duration p99
- etcd peer round trip time
- etcd total leader elections
- etcd request duration p99

Data source: reuses existing Plotly JSON files already generated at metrics/ directory during report processing; no backend changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added Web UI Report Development section to CLAUDE.md covering:
- Architecture (template files, CDN libraries, rendering pattern)
- Testing workflow (build, regenerate, serve, verify)
- Split-pane layout pattern with v-if/v-else rule
- Chart.js integration guide and common pitfalls

Added .claude/skills/webui-report-test skill for the build-regenerate-serve workflow when modifying report templates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a floating chat widget to the OPCT web UI that communicates with Claude via the Anthropic SDK, providing interactive report analysis directly in the browser.

Backend (internal/chat/):
- handler.go: HTTP handlers with SSE streaming, Vertex AI and Anthropic API auto-detection, tool-use loop
- tools.go: 8 tool definitions (report summary, checks, plugin results, test failure logs, etcd metrics, network data)
- session.go: Chat session persistence as JSON files
- prompt.go: Built-in system prompt with file override support

Frontend:
- Floating chat widget with minimize/maximize/close buttons
- Markdown rendering via marked.js, SSE streaming display
- Stop button to cancel mid-stream responses
- Session auto-save and load dropdown
- Quick-start suggestion buttons
- Styled headline badges replacing raw text brackets

Integration:
- Chat API routes registered in pkg/cmd/report/report.go
- Added anthropic-sdk-go, jsonschema, and vertex dependencies

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLAUDE.md:
- Added chatbot architecture section (backend files, API endpoints, auth detection, Vertex region/model gotchas)
- Added floating widget pattern with anti-patterns learned
- Added UI styling patterns (headline badges, font conventions)

Skills:
- Updated webui-report-test with Go server workflow for chat testing
- Added webui-chat-debug skill for API/auth/tool troubleshooting

Agent:
- Added webui-developer agent with full architecture overview, critical rules, file reference, CDN inventory, and how-to guides

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
Important: Review skipped
Ignore keyword(s) in the title.
⛔ Ignored keywords (1)
📝 Walkthrough
This PR integrates a floating AI chat assistant into the OPCT web report UI. It adds backend session persistence, tool definitions for report analysis, SSE-streaming HTTP handlers, and a Vue.js chat widget with markdown rendering. It also adds Chart.js metrics visualization for etcd pages, a resizable split-pane layout, and comprehensive developer documentation.
Changes
Chat Assistant Backend & Frontend
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@.claude/skills/webui-chat-debug/SKILL.md`:
- Around line 69-70: Update the Vertex troubleshooting docs to match the actual
provider auto-detection logic: replace the instruction that tells users to set
CLAUDE_CODE_USE_VERTEX=1 with guidance that the app detects Vertex AI when both
environment variables GOOGLE_CLOUD_LOCATION and ANTHROPIC_VERTEX_PROJECT_ID are
present (and otherwise falls back to checking ANTHROPIC_API_KEY), and advise
using GOOGLE_CLOUD_LOCATION plus ANTHROPIC_VERTEX_PROJECT_ID (or
ANTHROPIC_API_KEY) when troubleshooting the `404 Not Found`/model ID issues
referenced in SKILL.md.
In `@CLAUDE.md`:
- Around line 1338-1350: Update the documentation block to match the enforced
startup detection and region selection: state that chat provider detection
checks for Vertex AI by verifying GOOGLE_CLOUD_LOCATION and
ANTHROPIC_VERTEX_PROJECT_ID (and only uses Vertex when CLAUDE_CODE_USE_VERTEX=1
is enabled), otherwise it falls back to ANTHROPIC_API_KEY for Anthropic API;
clarify region priority to use GOOGLE_CLOUD_LOCATION first (not CLOUD_ML_REGION)
and note that CLOUD_ML_REGION=global is not a valid Vertex/Anthropic endpoint so
it must not be treated as a fallback.
In `@data/templates/report/report.css`:
- Around line 253-254: Remove the deprecated CSS property 'word-wrap:
break-word;' from the ruleset and keep only 'overflow-wrap: break-word;' so the
stylesheet no longer triggers stylelint and avoids redundancy; locate the block
that contains both 'word-wrap' and 'overflow-wrap' and delete the 'word-wrap'
declaration, leaving 'overflow-wrap' unchanged.
In `@go.mod`:
- Line 123: Update the vulnerable module google.golang.org/grpc to v1.79.3 or
later in go.mod and ensure the change is applied to the build by running a
module update (e.g., go get google.golang.org/grpc@v1.79.3) and go mod tidy;
also scan and, if needed, bump transitive modules referencing grpc (notably
google.golang.org/api, cloud.google.com/go/auth, anthropic-sdk-go) so they pull
compatible grpc versions, then run your test suite or CI to verify nothing
breaks.
In `@internal/chat/handler.go`:
- Around line 64-98: NewHandler duplicates provider-detection logic and uses
disallowed fallback env vars; update NewHandler so that when cfg.Provider ==
"vertex" it reads region only from GOOGLE_CLOUD_LOCATION (remove CLOUD_ML_REGION
fallback) and reads projectID only from ANTHROPIC_VERTEX_PROJECT_ID (remove
GOOGLE_CLOUD_PROJECT fallback), and for the "anthropic" branch continue to use
ANTHROPIC_API_KEY; locate and edit the region and projectID assignments in
NewHandler to remove the fallback checks and ensure the
anthropic.NewClient/vertex.WithGoogleAuth calls use the cleaned variables.
- Around line 33-62: DetectConfig currently uses CLAUDE_CODE_USE_VERTEX and
falls back to CLOUD_ML_REGION, which violates guidelines; update DetectConfig to
only enable Vertex when GOOGLE_CLOUD_LOCATION is non-empty and
ANTHROPIC_VERTEX_PROJECT_ID (or GOOGLE_CLOUD_PROJECT fallback if you keep that
behavior) is non-empty, remove any check of CLAUDE_CODE_USE_VERTEX and remove
the CLOUD_ML_REGION fallback so the region comes exclusively from
GOOGLE_CLOUD_LOCATION, and otherwise enable Anthropic when ANTHROPIC_API_KEY is
set (preserving cfg.Enabled, cfg.Provider, and defaultModel behavior).
- Around line 228-232: sendSSE currently ignores the error returned by
fmt.Fprintf; change its signature from sendSSE(w http.ResponseWriter, flusher
http.Flusher, event, data string) to return an error (e.g., sendSSE(...) error),
capture the error from fmt.Fprintf, and if non-nil return it so callers can
handle/log the failure; only call flusher.Flush() after a successful write and
update callers (where sendSSE is invoked) to handle or log the returned error.
- Around line 107-114: The handler currently ignores the error returned by
json.NewEncoder(w).Encode in Handler.handleStatus; change the call to capture
the error (err := json.NewEncoder(w).Encode(...)) and handle it—on error write
an HTTP 500 response (e.g., http.Error(w, "failed to encode status",
http.StatusInternalServerError)) and log the error if a logger is available
(e.g., h.logger or similar); keep the Content-Type header as-is and avoid
writing additional headers after writing the error response.
- Around line 234-266: The handler handleSessions currently ignores errors
returned by json.NewEncoder(w).Encode in both the GET path (encoding sessions)
and the POST path (encoding the id map); update handleSessions so you capture
the error from json.NewEncoder(w).Encode(...) calls, and if non-nil call
http.Error(w, err.Error(), http.StatusInternalServerError) and return to stop
further execution; ensure you reference the same Encoder usage in the GET branch
where sessions variable is encoded and the POST branch where
map[string]string{"id": session.ID} is encoded so both failure paths are
handled.
In `@internal/chat/tools.go`:
- Around line 262-293: The directory matching in ToolExecutor.readTestFile is
too permissive because it uses strings.Contains(entry.Name(), pluginPrefix);
change it to a precise match by validating the segment after the "failures-"
prefix equals pluginPrefix (e.g., split entry.Name() after "failures-" and
compare the first segment to pluginPrefix) or use a strict pattern (e.g., regex
"^failures-"+pluginPrefix+"(?:-|$)") so "10" won't match "100"; update the check
that currently references entry.Name() to use this precise comparison before
constructing filePath.
- Around line 14-24: The GenerateSchema function currently returns
anthropic.ToolInputSchemaParam without setting the required Type field; update
GenerateSchema[T any]() to set Type: anthropic.ToolInputSchemaTypeObject
alongside Properties (i.e., return anthropic.ToolInputSchemaParam{Type:
anthropic.ToolInputSchemaTypeObject, Properties: schema.Properties}), and also
update any hardcoded ToolInputSchemaParam instances in ToolDefinitions() to
include Type: anthropic.ToolInputSchemaTypeObject so all schema params satisfy
the Anthropic SDK requirement.
In `@pkg/run/run.go`:
- Around line 474-481: configVersion is mixing runtime-resolved r.sonobuoyImage
with compile-time pkg.* constants so metadata can be wrong when
--image-repository or per-image overrides are used; replace the
pkg.ControllerImage, pkg.PluginsImage, pkg.CollectorImage, and
pkg.MustGatherMonitoringImage entries with the same runtime-resolved image
values you use when creating the actual manifests (i.e., use the resolver
methods/fields on the run object instead of pkg.* — for example use the
controller image lookup used elsewhere on r, the plugins/collector/must-gather
monitoring image values derived at runtime), so all entries (r.sonobuoyImage and
the controller/plugins/collector/must-gather images) reflect the actual images
after repository/override resolution.
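The stricter directory match requested for readTestFile can be sketched as follows. This is an illustration rather than the PR's code: `matchesPluginDir` is a hypothetical helper, and the directory naming scheme (`failures-<prefix>` optionally followed by `-<suffix>`) is an assumption taken from the regex suggested in the comment above.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPluginDir reports whether a directory name belongs to the plugin
// identified by pluginPrefix. It assumes names look like "failures-<prefix>"
// or "failures-<prefix>-<suffix>", so "10" must not match "failures-100-...".
func matchesPluginDir(name, pluginPrefix string) bool {
	const dirPrefix = "failures-"
	if !strings.HasPrefix(name, dirPrefix) {
		return false
	}
	rest := strings.TrimPrefix(name, dirPrefix)
	// Accept an exact match or a match that ends at a "-" boundary.
	return rest == pluginPrefix || strings.HasPrefix(rest, pluginPrefix+"-")
}

func main() {
	// Example directory names are made up for the demonstration.
	for _, name := range []string{"failures-10-example", "failures-100-example"} {
		fmt.Println(name, matchesPluginDir(name, "10"))
	}
}
```

Comparing on a segment boundary avoids the false positive that a plain `strings.Contains` check would produce.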
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 282a3e0a-0035-4e3c-af60-f4f88940f618
⛔ Files ignored due to path filters (1)
- go.sum is excluded by !**/*.sum
📒 Files selected for processing (13)
- .claude/agents/webui-developer.md
- .claude/skills/webui-chat-debug/SKILL.md
- .claude/skills/webui-report-test/SKILL.md
- CLAUDE.md
- data/templates/report/report.css
- data/templates/report/report.html
- go.mod
- internal/chat/handler.go
- internal/chat/prompt.go
- internal/chat/session.go
- internal/chat/tools.go
- pkg/cmd/report/report.go
- pkg/run/run.go
	golang.org/x/time v0.9.0 // indirect
	google.golang.org/api v0.189.0 // indirect
	google.golang.org/genproto/googleapis/rpc v0.0.0-20240722135656-d784300faade // indirect
	google.golang.org/grpc v1.64.1 // indirect
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Effective grpc version in go.mod =="
rg -n 'google.golang.org/grpc\s+v' go.mod
echo
echo "== Why grpc is in the graph (direct/transitive parents) =="
go mod graph | rg 'google.golang.org/grpc@|@.*\s+google.golang.org/grpc@' | head -n 80
echo
echo "== Known vulnerabilities for current module graph =="
govulncheck ./... || true
Repository: redhat-openshift-ecosystem/opct
Length of output: 6408
🌐 Web query:
google golang org grpc v1.64.1 critical auth bypass vulnerability advisory
💡 Result:
gRPC-Go (google.golang.org/grpc) has a critical authorization-bypass vulnerability, tracked as CVE-2026-33186 / GHSA-p77j-4mvh-x3m3, affecting versions prior to v1.79.3 (so it does not specifically tie to v1.64.1; the affected range is <1.79.3). [1][2]

What's the bug (auth bypass)
- The issue is an "Authorization bypass via missing leading slash in :path" caused by improper input validation of the HTTP/2 :path pseudo-header. [1]
- The gRPC-Go server was too lenient and accepted requests where the :path omitted the mandatory leading slash (for example, "Service/Method" instead of "/Service/Method"). [1][3]
- Although the server could route such requests to the intended handler, authorization interceptors (including the official grpc/authz RBAC implementation in google.golang.org/grpc/authz) evaluated the raw non-canonical path string. [1][3]
- As a result, authorization "deny" rules written for canonical paths (starting with "/") failed to match, allowing bypass if the policy had a fallback "allow" rule. [1][3]

Who is impacted / conditions
- Users employing path-based authorization (notably grpc/authz, or custom interceptors relying on info.FullMethod or grpc.Method(ctx)). [1][3]
- Their authorization policy includes specific deny rules for canonical paths but allows other requests by default (fallback allow). [1][3]
- Exploitation requires the attacker to send raw HTTP/2 frames with malformed :path directly to the gRPC server. [3][1]

Fix / recommended remediation
- Upgrade gRPC-Go to v1.79.3 or newer (or latest master). [1]
- The fix makes gRPC-Go reject any request whose :path does not start with a leading slash by returning codes.Unimplemented before it reaches authorization logic. [1][3]

If you cannot upgrade immediately (mitigations)
- Use a validating interceptor (recommended), apply infrastructure-level normalization/rejection of malformed HTTP/2 :path, and/or harden policies to avoid relying on fallback allow rules. [3][4]

Note: google.golang.org/grpc v1.64.1 is much older than the fixed version; the "auth bypass" advisory above describes affected versions as "< v1.79.3". [1]
Citations:
- 1: GHSA-p77j-4mvh-x3m3
- 2: https://nvd.nist.gov/vuln/detail/CVE-2026-33186
- 3: https://www.tenable.com/cve/CVE-2026-33186
- 4: https://www.yazoul.net/advisory/cve/cve-2026-33186-grpc-go-authorization-bypass-patch-critical-flaw/
🏁 Script executed:
# Check if there are any explicit version constraints in go.mod that pin grpc version
cat -n go.mod | grep -A 5 -B 5 'google.golang.org/grpc'
Repository: redhat-openshift-ecosystem/opct
Length of output: 1335
🏁 Script executed:
# Check go.mod for all constraints and toolchain directive
head -30 go.mod
Repository: redhat-openshift-ecosystem/opct
Length of output: 1052
Upgrade google.golang.org/grpc to v1.79.3 or newer to patch CVE-2026-33186.
google.golang.org/grpc v1.64.1 is affected by a critical authorization-bypass vulnerability (CVE-2026-33186 / GHSA-p77j-4mvh-x3m3) allowing path-based authorization rules to be circumvented. Check if transitive dependencies (google.golang.org/api, cloud.google.com/go/auth, anthropic-sdk-go) can be updated, or evaluate if direct constraints are needed.
This is a transitive dependency from the Anthropic SDK and Google Cloud libraries. The fix (grpc v1.79.3+) needs to come from upstream SDK updates. Not actionable in this PR.
@mtulio — understood, and that's a fair assessment. To add some additional context:
CVE-2026-33186 specifically affects gRPC-Go servers that (1) use path-based authorization interceptors such as the official RBAC implementation in google.golang.org/grpc/authz, and (2) have a security policy with specific "deny" rules for canonical paths but a fallback "allow" rule.
Since OPCT uses google.golang.org/grpc purely as a transitive client-side transport (pulled in by google.golang.org/api, cloud.google.com/go/auth, and anthropic-sdk-go for GCP/Vertex AI communication), and does not expose a gRPC server with authorization interceptors, the vulnerability is not exploitable in this codebase.
Worth noting: Go's MVS does technically allow overriding a transitive dependency by adding a direct require google.golang.org/grpc v1.79.3 line to go.mod without needing upstream SDK changes. However, given the nil exploitability risk here and the potential for API compatibility issues between google.golang.org/api v0.189.0 and a newer grpc version, waiting for upstreams to bump naturally is the right call.
I'll track this as a known transitive dependency issue for awareness when upstream SDKs catch up.
func GenerateSchema[T any]() anthropic.ToolInputSchemaParam {
	reflector := jsonschema.Reflector{
		AllowAdditionalProperties: false,
		DoNotReference:            true,
	}
	var v T
	schema := reflector.Reflect(v)
	return anthropic.ToolInputSchemaParam{
		Properties: schema.Properties,
	}
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the ToolInputSchemaParam definition in the Anthropic SDK
rg -A10 'type ToolInputSchemaParam' --type=go
Repository: redhat-openshift-ecosystem/opct
Length of output: 57
🏁 Script executed:
#!/bin/bash
# First, check the imports and context in tools.go
head -40 internal/chat/tools.go
Repository: redhat-openshift-ecosystem/opct
Length of output: 1418
🏁 Script executed:
#!/bin/bash
# Search for where ToolInputSchemaParam is used in the codebase
rg 'ToolInputSchemaParam' --type=go
Repository: redhat-openshift-ecosystem/opct
Length of output: 507
🏁 Script executed:
#!/bin/bash
# Search for other tool schema patterns or definitions
rg 'Tool.*Schema' --type=go -A3
Repository: redhat-openshift-ecosystem/opct
Length of output: 1293
🏁 Script executed:
#!/bin/bash
# Look for the struct definition - check vendor or go.mod
fd -type f -name "*.go" | xargs grep -l "ToolInputSchemaParam" | head -5
Repository: redhat-openshift-ecosystem/opct
Length of output: 246
🏁 Script executed:
#!/bin/bash
# Check if there's a go.mod or go.sum to understand the Anthropic SDK version
if [ -f go.mod ]; then
  grep -A2 "anthropic" go.mod
fi
Repository: redhat-openshift-ecosystem/opct
Length of output: 239
🏁 Script executed:
#!/bin/bash
# Search for any test files related to tools or schemas
fd -type f -name "*test*.go" internal/chat/
Repository: redhat-openshift-ecosystem/opct
Length of output: 246
🏁 Script executed:
#!/bin/bash
# Look for any comments or documentation about the ToolInputSchemaParam structure
rg -B5 "InputSchema.*ToolInputSchemaParam" --type=go
Repository: redhat-openshift-ecosystem/opct
Length of output: 1783
🌐 Web query:
Anthropic SDK Go ToolInputSchemaParam schema fields required
💡 Result:
In the Anthropic Go SDK, ToolInputSchemaParam’s schema-required fields are: 1) type (required): ToolInputSchemaParam has a field Type tagged as json:"type,required".[1] 2) properties (not marked required in the struct): ToolInputSchemaParam also includes Properties tagged as json:"properties" (no ,required tag).[1] Additional note: the SDK example for tools defines the custom tool input schema via BetaToolInputSchemaParam{ Properties:..., } without specifying a Required list on that struct.[2] So, from the Go SDK type definition, only type is required; properties is optional in the struct (though your schema will often include it in practice).[1]
Citations:
- 1: anthropics/anthropic-sdk-go@79e4d75
- 2: https://github.com/anthropics/anthropic-sdk-go/blob/main/tools.md
🌐 Web query:
anthropic-sdk-go v1.43.0 ToolInputSchemaParam struct definition
💡 Result:
In anthropic-sdk-go v1.43.0, the ToolInputSchemaParam struct is defined as: type ToolInputSchemaParam struct { Type param.Field[ToolInputSchemaType] json:"type,required" Properties param.Field[interface{}] json:"properties" ExtraFields map[string]interface{} json:"-,extras" } func (r ToolInputSchemaParam) MarshalJSON (data []byte, err error) { return apijson.MarshalRoot(r) } And the only known ToolInputSchemaType is: type ToolInputSchemaType string const ( ToolInputSchemaTypeObject ToolInputSchemaType = "object") func (r ToolInputSchemaType) IsKnown bool { switch r { case ToolInputSchemaTypeObject: return true } return false } Notes: - The Type field is required and is serialized with json:"type,required". - Properties is serialized under properties. - ExtraFields is not serialized directly (tag json:"-,extras") and is used for additional/unmodeled schema fields. These definitions are visible in the SDK source (ToolInputSchemaParam and ToolInputSchemaTypeObject). [1]
Citations:
Set the required Type field in ToolInputSchemaParam to "object".
The Type field is required by the Anthropic SDK (v1.43.0). The GenerateSchema function currently omits it, returning only the Properties field. Update the function to include Type: anthropic.ToolInputSchemaTypeObject:
Suggested fix
func GenerateSchema[T any]() anthropic.ToolInputSchemaParam {
	reflector := jsonschema.Reflector{
		AllowAdditionalProperties: false,
		DoNotReference:            true,
	}
	var v T
	schema := reflector.Reflect(v)
	return anthropic.ToolInputSchemaParam{
		Type:       anthropic.ToolInputSchemaTypeObject,
		Properties: schema.Properties,
	}
}
Also update the hardcoded ToolInputSchemaParam instances in ToolDefinitions() to set the Type field.
The current usage follows the official Anthropic SDK examples and compiles correctly with the invopop/jsonschema reflector. The ToolInputSchemaParam struct with Properties field works as documented in the SDK's tool-calling example.
@mtulio, you're right — I was mistaken. The official Anthropic SDK's own test code constructs ToolInputSchemaParam with only Properties and Required fields, without setting Type, which confirms your usage is consistent with the SDK's documented patterns. The SDK's README also demonstrates using invopop/jsonschema alongside ToolInputSchemaParam in exactly this way.
I'll withdraw this comment.
(ㆆ _ ㆆ)
	"cli-version":                  version.Version.Version,
	"cli-commit":                   version.Version.Commit,
	"sonobuoy-version":             buildinfo.Version,
	"sonobuoy-image":               r.sonobuoyImage,
	"controller-image":             pkg.ControllerImage,
	"plugin-image-tests":           pkg.PluginsImage,
	"plugin-image-collector":       pkg.CollectorImage,
	"plugin-image-must-gather-mon": pkg.MustGatherMonitoringImage,
Use runtime-resolved image values in configVersion to avoid incorrect metadata.
configVersion currently mixes runtime (r.sonobuoyImage) and compile-time constants (pkg.*). If --image-repository or per-image overrides are used, these keys will report wrong images.
🔧 Proposed fix
  	Data: map[string]string{
  		"cli-version":                  version.Version.Version,
  		"cli-commit":                   version.Version.Commit,
  		"sonobuoy-version":             buildinfo.Version,
  		"sonobuoy-image":               r.sonobuoyImage,
  		"controller-image":             pkg.ControllerImage,
- 		"plugin-image-tests":           pkg.PluginsImage,
- 		"plugin-image-collector":       pkg.CollectorImage,
- 		"plugin-image-must-gather-mon": pkg.MustGatherMonitoringImage,
+ 		"plugin-image-tests":           r.PluginsImage,
+ 		"plugin-image-collector":       r.CollectorImage,
+ 		"plugin-image-must-gather-mon": r.MustGatherMonitoringImage,
  	},
  }
Good catch. The configVersion ConfigMap is intended to record the default image versions shipped with the binary, not the runtime-resolved ones. The runtime overrides (--image-repository) are a dev/disconnected feature — recording the defaults is correct for tracking which OPCT version generated the results. The sonobuoy-image using r.sonobuoyImage is the exception since it was already there before this change. Will consider aligning in a follow-up if needed.
@mtulio, thanks for clarifying the design intent — that makes sense. Recording the default/shipped image versions in configVersion provides a stable audit trail tied to the OPCT binary version, regardless of deployment-time overrides. The sonobuoy-image inconsistency (using r.sonobuoyImage) is noted as a pre-existing exception worth aligning in a follow-up.
(ᐢ•ᴗ•ᐢ)
- Simplify Vertex AI detection: remove CLAUDE_CODE_USE_VERTEX gate, detect directly from GOOGLE_CLOUD_LOCATION + ANTHROPIC_VERTEX_PROJECT_ID
- Extract vertexRegion/vertexProjectID helpers to eliminate duplicated env var resolution between DetectConfig and NewHandler
- Add error checks on json.Encode and fmt.Fprintf (errcheck linter)
- Remove deprecated word-wrap CSS property (keep overflow-wrap)
- Fix readTestFile directory matching to use HasPrefix with full pattern instead of Contains to avoid false positives
- Update CLAUDE.md and webui-chat-debug skill to match detection logic

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
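The simplified provider detection described in this commit might look roughly like the sketch below. It is not the PR's DetectConfig: the `Config` type and `detectConfig` signature are assumptions made for illustration, and the lookup function is injected so the logic can be exercised without touching the real environment. The environment variable names and the "vertex"/"anthropic" provider values are the ones quoted in the review.

```go
package main

import (
	"fmt"
	"os"
)

// Config is a hypothetical, reduced version of the chat configuration.
type Config struct {
	Enabled  bool
	Provider string
}

// detectConfig selects Vertex AI when both Vertex variables are set,
// falls back to the Anthropic API key, and otherwise disables chat.
func detectConfig(getenv func(string) string) Config {
	if getenv("GOOGLE_CLOUD_LOCATION") != "" && getenv("ANTHROPIC_VERTEX_PROJECT_ID") != "" {
		return Config{Enabled: true, Provider: "vertex"}
	}
	if getenv("ANTHROPIC_API_KEY") != "" {
		return Config{Enabled: true, Provider: "anthropic"}
	}
	return Config{}
}

func main() {
	cfg := detectConfig(os.Getenv)
	fmt.Printf("enabled=%v provider=%q\n", cfg.Enabled, cfg.Provider)
}
```

Passing `os.Getenv` as a parameter keeps the decision logic pure, which makes the precedence between the two providers easy to unit test.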
Addressed CodeRabbit review feedback
Pushed commit
Fixed
Not addressed (intentional)
🤖 Generated with Claude Code
🤖 Generated by Claude Code — AI assistant |
…skill

Establish mandatory AI sign-off for all GitHub interactions (commits, PR descriptions, comments, review replies) to standardize AI SDLC traceability in the repository.

New agents:
- opct-developer: general project context, conventions, key files
- pr-reviewer: review checklist for code, web UI, chat, and security

Updated:
- CLAUDE.md: expanded AI footer into sign-off standard with formats for commits and GitHub interactions
- webui-developer agent: added sign-off reference

New skill:
- go-validate: standard build/test/vet checklist before committing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Simplify AI sign-off to minimal format: - Commits: Co-Authored-By trailer only, no emoji footer - GitHub comments: end with "— AI Claude" Updated CLAUDE.md, all agents (opct-developer, pr-reviewer, webui-developer) to use the simplified format. Co-Authored-By: Claude <noreply@anthropic.com>
Agent (.claude/agents/ci-triage.md):
- Full triage decision tree: parse job metadata, analyze failures, check job history, query flake rates, check existing Jira bugs
- Job naming pattern rules for both OPCT repo and release repo jobs
- Jira bug template with required fields (OCPBUGS, OPCT-400 parent, splatteam label, release blocker, affects version)
- Draft-then-approve workflow for bug filing

Skill (.claude/skills/ci-triage/SKILL.md):
- Entry point for "triage <PROW_URL>" invocations
- Documents marketplace skills used and expected output format
- Example URLs for all 4 job naming patterns

Co-Authored-By: Claude <noreply@anthropic.com>
Add prerequisites section documenting:
- Jira MCP server setup (uvx mcp-atlassian)
- Fallback workflow when MCP is not configured
- mcp__atlassian__jira_create_issue call template with correct OCPBUGS fields (versions, labels, security, parent link)

Co-Authored-By: Claude <noreply@anthropic.com>
Fixes discovered while triaging OCPBUGS-85700 (4.22 vSphere upgrade):
- Fix MCP tool name: mcp__atlassian__ → mcp__jira__ (was causing runtime failures when filing bugs)
- Add Skill and mcp__jira__* to SKILL.md allowed-tools (marketplace skills could not be invoked without it)
- Add needs-triage label to bug template
- Add INFRA_FAILURE classification for infrastructure/provisioning failures (VM creation, bootstrap timeout, lease errors)
- Document Jira field IDs: Release Blocker (customfield_10847), Component OPCT/Other (id 14860), Security Level
- Fix parent link: use native parent field instead of customfield_10018 which does not work cross-project (OCPBUGS → OPCT-400)
- Switch description template to Jira wiki markup format
- Add REST API fallback with redhat.atlassian.net (issues.redhat.com 301-redirects and drops POST body)
- Clarify flake threshold: "Sippy pass rate below 95%" instead of ambiguous "flake rate >5%"
- Improve bug search: JQL via mcp__jira__jira_search as primary, ci:check-if-jira-regression-is-ongoing as secondary
- Add cross-reference from SKILL.md to agent definition

Co-Authored-By: Claude <noreply@anthropic.com>
OPCT CI job failures are not OCP product bugs; they are workflow failures that should be tracked in the OPCT project to avoid polluting OCP project owners' views.
Changes:
- Bug filing target: OCPBUGS → OPCT project
- Parent OPCT-400 can now be set directly during creation (same project)
- Removed OCPBUGS-specific fields: Release Blocker, Components, Affects Version, Security Level (not needed in OPCT project)
- Existing bug search queries both projects (OPCT + OCPBUGS) to catch bugs filed under either project
- Simplified MCP tool call (no separate parent update step needed)
Existing bugs (OCPBUGS-85700, OCPBUGS-85823) remain unchanged.
Co-Authored-By: Claude <noreply@anthropic.com>
Add standardized labels and version fields for CI triage bugs filed
in the OPCT project:
- openshift-{X.Y}: OCP version from job name (e.g., openshift-4.17)
- opct-{X.Y}: OPCT CLI version from build log (e.g., opct-0.6)
- fixVersions: opct-vX.Y.Z matching OPCT project versions
OPCT version is extracted from build log (OPCT CLI: vX.Y.Z or
quay.io/opct/opct:vX.Y.Z). Left blank if not found in the logs.
Co-Authored-By: Claude <noreply@anthropic.com>
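The label and fixVersions derivation described in this commit could be sketched in Go roughly as follows. This is only an illustration under assumptions: `triageFields`, `buildTriageFields`, and the regular expression are hypothetical names, not code from the PR; only the two log patterns and the label formats come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
)

// opctVersionRe matches the two build-log patterns named in the commit
// message: "OPCT CLI: vX.Y.Z" and "quay.io/opct/opct:vX.Y.Z".
// The exact expression is an assumption made for this sketch.
var opctVersionRe = regexp.MustCompile(`(?:OPCT CLI:\s*|quay\.io/opct/opct:)v(\d+)\.(\d+)\.(\d+)`)

// triageFields is a hypothetical container for the Jira fields.
type triageFields struct {
	Labels     []string
	FixVersion string // left blank when no OPCT version is found
}

// buildTriageFields derives labels and fixVersions from the OCP version
// (taken from the job name) and the raw build log.
func buildTriageFields(ocpVersion, buildLog string) triageFields {
	f := triageFields{}
	if ocpVersion != "" {
		f.Labels = append(f.Labels, "openshift-"+ocpVersion)
	}
	m := opctVersionRe.FindStringSubmatch(buildLog)
	if m == nil {
		// No OPCT version in the log: leave the OPCT fields blank.
		return f
	}
	f.Labels = append(f.Labels, fmt.Sprintf("opct-%s.%s", m[1], m[2]))
	f.FixVersion = fmt.Sprintf("opct-v%s.%s.%s", m[1], m[2], m[3])
	return f
}

func main() {
	f := buildTriageFields("4.17", "INFO OPCT CLI: v0.6.1 starting")
	fmt.Printf("%+v\n", f)
}
```

Returning early when nothing matches keeps the "left blank if not found" behavior explicit instead of emitting a placeholder label.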
Replace the generic "AI Assistant is not available" message with detailed setup instructions showing both authentication options:
1. Anthropic API direct (ANTHROPIC_API_KEY) with link to console
2. Google Cloud Vertex AI (GOOGLE_CLOUD_LOCATION + PROJECT) with link to Vertex AI Claude docs
The message renders as markdown with code blocks for easy copy-paste of the required environment variables.
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix "filted" → "filtered" in Replay step filter header
- Fix "Github" → "GitHub" with missing article in warning banner
- Fix "disablied" → "disabled" in TODO comment
- Fix "camgd" → "camgi" in CAMGI TODO note
- Fix "ALl" → "All" in Tests iframe title
- Fix "BaslineAPI" → "BaselineAPI" in comment
- Fix malformed </> close tags → proper </p> (4 occurrences)
- Fix mismatched </p></b> nesting → </b></p>
- Fix double <p><p> tag in conformance plugin results
- Normalize capitalization in error counter descriptions
Co-Authored-By: Claude <noreply@anthropic.com>
When plugins finish out of expected order or crash before reporting
back to the aggregator server, the delta time computation produced
invalid values:
- Plugin 80 (openshift-tests-replay): empty time when it never
finishes (no PUT event in meta/run.log)
- Plugin 99 (artifacts-collector): 2562047h47m16s (math.MaxInt64 ns)
because its predecessor (plugin 80) never finished, causing
diffDate("", time) to subtract from zero time
Fix: replace hardcoded switch block with a predecessor lookup table
that falls back to the plugin's own start time when the predecessor
never finished. Also add frontend fallback to display "-" for plugins
with no runtime entry.
Fixes: OPCT-408
Co-Authored-By: Claude <noreply@anthropic.com>
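A minimal Go sketch of the predecessor lookup with fallback described in this commit is shown below. The type `pluginTimes`, the map `predecessors`, the keys `plugin-80` and `plugin-99`, and the helper names are hypothetical and chosen for illustration; the PR's actual code may be structured differently. The guard on a zero start time reflects that `time.Time.Sub` saturates at the maximum `Duration` when the operands are too far apart, which is consistent with the 2562047h47m16s value reported above.

```go
package main

import (
	"fmt"
	"time"
)

// pluginTimes records when a plugin started and finished.
// A zero Finished value means the plugin never reported back.
type pluginTimes struct {
	Started  time.Time
	Finished time.Time
}

// predecessors maps a plugin to the plugin expected to finish right
// before it. Keys are illustrative, not the real plugin names.
var predecessors = map[string]string{
	"plugin-99": "plugin-80",
}

// pluginRuntime returns the plugin's runtime, measured from the
// predecessor's finish time, or from the plugin's own start time when
// the predecessor never finished. The boolean is false when no runtime
// can be computed, so a caller can display "-".
func pluginRuntime(name string, times map[string]pluginTimes) (time.Duration, bool) {
	cur, ok := times[name]
	if !ok || cur.Finished.IsZero() {
		return 0, false
	}
	begin := cur.Started
	if prev, ok := times[predecessors[name]]; ok && !prev.Finished.IsZero() {
		begin = prev.Finished
	}
	if begin.IsZero() {
		// Subtracting the zero time would saturate at the max Duration.
		return 0, false
	}
	return cur.Finished.Sub(begin), true
}

// exampleTimes builds a scenario where plugin-80 never finishes.
func exampleTimes() map[string]pluginTimes {
	base := time.Date(2024, 1, 1, 10, 0, 0, 0, time.UTC)
	return map[string]pluginTimes{
		"plugin-80": {Started: base},
		"plugin-99": {Started: base.Add(30 * time.Minute), Finished: base.Add(35 * time.Minute)},
	}
}

func main() {
	times := exampleTimes()
	for _, name := range []string{"plugin-80", "plugin-99"} {
		if d, ok := pluginRuntime(name, times); ok {
			fmt.Println(name, d)
		} else {
			fmt.Println(name, "-")
		}
	}
}
```

A table keyed by plugin name keeps the ordering in data rather than in a switch block, so a missing or crashed predecessor is handled by one fallback path instead of per-case code.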
The runtime counter in `opct status -w` displayed nanosecond precision (e.g., `6m13.274013714s`) which is noisy and hard to read. Truncate to whole seconds for cleaner output (e.g., `6m13s`). Co-Authored-By: Claude <noreply@anthropic.com>
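The truncation described here maps onto `time.Duration.Truncate` from the Go standard library. The snippet below is a small sketch, with `formatRuntime` as a hypothetical helper name rather than the function used in the PR.

```go
package main

import (
	"fmt"
	"time"
)

// formatRuntime drops sub-second precision before formatting.
// Truncate rounds toward zero to a multiple of the given unit.
func formatRuntime(d time.Duration) string {
	return d.Truncate(time.Second).String()
}

func main() {
	d := 6*time.Minute + 13*time.Second + 274013714*time.Nanosecond
	fmt.Println(d)                // 6m13.274013714s
	fmt.Println(formatRuntime(d)) // 6m13s
}
```

`Truncate` is used here instead of `Round` so the counter never displays a second that has not fully elapsed.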
New skills:
- jira-ops: shared Jira operations skill (MCP-first with REST API
fallback). Documents link type "Related" (not "Relates"), comment
and issue creation patterns. Replaces hardcoded REST API details
in ci-triage agent.
- opct-runtime: plugin runtime architecture reference (execution
order, dependency chain timing, sonobuoy annotation flow, key
files in plugins repo). Links to OPCT-409 for improvement tracking.
Updated agents:
- ci-triage: replaced inline MCP/REST sections with jira-ops skill
reference, added link type fix ("Related" not "Relates"), added
related skills section
- opct-developer: added related skills section referencing
opct-runtime, jira-ops, and ci-triage
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
opct-report.html
Changes
Plugin image references (pkg/run/run.go)
Add controller-image, plugin-image-collector, and plugin-image-must-gather-mon to the configVersion ConfigMap so all plugin images are visible in the cluster configuration.
Metrics tab fix (data/templates/report/report.html)
Replace hardcoded split('/opct-report.html') with a regex to derive the base URL, fixing the Metrics tab which failed when the page was served as index.html.
etcd metric charts (data/templates/report/)
Embed Prometheus metric charts inline on the etcd page using Chart.js (~200KB, replacing Plotly ~3.5MB):
metrics/ directory, no backend changes
AI assistant chatbot (internal/chat/, data/templates/report/)
Floating chat widget powered by Claude via the Anthropic SDK:
<report-dir>/chat-sessions/
UX improvements (data/templates/report/)
|> OpenShift[...] Kubernetes[...] text
Documentation (.claude/, CLAUDE.md)
webui-report-test, added webui-chat-debug
webui-developer with architecture overview, critical rules, and how-to guides
Test plan
make build succeeds
make test passes
🤖 Generated with Claude Code