Detect upstream Tekton changes that break OpenShift Pipelines — before they reach the midstream.
streamstress is a Rust CLI that builds upstream Tekton components from source, swaps the resulting images into a live OpenShift Pipelines operator deployment, and runs the release-tests suite to catch regressions early.
The tool automates a workflow that would otherwise require many manual steps:
- **Auto-Setup** — Enables the OCP internal image registry route, installs the OpenShift Pipelines operator via OLM if missing, and creates the target namespace with image-pull RBAC. All steps are idempotent — safe to run on already-configured clusters.
- **Build** — Shallow-clones upstream Tekton repos, builds container images using `ko` (or `docker` for console-plugin), and pushes them to the OCP internal registry under the `tekton-upstream` namespace. Multiple components build in parallel via tokio.
- **Deploy** — Finds the OpenShift Pipelines operator's ClusterServiceVersion (CSV), patches `IMAGE_*` environment variables to point at the upstream-built images (using the internal `image-registry.openshift-image-registry.svc:5000` address), then deletes the relevant TektonInstallerSets to force the operator to re-reconcile with the new images. Waits for the TektonConfig Ready condition and verifies pod images match.
- **Test** — Clones the `release-tests` repo, runs Gauge specs with streaming output (piped to both terminal and log files), parses results from JUnit XML (or falls back to Gauge stdout parsing when XML is unavailable), and categorizes failures.
- **Report** — Writes structured JSON results with per-test pass/fail, duration, error messages, and failure categorization. Optionally includes per-spec resource profiles (CPU, memory, pod count) for parallelism planning.
- **Publish** — Pushes results to a `gh-pages` branch with a self-contained HTML dashboard for trend analysis across runs.
OLM (Operator Lifecycle Manager) owns the operator Deployments. Direct Deployment patches get reverted. The correct approach is to patch the CSV, which OLM then propagates to Deployments. After patching, InstallerSets are deleted to force the operator controller to re-create them with the new image references.
Pods running in the cluster can't authenticate to the external registry route. The `IMAGE_*` env vars therefore use the internal service address (`image-registry.openshift-image-registry.svc:5000`) so that operator-managed pods can pull the upstream-built images without extra auth configuration.
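The deploy step's two ingredients — the in-cluster image reference and the `IMAGE_*` env override — can be sketched in Rust. This is illustrative only: the env var name `IMAGE_PIPELINES_CONTROLLER` and the `controller:dev` image are made-up examples, and the real CLI applies the patch through the cluster API rather than building raw JSON strings.

```rust
/// Internal registry service address used in IMAGE_* values so that
/// operator-managed pods can pull images without extra auth configuration.
const INTERNAL_REGISTRY: &str = "image-registry.openshift-image-registry.svc:5000";

/// Build the in-cluster reference for an upstream-built image.
fn internal_image_ref(namespace: &str, image: &str, tag: &str) -> String {
    format!("{INTERNAL_REGISTRY}/{namespace}/{image}:{tag}")
}

/// Render one IMAGE_* env override as the JSON object shape a CSV env
/// patch carries (sketch; env var name below is hypothetical).
fn env_override(name: &str, value: &str) -> String {
    format!(r#"{{"name":"{name}","value":"{value}"}}"#)
}

fn main() {
    let image = internal_image_ref("tekton-upstream", "controller", "dev");
    println!("{}", env_override("IMAGE_PIPELINES_CONTROLLER", &image));
}
```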
| Component | Upstream Repo | Build | InstallerSet Prefix |
|---|---|---|---|
| pipeline | tektoncd/pipeline | ko | pipeline |
| triggers | tektoncd/triggers | ko | trigger |
| chains | tektoncd/chains | ko | chain |
| results | tektoncd/results | ko | result |
| manual-approval-gate | openshift-pipelines/manual-approval-gate | ko | manualapprovalgate |
| console-plugin | openshift-pipelines/console-plugin | docker | tekton-config-console-plugin-manifests |
Component configuration lives in `config/components.toml` — each entry maps upstream repo URLs, ko import paths, and the `IMAGE_*` env var names used by the operator.
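An entry might look like the following sketch. The field names here are hypothetical stand-ins for the three mappings just described (repo URL, ko import paths, operator env vars), not the tool's actual schema:

```toml
# Hypothetical shape of a config/components.toml entry
[components.pipeline]
repo = "https://github.com/tektoncd/pipeline"
ko_import_paths = ["github.com/tektoncd/pipeline/cmd/controller"]
image_env_vars = ["IMAGE_PIPELINES_CONTROLLER"]   # name is illustrative
```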
- An OpenShift 4.x cluster with cluster-admin access
- `oc`, `ko`, `git`, `go`, `gauge` (with go and xml-report plugins)
- Rust toolchain (for building the CLI)

The CLI auto-enables the registry route and installs the OpenShift Pipelines operator if missing. Pass `--no-auto-setup` to skip this.
```sh
# Build the CLI
cargo build --release

# Check prerequisites and cluster connectivity
streamstress check

# Full build → deploy → test for one component
streamstress run --components pipeline

# All components
streamstress run --components pipeline,triggers,chains,results,manual-approval-gate

# Pin specific git refs (branch, tag, PR, or commit)
streamstress run --components "pipeline:v0.62.0,triggers:pr/123"

# Dry run — resolve refs, show plan, don't execute
streamstress run --components pipeline,triggers --dry-run
streamstress run --components pipeline --dry-run --json

# With resource profiling (metrics-server required)
streamstress run --components pipeline --profile

# Individual stages
streamstress build --component pipeline
streamstress deploy --component pipeline --registry <registry-route>/tekton-upstream
streamstress test --release-tests-ref master

# Re-analyze past results
streamstress results --output-dir ./test-output

# In-cluster Job management
streamstress status
streamstress logs
streamstress logs --job streamstress-1706900000

# Publish results to dashboard
streamstress publish --label "upstream pipeline @ main"
```

| Command | Description |
|---|---|
| `check` | Verify tool prerequisites (oc, ko, git, go), cluster auth, operator, registry. Shows `[auto-fixable]` for items that auto-setup can resolve. |
| `build` | Clone upstream repo, build images with ko/docker, push to OCP internal registry. |
| `deploy` | Patch operator CSV with upstream image refs, delete InstallerSets, wait for reconciliation. |
| `test` | Clone release-tests, run Gauge specs, parse JUnit XML or stdout, categorize failures. |
| `run` | Full orchestration: auto-setup → parallel builds → in-cluster Job for deploy+test. |
| `results` | Offline re-analysis of a previous test run's output directory. |
| `status` | List streamstress Jobs in the cluster with status and age. |
| `logs` | Stream logs from the most recent (or named) Job pod. |
| `publish` | Push results JSON + dashboard assets to `gh-pages` orphan branch. |
```
local machine                           cluster
─────────────                           ───────
auto-setup ──────────────────────────►  registry route, operator, RBAC
parallel ko builds ──────────────────►  push images to internal registry
create Job (--skip-build) ───────────►  Job deploys + tests
status / logs ◄──────────────────────   stream results back
```
The `run` subcommand builds images locally (in parallel via a tokio JoinSet), then creates a Kubernetes Job that runs the deploy and test phases in-cluster. The Job uses a cached CLI container image (rebuilt only on version bumps). Use `status` and `logs` to monitor.
Run `build`, `deploy`, and `test` separately for full local control.
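The fan-out/join pattern behind the parallel builds can be sketched as follows. This is a simplified stand-in: plain std threads replace the tokio JoinSet the tool actually uses, and the build itself is stubbed out rather than shelling out to `ko`:

```rust
use std::thread;

/// Stand-in for a component image build (the real CLI shells out to
/// `ko` or `docker` and pushes to the internal registry).
fn build_component(name: &'static str) -> Result<String, String> {
    Ok(format!("{name}: image built"))
}

/// Fan out one build per component, then join and collect the results.
fn build_all(components: &[&'static str]) -> Vec<Result<String, String>> {
    let handles: Vec<_> = components
        .iter()
        .map(|&c| thread::spawn(move || build_component(c)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("build thread panicked"))
        .collect()
}

fn main() {
    for result in build_all(&["pipeline", "triggers"]) {
        println!("{result:?}");
    }
}
```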
With `--profile`, the CLI collects per-spec resource usage via the metrics-server API:

- Polls `PodMetrics` during test execution
- Detects spec boundaries in Gauge output
- Computes min/max/avg/p95 CPU and memory per spec
- Calculates maximum safe parallelism: `(allocatable - baseline) × 80% / peak_per_spec`
- Reports the limiting resource (CPU vs memory) and the reasoning

Output is written to `test-output/results/resource-profile.json`.
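The parallelism formula above, as a small Rust function. The 80% headroom factor is the one stated; the sample numbers in `main` are made up:

```rust
/// Maximum safe parallelism per the stated formula:
/// (allocatable - baseline) * 80% / peak_per_spec, floored.
fn max_parallelism(allocatable: f64, baseline: f64, peak_per_spec: f64) -> u32 {
    if peak_per_spec <= 0.0 {
        return 0;
    }
    let headroom = (allocatable - baseline) * 0.8;
    (headroom / peak_per_spec).floor().max(0.0) as u32
}

fn main() {
    // Made-up numbers: 16 CPUs allocatable, 4 in use at baseline,
    // specs peaking at 1.5 CPUs => (16 - 4) * 0.8 / 1.5 = 6.4 -> 6
    println!("{}", max_parallelism(16.0, 4.0, 1.5));
}
```

In the tool this is computed independently for CPU and memory, and the smaller bound becomes the limiting resource it reports.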
Test failures are automatically categorized by keyword matching:
| Category | Trigger | Meaning |
|---|---|---|
| Missing Optional Components | "chains", "manual-approval", "knative" | Test needs a component not deployed |
| Upgrade Prerequisites | "upgrade" + "namespace"/"setup" | Test assumes an upgrade path |
| Upstream Regression | (default) | Likely upstream code change |
| Platform Issue | "uid_map", "buildah+namespace" | Cluster/infra problem |
| Configuration Gap | missing secrets/auth | Missing setup or RBAC |
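A minimal sketch of keyword categorization along the lines of the table above; the exact keyword lists and precedence in the CLI may differ:

```rust
/// Assign a failure category from keywords in the error text.
/// Sketch only; matching rules and their order are illustrative.
fn categorize(error: &str) -> &'static str {
    let e = error.to_lowercase();
    if ["chains", "manual-approval", "knative"].iter().any(|k| e.contains(k)) {
        "Missing Optional Components"
    } else if e.contains("upgrade") && (e.contains("namespace") || e.contains("setup")) {
        "Upgrade Prerequisites"
    } else if e.contains("uid_map") || (e.contains("buildah") && e.contains("namespace")) {
        "Platform Issue"
    } else if e.contains("secret") || e.contains("auth") {
        "Configuration Gap"
    } else {
        // No keyword matched: most likely an upstream code change
        "Upstream Regression"
    }
}

fn main() {
    println!("{}", categorize("chains controller pod not found"));
}
```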
GitHub Actions workflow at `.github/workflows/streamstress-run.yml`:
```sh
gh workflow run streamstress-run.yml \
  -f cluster_api_url="https://api.cluster.example.com:6443" \
  -f cluster_password="..." \
  -f components="pipeline,triggers,chains,results,manual-approval-gate" \
  -f release_tests_ref="master" \
  -f cli_flags="--profile"
```

The workflow installs all tools, builds the CLI, authenticates to the cluster and registry, runs the full cycle, and publishes results to GitHub Pages.
Published via `streamstress publish`, the dashboard provides:
- D3.js trend charts (pass/fail rates across runs)
- Per-test expandable result tables
- Failure categorization breakdown
- Run comparison and regression highlighting
- Reactive filters with URL state persistence
- Resource usage overlay (when profiling data available)
View at https://<org>.github.io/<repo>/ after publishing.
| Code | Meaning |
|---|---|
| 0 | All tests passed |
| 1 | Some tests failed |
| 2 | Build or infrastructure error |
| 3 | Both build error and test failures |
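The exit-code matrix maps cleanly onto two booleans; a sketch:

```rust
/// Map the two failure signals to the documented exit codes.
fn exit_code(build_or_infra_error: bool, tests_failed: bool) -> i32 {
    match (build_or_infra_error, tests_failed) {
        (false, false) => 0, // all tests passed
        (false, true)  => 1, // some tests failed
        (true,  false) => 2, // build or infrastructure error
        (true,  true)  => 3, // both build error and test failures
    }
}

fn main() {
    std::process::exit(exit_code(false, false));
}
```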
Apache-2.0
