
Complete certificate chain with root CA for external registries in UpdateService#535

Open
rporres wants to merge 2 commits into
mainfrom
531-again


@rporres

@rporres rporres commented Jun 22, 2026

Contributor

The previous TLS chain completion for resolve-quay-registry-ca relied on
scanning the LZ system trust store: O(n) openssl subprocess calls per CA
in the bundle, silent failure for custom/private CAs absent from the store,
and a result dependent on trust store state that may be stale in
long-running disconnected deployments.

Replace the system store lookup with an explicit CA from sslCACertificate,
passed via --certificates-config (reads from certificates.yaml) or --ca-pem
(inline PEM) on resolve-quay-registry-ca; the options are mutually
exclusive. The Ansible task that invokes the command is updated accordingly.
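
A minimal sketch of the mutual exclusivity between the two options. The PR wires them through Click in cli.py; the version below uses the standard-library argparse purely for illustration, and the `build_parser` and `parse` helpers are hypothetical names, not code from this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the real Click subcommand."""
    parser = argparse.ArgumentParser(prog="resolve-quay-registry-ca")
    # At most one of the two CA sources may be given on the command line.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ca-pem", default="", help="inline PEM of the CA")
    group.add_argument(
        "--certificates-config",
        default="",
        help="path to certificates.yaml holding sslCACertificate",
    )
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv; argparse exits with status 2 when both options are given."""
    return build_parser().parse_args(argv)
```

Passing both options makes argparse print a usage error and exit with a non-zero status, which is the behavior the description asks of the real command.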

Add check-certificate-chains (enclave tools subcommand) to validate chain
completeness and CA consistency before installation:

  • chain ends with a non-self-signed certificate and sslCACertificate is
    absent → error
  • only a leaf certificate present and sslCACertificate is absent → error
  • chain ends with a self-signed root but sslCACertificate does not verify
    it → error (CA hierarchy mismatch)
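
The three rules above can be sketched as a pure function. This is an illustration only: `check_chain` and its injected `is_self_signed` / `verifies` callables are hypothetical stand-ins for the real helpers in cert_utils.py, and the message wording is assumed. It covers only the three listed rules; verifying a non-self-signed chain end against the supplied CA is left out.

```python
from typing import Callable


def check_chain(
    field: str,
    certs: list[str],
    ca_pem: str,
    is_self_signed: Callable[[str], bool],
    verifies: Callable[[str, str], bool],
) -> list[str]:
    """Return issue strings for one chain field; an empty list means no issue."""
    if not certs:
        # Assumption: an empty or unset chain is reported by other validation.
        return []
    last = certs[-1]
    if is_self_signed(last):
        # Rule 3: a self-signed root must be verified by the configured CA, if any.
        if ca_pem and not verifies(ca_pem, last):
            return [
                f"{field}: sslCACertificate does not verify the chain root "
                "(CA hierarchy mismatch)"
            ]
        return []
    if ca_pem:
        # The explicit CA completes the chain; no issue under the three rules.
        return []
    if len(certs) == 1:
        # Rule 2: leaf only, no CA to complete the chain.
        return [
            f"{field}: only a leaf certificate is present and "
            "sslCACertificate is absent"
        ]
    # Rule 1: chain stops at a non-self-signed certificate, no CA supplied.
    return [
        f"{field}: chain ends with a non-self-signed certificate and "
        "sslCACertificate is absent"
    ]
```

Injecting the two predicates keeps the rule logic testable without spawning openssl.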

Certificate utilities (openssl_verify, is_self_signed, pem_blocks) are
consolidated in cert_utils.py for reuse across tools. validations.sh gains
checkCACert() for PEM format and expiry checks. Documentation and example
config updated to clarify when sslCACertificate is required and how chain
validation works.

Summary by CodeRabbit

  • New Features

    • Added certificate-chain validation during installation to catch incomplete or inconsistent SSL certificate setups earlier.
    • Added a command to manually check certificate chains from a certificates config file.
    • Improved Quay registry CA resolution to work with an explicitly supplied CA certificate when needed.
  • Bug Fixes

    • Better handling of self-signed roots, missing intermediates, and chain verification failures.
    • Validation now reports clearer errors when certificate settings are missing, invalid, or mismatched.
  • Documentation

    • Updated certificate setup guides and examples with clearer guidance on when a CA certificate is required.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026


Walkthrough

Adds cert_utils.py with shared PEM extraction, openssl_verify, and is_self_signed utilities. Introduces check_certificate_chains.py for YAML-config-driven chain completeness validation and a new check-certificate-chains CLI subcommand. Refactors quay_registry_ca.py to use shared utilities and accept an explicit ca_pem. Hooks the new validator into bootstrap.sh, validations.sh, and the Ansible playbook, with updated documentation.

Changes

Certificate Validation Utilities and CLI Integration

| Layer | File(s) | Summary |
|---|---|---|
| Shared PEM utility module | src/enclave/tools/cert_utils.py, src/tests/test_cert_utils.py | New cert_utils.py adds pem_blocks() (regex PEM extractor), openssl_verify() (system-trust-store isolated verifier using temp files with -no-CAfile/-no-CApath), and is_self_signed() (RFC2253 issuer/subject comparator). Tests cover subprocess mocking, OSError/timeout paths, and real-openssl integration tests. |
| Chain completeness validator | src/enclave/tools/check_certificate_chains.py, src/tests/test_check_certificate_chains.py | New check_certificate_chains.py reads a YAML config, runs _check_chain against sslAPICertificateFullChain and sslIngressCertificateFullChain relative to sslCACertificate, raises CertificateValidationError on inconsistencies. Tests cover self-signed root, CA mismatch, leaf-only chains, aggregated multi-field errors, and invalid YAML/missing-file paths. |
| quay_registry_ca refactor | src/enclave/tools/quay_registry_ca.py, src/tests/test_quay_registry_ca.py | Removes local regex PEM helpers in favor of cert_utils imports. Replaces _chain_trust_anchor_pem with public chain_trust_anchor_pem(chain, ca_pem="") that drops system trust-store root completion, optionally verifies and appends an explicit CA PEM, and propagates ca_pem through resolve_registry_ca_pem and main. Test suite updated with new trust-anchor and resolution coverage. |
| CLI wiring | src/enclave/tools/cli.py, src/tests/test_tools_cli.py | Adds check-certificate-chains Click subcommand with required --config. Extends resolve-quay-registry-ca with mutually exclusive --ca-pem and --certificates-config (YAML parsing + whitespace validation), forwarding resolved ca_pem to quay_registry_ca_main. CLI tests cover all new options, validation errors, and CertificateValidationError propagation. |
| Shell/Ansible/docs integration | bootstrap.sh, validations.sh, playbooks/tasks/trust_quay_registry_ca_for_image_config.yaml, config/certificates.example.yaml, docs/CONFIGURATION_REFERENCE.md, docs/DEPLOYMENT_GUIDE.md | bootstrap.sh runs check-certificate-chains during step_validate. validations.sh gains checkCACert() for X.509 parse and notBefore/notAfter epoch validation, called conditionally. Ansible playbook conditionally appends --ca-pem from sslCACertificate. Docs detail when sslCACertificate is required, chain-validation rules, and the manual validation command. |
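
To make the "regex PEM extractor" row concrete, here is a minimal sketch of what a `pem_blocks()` helper could look like. The actual pattern used in cert_utils.py is not shown in this thread, so the regular expression below is an assumption.

```python
import re

# Assumed pattern: non-greedy match from a BEGIN marker to the next END marker.
_PEM_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----",
    re.DOTALL,
)


def pem_blocks(text: str) -> list[str]:
    """Return every PEM certificate block found in text, in order."""
    return _PEM_CERT_RE.findall(text)
```

A value that holds a file path or a placeholder instead of PEM data yields an empty list, which is what lets callers reject it early.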

Sequence Diagram(s)

sequenceDiagram
  participant bootstrap as bootstrap.sh step_validate
  participant cli as enclave tools CLI
  participant check_chains as check_certificate_chains
  participant cert_utils as cert_utils
  participant openssl as openssl subprocess

  bootstrap->>cli: check-certificate-chains --config certs_vars
  cli->>check_chains: check_certificate_chains_main(config)
  check_chains->>check_chains: read+parse YAML, extract sslCACertificate + chain fields
  loop each chain field (API, Ingress)
    check_chains->>cert_utils: pem_blocks(chain_pem)
    cert_utils-->>check_chains: list of PEM blocks
    check_chains->>cert_utils: is_self_signed(last_cert)
    cert_utils->>openssl: x509 issuer/subject
    openssl-->>cert_utils: issuer= / subject= output
    cert_utils-->>check_chains: True / False
    opt sslCACertificate present
      check_chains->>cert_utils: openssl_verify(ca_pem, last_cert)
      cert_utils->>openssl: verify -no-CAfile -no-CApath
      openssl-->>cert_utils: return code
      cert_utils-->>check_chains: True / False
    end
    check_chains-->>check_chains: collect issue strings
  end
  alt issues found
    check_chains-->>cli: raise CertificateValidationError
    cli-->>bootstrap: non-zero exit + error message
  else all pass
    check_chains-->>cli: success
    cli-->>bootstrap: "Certificate chain check passed." (stdout)
  end
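
The issuer/subject step in the diagram can be sketched as a small parser over the `issuer=` / `subject=` lines that openssl prints. The function names are hypothetical and the exact openssl invocation is not shown here. DN equality alone is a heuristic, not a cryptographic proof of self-signature.

```python
def parse_issuer_subject(openssl_output: str) -> tuple[str, str]:
    """Extract the issuer and subject DNs from 'issuer=' / 'subject=' lines."""
    issuer = subject = ""
    for line in openssl_output.splitlines():
        # Split on the first '=' only; the DN itself contains more of them.
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip().lower()
        if key == "issuer":
            issuer = value.strip()
        elif key == "subject":
            subject = value.strip()
    return issuer, subject


def looks_self_signed(openssl_output: str) -> bool:
    """True when both DNs are present and identical."""
    issuer, subject = parse_issuer_subject(openssl_output)
    return bool(issuer) and issuer == subject
```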

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • rh-ecosystem-edge/enclave#376: Both PRs modify the step_validate function in bootstrap.sh, with the main PR adding check-certificate-chains and #376 modifying the same validation-step flow for schema validation.
  • rh-ecosystem-edge/enclave#526: Directly overlaps on quay_registry_ca.py trust-anchor derivation and resolve_registry_ca_pem behavior — the main PR replaces the logic introduced in #526 with shared cert_utils + explicit ca_pem verification.
  • rh-ecosystem-edge/enclave#531: Directly related — #531 added system-trust-store root completion to _chain_trust_anchor_pem, which this PR removes and replaces with ca_pem-based verification in the public chain_trust_anchor_pem.

Suggested labels

utilities, deployment

Suggested reviewers

  • maorfr
  • eurijon

Poem

🔐 Certificates in a chain, each link must align,
No more trusting the system store — that's a risk by design.
openssl verify -no-CAfile guards the gate,
Self-signed roots and CAs must explicitly validate.
A broken chain now fails loud, not silently at install time! 🚨


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (1 error, 1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| No-Sensitive-Data-In-Logs | ❌ Error | FAIL: New debug logs in quay_registry_ca.py emit the registry hostname, which can leak internal hostnames in troubleshooting logs. | Remove or mask hostnames in log messages; keep trust-selection logs generic so persistent logs do not disclose tenant-specific registry names. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| No-Hardcoded-Secrets | ✅ Passed | No hardcoded secrets found: only placeholder PEM examples and cert-validation code; no real keys, tokens, passwords, or embedded creds. |
| No-Weak-Crypto | ✅ Passed | No MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or custom crypto was added; cert checks delegate to OpenSSL, and comparisons are on public certificate DNs. |
| No-Injection-Vectors | ✅ Passed | PASS: Touched code uses yaml.safe_load and argv/list-based subprocess calls; no shell=True, eval/exec, pickle.loads, os.system, or unsafe yaml.load found. |
| Container-Privileges | ✅ Passed | No changed workload manifests set privileged/root/host* or allowPrivilegeEscalation; modified YAML is playbook/config/docs, not pod specs. |
| Ai-Attribution | ✅ Passed | Head commit includes an AI attribution trailer: Assisted-by: Claude Code; no AI-related Co-Authored-By misuse was found. |
| Title check | ✅ Passed | The title matches the main change: using an operator-provided root CA to complete external registry certificate chains in UpdateService. |

@github-actions github-actions Bot added the operators Operator installation/config label Jun 22, 2026
@rporres

rporres commented Jun 22, 2026

Contributor Author

@coderabbitai please review

@coderabbitai

coderabbitai Bot commented Jun 22, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

REPO="rh-ecosystem-edge/enclave"
PR_NUM=535

echo "=== PR Details ==="
gh pr view "$PR_NUM" --repo "$REPO" --json title,body,baseRefName,headRefName,files,state

echo ""
echo "=== PR Diff ==="
gh pr diff "$PR_NUM" --repo "$REPO"

Length of output: 30263


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== bootstrap.sh head (first 50 lines) ==="
head -50 bootstrap.sh

echo ""
echo "=== pipefail or errexit settings in bootstrap.sh ==="
grep -n "pipefail\|set -e\|set -o" bootstrap.sh || echo "(none found)"

echo ""
echo "=== certs_vars definition in bootstrap.sh ==="
grep -n "certs_vars" bootstrap.sh || echo "(none found)"

echo ""
echo "=== Full step_validate function ==="
grep -n -A 20 "step_validate" bootstrap.sh || echo "(none found)"

echo ""
echo "=== openssl availability check / dependency handling ==="
grep -rn "openssl" src/enclave/tools/ || echo "(none)"

echo ""
echo "=== _CHAIN_FIELDS in validate_certificates.py (full context) ==="
grep -n "_CHAIN_FIELDS\|sslCA\|sslAPI\|sslIngress" src/enclave/tools/validate_certificates.py

echo ""
echo "=== ca_pem flow in quay_registry_ca.py ==="
grep -n "ca_pem\|is_self_signed" src/enclave/tools/quay_registry_ca.py

Length of output: 6998



@rporres Here is the security-focused review of PR #535.


Review: Complete Certificate Chain with Root CA for External Registries

Overall risk: MEDIUM. The approach is sound, and the pipefail concern I initially had is properly covered. Three medium-severity issues need attention before this PR leaves WIP status.


🔴 Medium — is_self_signed() silently fails open on missing/broken openssl

File: src/enclave/tools/cert_utils.py (lines 21–39)

When openssl is not on $PATH or times out, the function silently returns False (not self-signed). In validate_certificates.py this is treated as "chain is incomplete" and raises RuntimeError, blocking the entire deployment even when the chain is actually complete. In _chain_trust_anchor_pem it silently skips the self-signed short-circuit path.

# Current behaviour
except (subprocess.TimeoutExpired, OSError):
    return False   # ← silent fail; callers can't distinguish "not self-signed" from "openssl broken"

Suggested mitigation: raise a distinct exception (or use Optional[bool]) so callers can choose to skip the check rather than acting on a false negative:

def is_self_signed(cert_pem: str) -> bool | None:
    """Return True/False, or None if openssl is unavailable/timed out."""
    try:
        result = subprocess.run(...)
    except (subprocess.TimeoutExpired, OSError) as exc:
        logger.warning("openssl unavailable, skipping self-signed check: %s", exc)
        return None   # callers must handle None explicitly
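
One way a caller could consume the tri-state result suggested above is sketched below. `chain_end_status` is a hypothetical helper, not code from this PR; it keeps a tooling failure separate from a genuine "not self-signed" answer so the caller can decide whether to skip the check or fail closed.

```python
from typing import Optional


def chain_end_status(self_signed: Optional[bool], ca_present: bool) -> str:
    """Classify the end of a chain given a tri-state self-signed result."""
    if self_signed is None:
        # openssl was unavailable or timed out: do not guess either way.
        return "undetermined"
    if self_signed or ca_present:
        # Either the chain carries its own root or an explicit CA completes it.
        return "complete"
    return "incomplete"
```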

🔴 Medium — Provided ca_pem is appended to the trust bundle without chain verification

File: src/enclave/tools/quay_registry_ca.py, _chain_trust_anchor_pem() (lines 118–120)

if ca_pem:
    logger.info("Completing certificate chain with provided CA")
    return "\n".join([*ca_certs, ca_pem.strip()]) + "\n"

The function trusts the caller to provide the correct CA without verifying that ca_pem actually signs the last certificate in ca_certs. A mis-configured or wrong CA will produce a silently broken trust bundle that only surfaces at TLS handshake time (inside the UpdateService pod, not during validation). The existing _openssl_verify helper is already available — consider reusing it:

if ca_pem:
    if not _openssl_verify(ca_pem, ca_certs[-1]):
        logger.warning(
            "Provided --ca-pem does not verify the last chain certificate; "
            "the trust bundle may be incorrect."
        )
    return "\n".join([*ca_certs, ca_pem.strip()]) + "\n"
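
For comparison, a stricter variant is sketched below: it refuses to build the bundle when verification fails instead of only warning. This is an illustration under assumptions, not the PR's code. `verify_with_ca`, `append_verified_ca`, and the injectable `run` callable are hypothetical, and the exact openssl argument list used by the real helper is not shown in this thread (only the -no-CAfile/-no-CApath flags are mentioned).

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run_openssl(argv: Sequence[str]) -> int:
    """Run openssl and return its exit code; tooling failures count as non-zero."""
    try:
        return subprocess.run(
            list(argv), capture_output=True, timeout=10, check=False
        ).returncode
    except (subprocess.TimeoutExpired, OSError):
        return 1


def verify_with_ca(ca_pem: str, cert_pem: str, run: Runner = _run_openssl) -> bool:
    """Check cert_pem against ca_pem only, ignoring the system trust store."""
    with tempfile.TemporaryDirectory() as tmp:
        ca_path = Path(tmp) / "ca.pem"
        cert_path = Path(tmp) / "cert.pem"
        ca_path.write_text(ca_pem)
        cert_path.write_text(cert_pem)
        # Assumed argv: explicit CA file, default CA file and directory disabled.
        argv = [
            "openssl", "verify", "-no-CAfile", "-no-CApath",
            "-CAfile", str(ca_path), str(cert_path),
        ]
        return run(argv) == 0


def append_verified_ca(
    ca_certs: list[str], ca_pem: str, run: Runner = _run_openssl
) -> str:
    """Append ca_pem to the bundle only if it verifies the last certificate."""
    if not verify_with_ca(ca_pem, ca_certs[-1], run):
        raise ValueError("provided CA does not verify the last chain certificate")
    return "\n".join([*ca_certs, ca_pem.strip()]) + "\n"
```

Passing the runner as a parameter keeps unit tests hermetic, since no real openssl process is needed to exercise both branches.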

🔴 Medium — sslCACertificate accepted as-is without PEM validation

File: src/enclave/tools/validate_certificates.py (line 41)

ca_pem: str = raw.get("sslCACertificate") or ""

Any non-empty string satisfies the "CA is set" guard, including a path (/etc/pki/ca.crt), a placeholder comment, or malformed PEM. The error won't surface until TLS handshake failure in the disconnected cluster. Suggest adding a quick structural check:

from enclave.tools.cert_utils import pem_blocks

ca_pem: str = raw.get("sslCACertificate") or ""
if ca_pem and not pem_blocks(ca_pem):
    issues.append(
        "sslCACertificate: value does not contain a valid PEM certificate block"
    )

🟡 Low — OSError path in is_self_signed has no unit test

File: src/tests/test_cert_utils.py

The test test_is_self_signed_returns_false_on_openssl_error covers returncode=1, but the except (subprocess.TimeoutExpired, OSError): return False branch is untested.

def test_is_self_signed_returns_false_on_os_error(mocker: MockerFixture) -> None:
    mocker.patch(
        "enclave.tools.cert_utils.subprocess.run",
        side_effect=OSError("openssl not found"),
    )
    assert is_self_signed(_ROOT) is False

🟡 Low — updateservice_replicas hardcoded to 3

File: operators/cincinnati-operator/tasks.yaml (line 3)

- name: Set UpdateService replicas
  ansible.builtin.set_fact:
    updateservice_replicas: 3

Hardcoding removes the ability to override per-environment (e.g., lab/dev with 1 replica). Consider sourcing from a variable with a default:

updateservice_replicas: "{{ updateservice_replicas | default(3) }}"

ℹ️ Nit — Duplicate timeout constant

cert_utils.py defines _OPENSSL_TIMEOUT_SECONDS = 10 and quay_registry_ca.py still keeps _OPENSSL_VERIFY_TIMEOUT_SECONDS = 10. Since cert_utils is now the shared module, consider moving/consolidating both there.


✅ Positives

  • pipefail + set -e in bootstrap.sh are correctly set; the cmd | tee pipeline will propagate failures.
  • yaml.safe_load used throughout — no arbitrary code execution risk.
  • click.Path(exists=True, dir_okay=False) for --config provides file-system validation before the tool runs.
  • Good test coverage overall; the new test_validate_certificates.py covers both field variants and the cross-field interaction.
  • The Ansible trust_quay_registry_ca_for_image_config.yaml refactor to conditional list construction is clean.

@github-actions github-actions Bot added the validation Validation and testing label Jun 22, 2026
@rporres

rporres commented Jun 22, 2026

Contributor Author

Claude Code: Thank you for the review. Here is how each point was addressed:

🔴 Medium — is_self_signed() silently fails open — Addressed in 896168c. Added logger.warning("openssl unavailable: %s", exc) in the exception handler so failures are visible. Kept the bool return type rather than bool | None to avoid churn in all callers; the warning is sufficient for an operator to correlate the failure.

🔴 Medium — ca_pem appended without chain verification — Addressed in 896168c. Added a _openssl_verify(ca_pem, ca_certs[-1]) call before assembling the bundle. Implemented as a warning (not an error) so a misconfigured CA is flagged without blocking the deployment.

🔴 Medium — sslCACertificate accepted without PEM validation — Addressed in 896168c. Rather than adding the check in Python, we added a checkCACert() function to validations.sh alongside the existing checkCerts() calls — this is where all certificate validation lives. It verifies the value is parseable PEM and checks expiry. This runs at step_validate time, before the Python tool.

🟡 Low — OSError path untested — Addressed in 896168c. Added test_is_self_signed_returns_false_on_os_error using side_effect=OSError(...).

🟡 Low — updateservice_replicas hardcoded — The value was temporarily set to 3 during development but reverted to 1 in c9e313c. The 1 is intentional for this environment. Making it configurable via default(3) is a valid enhancement but is out of scope for this PR.

ℹ️ Nit — Duplicate timeout constant — Deliberate design. _OPENSSL_TIMEOUT_SECONDS in cert_utils.py covers openssl x509 (DN extraction); _OPENSSL_VERIFY_TIMEOUT_SECONDS in quay_registry_ca.py covers openssl verify (chain verification against a CA bundle, which also reads temp files). They are the same value today but semantically independent — coupling them into one constant would prevent tuning them separately if field behavior differs.

@coderabbitai approve

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@operators/cincinnati-operator/tasks.yaml`:
- Around line 89-131: The three debug tasks that print UpdateService CR status,
pod status, and pod logs are exposing sensitive information such as tokens,
endpoints, and customer data in CI/CD artifacts. Remove or conditionally gate
the debug tasks named "Show UpdateService CR status", "Show UpdateService pods
status", and "Show UpdateService pod logs" that output the full stdout content
from the respective command registrations (r_updateservice_cr_status.stdout,
r_updateservice_pods_status.stdout, r_updateservice_pod_logs.stdout). Instead,
either eliminate these debug statements entirely or restructure them to only log
high-level status information (success/failure) without exposing raw YAML or log
content.
- Around line 65-70: The until condition for the r_update_service_deployment
task is missing a critical check to ensure the Deployment controller has
observed the latest spec changes. Add a verification that
status.observedGeneration equals metadata.generation to the until condition
alongside the existing replica count checks. This ensures that the condition
only succeeds when the controller has processed the latest spec (indicated by
observedGeneration matching the current metadata.generation), preventing false
positives from stale status that existed before the spec update.
- Around line 2-5: The task "Set UpdateService replicas" hardcodes
updateservice_replicas to 1 using set_fact, which prevents inventory and group
variables from configuring environment-specific replica counts. Remove the
hardcoded set_fact entirely and instead define updateservice_replicas in
appropriate group_vars or inventory variable files for each environment, or
alternatively add a conditional to the set_fact task so it only sets the value
if the variable is not already defined from higher-precedence sources.

In `@playbooks/tasks/trust_quay_registry_ca_for_image_config.yaml`:
- Around line 14-18: The argv variable in the
trust_quay_registry_ca_for_image_config task uses a folded scalar (>-) which
causes the Jinja2 list template to render as a string representation instead of
a native list object. The ansible.builtin.command module requires argv to be a
native list. Replace the folded scalar (>-) syntax with explicit YAML list
format (using dashes for each list item) to ensure argv is properly rendered as
a native list object, similar to the pattern used elsewhere in the playbook.

In `@src/enclave/tools/cert_utils.py`:
- Around line 32-35: In the exception handler for subprocess.TimeoutExpired and
OSError (around the logger.warning call), replace the `return False` statement
with `raise RuntimeError("OpenSSL unavailable: %s" % exc)` or similar. This
makes the subprocess/environment failure explicit to callers in
validate_certificates.py and quay_registry_ca.py so they can distinguish between
actual certificate properties and detection failures, rather than conflating a
tooling error with an actual "not self-signed" result.

In `@src/enclave/tools/quay_registry_ca.py`:
- Around line 119-125: The function currently logs a warning when
_openssl_verify fails but still appends the unverified ca_pem to the trust
bundle and returns it, allowing invalid certificates to flow downstream. Replace
the warning log with a RuntimeError that should be raised immediately when
_openssl_verify returns False, ensuring the function fails fast instead of
returning an unverified certificate chain. Remove the return statement that
combines ca_certs and ca_pem, as it should only execute when verification
succeeds. Update any existing tests that expect the function to return
unverified certificates to instead expect a RuntimeError to be raised.

In `@src/enclave/tools/validate_certificates.py`:
- Around line 25-39: The certificate validation code is raising generic
RuntimeError and TypeError exceptions at multiple locations (the OSError
handler, the yaml.YAMLError handler, and the type check), which weakens the
failure contract. Create a custom exception class specifically for certificate
validation failures (such as CertificateValidationError), then replace all
instances of raising RuntimeError and TypeError throughout the file with this
custom exception while preserving the existing descriptive error messages that
reference config_path and error details. This applies to all error handlers in
the validation flow including the file read exception, YAML parse exception, and
type validation checks.
- Around line 54-66: The is_self_signed() function at line 54 returns False in
both cases where the certificate is genuinely not self-signed and when OpenSSL
is unavailable or times out, making it impossible to distinguish between a true
validation failure and a tooling failure. Refactor is_self_signed() to return a
distinct failure state (such as raising an exception or returning a three-state
value) instead of just a boolean, then update the condition block that checks
is_self_signed(certs[-1]) to handle the failure case explicitly by failing
closed (adding an issue or logging an error) when the tooling check cannot be
completed rather than proceeding with insufficient information.
- Around line 41-57: The condition on line 55 that checks `if ca_pem:` only
validates that sslCACertificate is non-empty, allowing malformed or invalid PEM
data to be treated as a valid trust anchor. This causes incomplete-chain
validation to be skipped, deferring the error to later TLS handshakes. Before
accepting ca_pem as a remediation signal, validate its structure using the
pem_blocks function (the same function already used to parse chain_pem earlier
in the loop). Modify the condition to check both that ca_pem is non-empty AND
that it contains valid PEM blocks, ensuring only properly formatted certificate
data suppresses the incomplete-chain error.

In `@src/tests/test_quay_registry_ca.py`:
- Around line 164-170: The test
`test_chain_trust_anchor_pem_appends_ca_pem_when_chain_incomplete` is not
mocking the `_openssl_verify` function, which causes it to execute a real
OpenSSL subprocess when `_chain_trust_anchor_pem` is called with the `ca_pem`
parameter. Add a mocker.patch call for `_openssl_verify` in this test (following
the same pattern used in other tests in this file) to keep the unit test
hermetic and prevent environment dependencies that can cause CI flakiness.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4c85e6a1-7e70-4994-9bf7-96b5498f52a7

📥 Commits

Reviewing files that changed from the base of the PR and between 744a77a and c9e313c.

📒 Files selected for processing (12)
  • bootstrap.sh
  • operators/cincinnati-operator/tasks.yaml
  • playbooks/tasks/trust_quay_registry_ca_for_image_config.yaml
  • src/enclave/tools/cert_utils.py
  • src/enclave/tools/cli.py
  • src/enclave/tools/quay_registry_ca.py
  • src/enclave/tools/validate_certificates.py
  • src/tests/test_cert_utils.py
  • src/tests/test_quay_registry_ca.py
  • src/tests/test_tools_cli.py
  • src/tests/test_validate_certificates.py
  • validations.sh

@coderabbitai

coderabbitai Bot commented Jun 22, 2026


@rporres — Thank you for the thorough response. All medium-severity findings are resolved:

| Finding | Risk | Disposition |
|---|---|---|
| is_self_signed() silent fail | 🔴 Medium | ✅ Warning log added in 896168c |
| ca_pem appended without verification | 🔴 Medium | _openssl_verify guard + warning added in 896168c |
| sslCACertificate no PEM validation | 🔴 Medium | checkCACert() in validations.sh added in 896168c |
| OSError path untested | 🟡 Low | ✅ Test added in 896168c |
| updateservice_replicas hardcoded | 🟡 Low | Acknowledged: 1 is intentional; enhancement deferred ✅ |
| Duplicate timeout constant | ℹ️ Nit | Acknowledged: semantic independence rationale accepted ✅ |

No residual medium or high-risk items. Proceeding with approval.

🐇 ✨

✅ Action performed

Comments resolved and changes approved.

@rporres rporres force-pushed the 531-again branch 2 times, most recently from deac5f8 to 5991aed Compare June 24, 2026 17:25
@github-actions github-actions Bot added deployment Deployment-related changes and removed operators Operator installation/config labels Jun 24, 2026
@rporres rporres changed the title [WIP] complete certificate chain with root CA for external registries Complete certificate chain with root CA for external registries Jun 24, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/CONFIGURATION_REFERENCE.md`:
- Around line 1092-1111: The markdown in the certificate-chain validation
section violates MD031 because the fenced examples are not preceded by blank
lines. Update the documentation around the existing example blocks in
CONFIGURATION_REFERENCE.md so there is a blank line before each fenced block,
including the yaml example under the “Example” label and the bash command under
“To validate manually”; keep the surrounding headings and text unchanged.

In `@src/enclave/tools/cert_utils.py`:
- Around line 63-87: The self-signed check in is_self_signed() is currently
based only on issuer/subject DN equality, which can incorrectly mark a crafted
certificate as complete. Keep the existing DN comparison as a fast precheck, but
add a real self-signature verification step in the same function before
returning True, using a cryptographic check such as an openssl verify flow with
-check_ss_sig. Update the logic so chain_trust_anchor_pem() only treats a cert
as self-signed when is_self_signed() confirms both the DN match and the
signature validity.

In `@src/enclave/tools/cli.py`:
- Around line 47-61: The certificates.yaml handling in cli.py can crash with an
AttributeError when yaml.safe_load returns a non-mapping value, because the code
calls raw.get in the certificates_config block without verifying the parsed
object type. Update the certificates_config parsing logic to validate that raw
is a dict before accessing sslCACertificate, and if it is not, raise a
ClickException with a clear message; mirror the isinstance(raw, dict) safeguard
already used in check_certificate_chains to keep the CLI error path user-facing
and consistent.

In `@src/tests/test_tools_cli.py`:
- Around line 75-120: The resolve-quay-registry-ca coverage is missing a
regression test for a non-mapping certificates.yaml input, which is the crash
case in cli.py. After adding the isinstance(raw, dict) guard in
resolve_quay_registry_ca, add a test alongside the existing CliRunner cases that
writes a YAML sequence/scalar file, invokes the resolve-quay-registry-ca
command, and asserts a non-zero exit plus a clear error message mentioning the
expected YAML mapping.
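
The non-mapping guard requested for cli.py can be sketched as follows. This is a hypothetical helper: `extract_ca_pem` receives whatever the YAML loader returned, and `ValueError` stands in for the ClickException the real CLI would raise.

```python
def extract_ca_pem(raw: object) -> str:
    """Return sslCACertificate from parsed YAML, rejecting non-mapping input."""
    if not isinstance(raw, dict):
        # A YAML sequence, scalar, or empty document is not a valid config.
        raise ValueError(
            f"certificates config must be a YAML mapping, got {type(raw).__name__}"
        )
    value = raw.get("sslCACertificate") or ""
    if not isinstance(value, str):
        raise ValueError("sslCACertificate must be a string")
    return value.strip()
```

An empty file parses to `None` under `yaml.safe_load`, so it is rejected by the same branch as a list or scalar.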

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: be5ffd87-eed4-499c-9f4e-b773231f1520

📥 Commits

Reviewing files that changed from the base of the PR and between deac5f8 and 5991aed.

📒 Files selected for processing (14)
  • bootstrap.sh
  • config/certificates.example.yaml
  • docs/CONFIGURATION_REFERENCE.md
  • docs/DEPLOYMENT_GUIDE.md
  • playbooks/tasks/trust_quay_registry_ca_for_image_config.yaml
  • src/enclave/tools/cert_utils.py
  • src/enclave/tools/check_certificate_chains.py
  • src/enclave/tools/cli.py
  • src/enclave/tools/quay_registry_ca.py
  • src/tests/test_cert_utils.py
  • src/tests/test_check_certificate_chains.py
  • src/tests/test_quay_registry_ca.py
  • src/tests/test_tools_cli.py
  • validations.sh

The previous TLS chain completion for resolve-quay-registry-ca relied on
scanning the LZ system trust store: O(n) openssl subprocess calls per CA
in the bundle, silent failure for custom/private CAs absent from the store,
and a result dependent on trust store state that may be stale in
long-running disconnected deployments.

Replace the system store lookup with an explicit CA from sslCACertificate,
passed via --certificates-config (reads from certificates.yaml) or --ca-pem
(inline PEM) on resolve-quay-registry-ca; the options are mutually
exclusive. The Ansible task that invokes the command is updated accordingly.

Add check-certificate-chains (enclave tools subcommand) to validate chain
completeness and CA consistency before installation:
- chain ends with a non-self-signed certificate and sslCACertificate is
  absent → error
- only a leaf certificate present and sslCACertificate is absent → error
- chain ends with a self-signed root but sslCACertificate does not verify
  it → error (CA hierarchy mismatch)

Certificate utilities (openssl_verify, is_self_signed, pem_blocks) are
consolidated in cert_utils.py for reuse across tools. validations.sh gains
checkCACert() for PEM format and expiry checks. Documentation and example
config updated to clarify when sslCACertificate is required and how chain
validation works.

Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
Assisted-by: Claude Code <noreply@anthropic.com>
@github-actions github-actions Bot added the ci-cd CI/CD infrastructure label Jun 24, 2026
Signed-off-by: Rafa Porres Molina <rporresm@redhat.com>
@rporres rporres changed the title Complete certificate chain with root CA for external registries Complete certificate chain with root CA for external registries in UpdateService Jun 24, 2026

Labels

ci-cd CI/CD infrastructure deployment Deployment-related changes validation Validation and testing
