Skip to content

perf/security: cachekit-core Phase 1 bedrock (profile, benches, nonce seed)#36

Open
27Bslash6 wants to merge 6 commits into
mainfrom
perf/phase-1-bedrock
Open

perf/security: cachekit-core Phase 1 bedrock (profile, benches, nonce seed)#36
27Bslash6 wants to merge 6 commits into
mainfrom
perf/phase-1-bedrock

Conversation

@27Bslash6
Copy link
Copy Markdown
Contributor

@27Bslash6 27Bslash6 commented May 29, 2026

Summary

Phase 1 of the cachekit-core audit-driven cleanup. Three independent changes that unblock Phases 2 and 3:

  1. [profile.release] for cachekit-core's own release builds (a3abff2). Cargo only honors the root package's profile, so workspace consumers (cachekit-rs, cachekit-py via PyO3, cachekit-ts via NAPI) use their own profile and override this one — but cargo bench inside this crate, cargo publish dry-runs, and standalone binary consumers without their own profile now get LTO + codegen-units=1 + panic=abort + strip + line-tables-only debug. Profile mirrors cachekit-py/rust/Cargo.toml. Binary size: 397KB → 315KB (-20.7%).

  2. Criterion bench scaffold (761c327 + 1d01174). New benches/hot_path.rs covers byte_storage store/retrieve roundtrip and AES-GCM encrypt/decrypt roundtrip across 64B–64KB. Replaces the one-shot examples/bench_throughput.rs stdout script. Unblocks PGO training on the real hot-path distribution and lets Phase 3 perf claims be measured statistically. make bench (full) and make bench-quick (CI, 1s warmup, 2s measurement, 10 samples) targets added. Smoke run: AES-GCM 64KB roundtrip ≈ 1.3 GiB/s on this machine. The [[bench]] section + criterion dev-dep + locked transitive deps land cleanly.

  3. 8-byte nonce seed fix (4119465) — security: HIGH-2/HIGH-3 from the 2026-05-29 audit. Previously GLOBAL_INSTANCE_COUNTER (native) and WASM_INSTANCE_COUNTER (wasm32) were seeded with only 4 random bytes shifted into the high half, leaving the lower 32 bits at zero at process start. Cross-process collision probability was reachable on Cloudflare Workers within days. Now seeded with 8 random bytes via u64::from_be_bytes. Regression test test_instance_seed_uses_8_random_bytes asserts the low 32 bits show entropy at process start (pre-fix this test fails with low32 = 1; post-fix passes). Wire format unchanged — nonces remain [instance_id(8)][counter(4)] = 12 bytes, only the seed entropy of instance_id increases.

Test plan

  • All five CI-gating checks green: cargo fmt --check, cargo clippy --all-features -- -D warnings, cargo test --all-features (172+ tests), cargo deny check, cargo audit (0 vulns across 186 deps)
  • cargo build --release -p cachekit-core -v shows -C opt-level=3 -C codegen-units=1 -C panic=abort -C strip=symbols -C debuginfo=line-tables-only (LTO operates at link time, confirmed via 20.7% binary shrinkage)
  • cargo build -p cachekit-core --target wasm32-unknown-unknown --features encryption succeeds
  • make bench-quick runs to completion, prints throughput numbers for both bench groups across all 6 size buckets
  • TDD verified: test_instance_seed_uses_8_random_bytes failed pre-fix (low32 = 1, the only value the old (rand_u32 << 32) seed could produce after one encryptor) and passes post-fix

Known follow-ups (out of scope for this PR)

Flagged by the code-quality review as APPROVED-WITH-NITS for 4119465. Not blocking — captured here so they don't get lost:

  • Math claim is slightly over-stated. "Collision probability for any realistic fleet size falls below 2^-32" is true up to ~92k concurrent process starts (birthday bound). At 1M starts the probability is ~2^-25 — still a huge improvement over the 4-byte path, but the "any realistic fleet size" framing is sloppy. Tighten the doc-comment in a small follow-up.
  • Regression test threshold (1000) has only ~13× margin over current test-induced encryptor count (~72). Adding ~900 more encryptor-creating tests to core.rs would silently make the test useless on pre-fix code. Robust fix: factor seed generation into a pure function taking &dyn Fn() -> [u8; 8] and assert exact byte consumption — defer to a hardening follow-up.
  • cargo vet fails locally (not in CI) because criterion brought in ~30 new transitive deps not in supply-chain/audits.toml. Phase 2 deletes the entire supply-chain/ machinery (per audit's catchphrase-agent recommendation), so this resolves itself in the next phase. CI gate is satisfied by cargo deny + cargo audit, both green.
  • Makefile bench / bench-quick targets break the surrounding style (no @-prefix, no $(BLUE)…$(RESET) banner, no require_binary gate). Cosmetic — easy follow-up.

Audit reference

This PR addresses items 1–3 of the cachekit-core 2026-05-29 parallel audit synthesis (security-specialist + deep-executor + catchphrase-agent). Phase 2 (deletions) and Phase 3 (hot-path) plans are at tooling/sessions/tasks/h-cachekit-core-phase-{2-deletions,3-hot-path}.md.

Summary by CodeRabbit

  • Security Improvements

    • Strengthened nonce seeding for encryptors to improve uniqueness across processes.
    • Enhanced supply-chain audit and policy configurations to tighten provenance and vetting.
  • Chores

    • Added repeatable benchmarking targets and harness to measure performance over time.

Review Change Stack

27Bslash6 added 4 commits May 29, 2026 21:52
Cargo only honors the ROOT package's [profile.release]. Workspace
consumers (cachekit-rs, cachekit-py via PyO3, cachekit-ts via NAPI)
define their own [profile.release] and use that instead — this profile
does NOT propagate to them.

The value of setting it here is threefold:

  1. `cargo bench` from inside cachekit-core runs under a stable, fully
     optimized profile. Phase 3 perf measurement depends on this — under
     Cargo defaults (codegen-units=16, lto=false) the bench numbers
     would be ~15% noisier than under the standardized profile.

  2. `cargo build --release` and `cargo publish` dry-runs produce a
     consistently optimized artifact for the crate's own CI gates.

  3. Standalone binary consumers WITHOUT their own [profile.release]
     inherit this one instead of falling back to Cargo defaults.

Mirrors cachekit-py/rust/Cargo.toml: opt-level=3, lto=fat,
codegen-units=1, panic=abort, strip=symbols, debug=line-tables-only.
Covers byte_storage store/retrieve roundtrip and AES-GCM
encrypt/decrypt roundtrip across 64B-64KB payload sizes.

Unblocks PGO training on the real hot-path distribution (was
previously training on the test suite) and provides the baseline
for upcoming Phase 3 perf work (single-alloc encrypt path,
StorageEnvelope::new(&[u8]), HKDF cache).

Run: make bench (full) or make bench-quick (CI).
The preceding test: commit added `criterion = "0.5"` to dev-deps but
omitted the lockfile update. `cargo bench` regenerated Cargo.lock locally
during verification (30+ new transitive crates: anes, ciborium, clap_*,
plotters, tinytemplate, etc.), but the changes weren't staged.

This commit locks the resolved versions so clean clones produce
byte-identical bench builds without re-resolving.
…H-3)

Previously the GLOBAL_INSTANCE_COUNTER (native) and WASM_INSTANCE_COUNTER
(wasm32) were seeded with only 4 random bytes, then shifted left by 32.
This gave the lower 32 bits an initial value of 0 in every process, and
only 2^32 cross-process collision resistance — reachable on workloads
that churn many isolates (Cloudflare Workers) within days.

Now seeded with 8 random bytes via u64::from_be_bytes, giving the full
u64 entropy space. Collision probability for any realistic fleet size
falls below 2^-32.

Wire format unchanged: nonces remain [instance_id(8)][counter(4)] = 12
bytes. Only the seed entropy of instance_id increases.

Regression test: test_instance_seed_uses_8_random_bytes asserts the
counter's lower 32 bits show entropy at process start (pre-fix they
were always small integers; post-fix they are randomly distributed).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c0968dd-c31d-4afc-8702-5c1824e95b69

📥 Commits

Reviewing files that changed from the base of the PR and between 4119465 and 2088e34.

⛔ Files ignored due to path filters (1)
  • supply-chain/imports.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • Makefile
  • supply-chain/audits.toml
  • supply-chain/config.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • Makefile

📝 Walkthrough

Walkthrough

This PR adds a Criterion-based hot-path benchmark suite (ByteStorage and encryption), updates build and Makefile bench targets, hardens encryptor instance counter seeding to use full 8-byte randomness on native and wasm, and updates cargo-vet supply-chain audits/imports and exemption policies.

Changes

Benchmarking Infrastructure and Suite

Layer / File(s) Summary
Criterion Benchmark Configuration
Cargo.toml, Makefile
criterion dev-dependency with html_reports is added; release profile tuned and [[bench]] registers hot_path. Makefile targets bench and bench-quick run cargo bench --bench hot_path with optional quick parameters.
Hot Path Benchmark Suite
benches/hot_path.rs
Implements deterministic payload generation, ByteStorage store/retrieve throughput benchmarks, and ZeroKnowledgeEncryptor AES‑GCM encrypt/decrypt throughput benchmarks; registers group with Criterion.

Instance Seeding Security Hardening

Layer / File(s) Summary
Instance Counter Seeding Updates
src/encryption/core.rs
Native GLOBAL_INSTANCE_COUNTER and wasm WASM_INSTANCE_COUNTER are seeded from 8 random bytes (full u64) instead of the prior 4‑byte derived seed.
Instance Seeding Regression Test
src/encryption/core.rs
Adds test_instance_seed_uses_8_random_bytes asserting the lower 32 bits are unlikely to remain a small count-like value after initialization.

Supply-chain audit and cargo-vet config

Layer / File(s) Summary
Supply-chain audits file
supply-chain/audits.toml
Adds audits.toml with multiple trusted entries, criteria, user-ids, and shared end date.
cargo-vet imports and exemptions
supply-chain/config.toml
Adds remote [imports.*] sources; substantially rewrites [[exemptions.*]] blocks (adds criterion/plotters/crossbeam/etc., removes/adjusts many other exemptions, and changes some criteria/versions).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • cachekit-io/cachekit-core#32: Also modifies supply-chain/config.toml cargo-vet exemption policy entries overlapping in scope with this PR's supply-chain changes.

Poem

🐰 I hopped through benches, bytes, and keys,
Eight bytes of seed in nonce-born breeze,
Storage races, ciphers hum their tune,
Supply chains listed under the moon.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title comprehensively summarizes all three main changes: release profile optimization, benchmark scaffolding, and nonce seed security fix.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/phase-1-bedrock

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
Makefile (1)

79-84: ⚡ Quick win

Add require_binary guard to new bench targets for consistent tool checks.

bench and bench-quick currently skip the preflight binary check pattern used elsewhere in this Makefile.

Suggested patch
 bench: ## Run Criterion benches (uses --features encryption)
-	cargo bench -p cachekit-core --features encryption --bench hot_path
+	$(call require_binary,cargo,Install Rust: https://rustup.rs)
+	`@cargo` bench -p cachekit-core --features encryption --bench hot_path
 
 bench-quick: ## Quick bench run for CI (1s warmup, 2s measurement, 10 samples)
-	cargo bench -p cachekit-core --features encryption --bench hot_path -- --warm-up-time 1 --measurement-time 2 --sample-size 10
+	$(call require_binary,cargo,Install Rust: https://rustup.rs)
+	`@cargo` bench -p cachekit-core --features encryption --bench hot_path -- --warm-up-time 1 --measurement-time 2 --sample-size 10
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 79 - 84, The new Makefile targets bench and
bench-quick lack the preflight binary check used elsewhere; update both targets
(bench and bench-quick) to invoke the existing require_binary guard (e.g., call
require_binary for the cargo binary) before running cargo bench so the same tool
availability checks are applied consistently across targets.
src/encryption/core.rs (1)

1129-1140: 💤 Low value

Consider a more deterministic approach for the regression test.

The current test has two acknowledged weaknesses:

  1. False failures (~1 in 4.3M runs): If the random seed's low 32 bits happen to be < 1000, the test fails spuriously.
  2. False passes: If >1000 encryptors are created by tests running before this one, the test passes even with the 4-byte bug.

For a security regression test, consider a more deterministic approach using a helper that exposes the raw seed before any increments, or using a compile-time feature flag to inject a known seed for testing. Alternatively, lower the threshold (e.g., 100) to reduce false-fail probability to ~2.3e-8 while still catching the regression pattern.

That said, the current approach is practical and the edge cases are documented.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/encryption/core.rs` around lines 1129 - 1140, The test
test_instance_seed_uses_8_random_bytes is nondeterministic because it inspects
GLOBAL_INSTANCE_COUNTER after calling ZeroKnowledgeEncryptor::new(), causing
rare false failures or passes; fix by making the seed deterministic for
tests—add a test-only accessor or init path (e.g., cfg(test) helper function
get_raw_instance_seed() or a feature-guarded constructor like
ZeroKnowledgeEncryptor::with_seed_for_test) that returns the raw seed before any
increments and assert its width/entropy deterministically, or as a simpler quick
fix lower the numeric threshold in test_instance_seed_uses_8_random_bytes (e.g.,
to 100) to drastically reduce false-fail probability while keeping the
regression check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@Makefile`:
- Around line 79-84: The new Makefile targets bench and bench-quick lack the
preflight binary check used elsewhere; update both targets (bench and
bench-quick) to invoke the existing require_binary guard (e.g., call
require_binary for the cargo binary) before running cargo bench so the same tool
availability checks are applied consistently across targets.

In `@src/encryption/core.rs`:
- Around line 1129-1140: The test test_instance_seed_uses_8_random_bytes is
nondeterministic because it inspects GLOBAL_INSTANCE_COUNTER after calling
ZeroKnowledgeEncryptor::new(), causing rare false failures or passes; fix by
making the seed deterministic for tests—add a test-only accessor or init path
(e.g., cfg(test) helper function get_raw_instance_seed() or a feature-guarded
constructor like ZeroKnowledgeEncryptor::with_seed_for_test) that returns the
raw seed before any increments and assert its width/entropy deterministically,
or as a simpler quick fix lower the numeric threshold in
test_instance_seed_uses_8_random_bytes (e.g., to 100) to drastically reduce
false-fail probability while keeping the regression check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: afad0086-b4c7-44f0-849c-26e6ad7254b7

📥 Commits

Reviewing files that changed from the base of the PR and between f3a2794 and 4119465.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • Cargo.toml
  • Makefile
  • benches/hot_path.rs
  • src/encryption/core.rs

27Bslash6 added 2 commits May 29, 2026 22:18
CodeRabbit flagged that the new `bench` and `bench-quick` Makefile
targets skipped the `require_binary` preflight check used by every
other target in this file. Added the guard plus the `@cargo` prefix
to suppress command echo, matching the surrounding style.

The fuller style normalization (banner echo lines, success markers)
is deliberately deferred — out of scope for this review and tracked
as a separate cosmetic followup.

CodeRabbit-Resolved: Makefile:79:Add `require_binary` guard to new
…ptions

The criterion dev-dep added in the preceding test: commit brought in
~30 transitive crates that weren't covered by cargo-vet, causing the
non-required Cargo Vet (Supply Chain) CI check to fail.

Imported 6 canonical public audit sources (google, mozilla,
bytecode-alliance, isrg, zcash, fermyon), trusted the publishers
cargo-vet suggested as already cross-trusted by those orgs
(sunfishcode, BurntSushi, cuviper, alexcrichton, starkat99), then
regenerated exemptions for the residual safe-to-run gaps (criterion's
bench-time-only transitive tree: plotters, web-sys diff, criterion-plot,
etc.).

Net effect: config.toml shrinks by ~250 lines (existing hand-managed
exemptions now satisfied by audits in the imported peers); imports.lock
gains the locked audit references. cargo vet now reports: 83 fully
audited, 5 partially audited, 98 exempted.

Note: Phase 2 of the audit-driven cleanup deletes the supply-chain/
machinery entirely (per the catchphrase-agent recommendation —
cargo-audit + cargo-deny already cover CVEs and licenses). This commit
just unblocks the failing check on PR #36 so it ships clean.
@27Bslash6
Copy link
Copy Markdown
Contributor Author

@coderabbitai can you approve

@27Bslash6 27Bslash6 enabled auto-merge (squash) May 30, 2026 00:36
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

@27Bslash6 Sure! Approving the PR now.

✅ Actions performed

Comments resolved. Approval is disabled; enable reviews.request_changes_workflow to allow explicit top-level @coderabbitai resolve or @coderabbitai approve commands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant