perf/security: cachekit-core Phase 1 bedrock (profile, benches, nonce seed)#36
perf/security: cachekit-core Phase 1 bedrock (profile, benches, nonce seed)#3627Bslash6 wants to merge 6 commits into
Conversation
Cargo only honors the ROOT package's [profile.release]. Workspace
consumers (cachekit-rs, cachekit-py via PyO3, cachekit-ts via NAPI)
define their own [profile.release] and use that instead — this profile
does NOT propagate to them.
The value of setting it here is threefold:
1. `cargo bench` from inside cachekit-core runs under a stable, fully
optimized profile. Phase 3 perf measurement depends on this — under
Cargo defaults (codegen-units=16, lto=false) the bench numbers
would be ~15% noisier than under the standardized profile.
2. `cargo build --release` and `cargo publish` dry-runs produce a
consistently optimized artifact for the crate's own CI gates.
3. Standalone binary consumers WITHOUT their own [profile.release]
inherit this one instead of falling back to Cargo defaults.
Mirrors cachekit-py/rust/Cargo.toml: opt-level=3, lto=fat,
codegen-units=1, panic=abort, strip=symbols, debug=line-tables-only.
Covers byte_storage store/retrieve roundtrip and AES-GCM encrypt/decrypt roundtrip across 64B-64KB payload sizes. Unblocks PGO training on the real hot-path distribution (was previously training on the test suite) and provides the baseline for upcoming Phase 3 perf work (single-alloc encrypt path, StorageEnvelope::new(&[u8]), HKDF cache). Run: make bench (full) or make bench-quick (CI).
The preceding test: commit added `criterion = "0.5"` to dev-deps but omitted the lockfile update. `cargo bench` regenerated Cargo.lock locally during verification (30+ new transitive crates: anes, ciborium, clap_*, plotters, tinytemplate, etc.), but the changes weren't staged. This commit locks the resolved versions so clean clones produce byte-identical bench builds without re-resolving.
…H-3) Previously the GLOBAL_INSTANCE_COUNTER (native) and WASM_INSTANCE_COUNTER (wasm32) were seeded with only 4 random bytes, then shifted left by 32. This gave the lower 32 bits an initial value of 0 in every process, and only 2^32 cross-process collision resistance — reachable on workloads that churn many isolates (Cloudflare Workers) within days. Now seeded with 8 random bytes via u64::from_be_bytes, giving the full u64 entropy space. Collision probability for any realistic fleet size falls below 2^-32. Wire format unchanged: nonces remain [instance_id(8)][counter(4)] = 12 bytes. Only the seed entropy of instance_id increases. Regression test: test_instance_seed_uses_8_random_bytes asserts the counter's lower 32 bits show entropy at process start (pre-fix they were always small integers; post-fix they are randomly distributed).
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR adds a Criterion-based hot-path benchmark suite (ByteStorage and encryption), updates build and Makefile bench targets, hardens encryptor instance counter seeding to use full 8-byte randomness on native and wasm, and updates cargo-vet supply-chain audits/imports and exemption policies. ChangesBenchmarking Infrastructure and Suite
Instance Seeding Security Hardening
Supply-chain audit and cargo-vet config
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
Makefile (1)
79-84: ⚡ Quick winAdd
require_binaryguard to new bench targets for consistent tool checks.
benchandbench-quickcurrently skip the preflight binary check pattern used elsewhere in this Makefile.Suggested patch
bench: ## Run Criterion benches (uses --features encryption) - cargo bench -p cachekit-core --features encryption --bench hot_path + $(call require_binary,cargo,Install Rust: https://rustup.rs) + `@cargo` bench -p cachekit-core --features encryption --bench hot_path bench-quick: ## Quick bench run for CI (1s warmup, 2s measurement, 10 samples) - cargo bench -p cachekit-core --features encryption --bench hot_path -- --warm-up-time 1 --measurement-time 2 --sample-size 10 + $(call require_binary,cargo,Install Rust: https://rustup.rs) + `@cargo` bench -p cachekit-core --features encryption --bench hot_path -- --warm-up-time 1 --measurement-time 2 --sample-size 10🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Makefile` around lines 79 - 84, The new Makefile targets bench and bench-quick lack the preflight binary check used elsewhere; update both targets (bench and bench-quick) to invoke the existing require_binary guard (e.g., call require_binary for the cargo binary) before running cargo bench so the same tool availability checks are applied consistently across targets.src/encryption/core.rs (1)
1129-1140: 💤 Low valueConsider a more deterministic approach for the regression test.
The current test has two acknowledged weaknesses:
- False failures (~1 in 4.3M runs): If the random seed's low 32 bits happen to be < 1000, the test fails spuriously.
- False passes: If >1000 encryptors are created by tests running before this one, the test passes even with the 4-byte bug.
For a security regression test, consider a more deterministic approach using a helper that exposes the raw seed before any increments, or using a compile-time feature flag to inject a known seed for testing. Alternatively, lower the threshold (e.g., 100) to reduce false-fail probability to ~2.3e-8 while still catching the regression pattern.
That said, the current approach is practical and the edge cases are documented.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/encryption/core.rs` around lines 1129 - 1140, The test test_instance_seed_uses_8_random_bytes is nondeterministic because it inspects GLOBAL_INSTANCE_COUNTER after calling ZeroKnowledgeEncryptor::new(), causing rare false failures or passes; fix by making the seed deterministic for tests—add a test-only accessor or init path (e.g., cfg(test) helper function get_raw_instance_seed() or a feature-guarded constructor like ZeroKnowledgeEncryptor::with_seed_for_test) that returns the raw seed before any increments and assert its width/entropy deterministically, or as a simpler quick fix lower the numeric threshold in test_instance_seed_uses_8_random_bytes (e.g., to 100) to drastically reduce false-fail probability while keeping the regression check.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@Makefile`:
- Around line 79-84: The new Makefile targets bench and bench-quick lack the
preflight binary check used elsewhere; update both targets (bench and
bench-quick) to invoke the existing require_binary guard (e.g., call
require_binary for the cargo binary) before running cargo bench so the same tool
availability checks are applied consistently across targets.
In `@src/encryption/core.rs`:
- Around line 1129-1140: The test test_instance_seed_uses_8_random_bytes is
nondeterministic because it inspects GLOBAL_INSTANCE_COUNTER after calling
ZeroKnowledgeEncryptor::new(), causing rare false failures or passes; fix by
making the seed deterministic for tests—add a test-only accessor or init path
(e.g., cfg(test) helper function get_raw_instance_seed() or a feature-guarded
constructor like ZeroKnowledgeEncryptor::with_seed_for_test) that returns the
raw seed before any increments and assert its width/entropy deterministically,
or as a simpler quick fix lower the numeric threshold in
test_instance_seed_uses_8_random_bytes (e.g., to 100) to drastically reduce
false-fail probability while keeping the regression check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: afad0086-b4c7-44f0-849c-26e6ad7254b7
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
Cargo.tomlMakefilebenches/hot_path.rssrc/encryption/core.rs
CodeRabbit flagged that the new `bench` and `bench-quick` Makefile targets skipped the `require_binary` preflight check used by every other target in this file. Added the guard plus the `@cargo` prefix to suppress command echo, matching the surrounding style. The fuller style normalization (banner echo lines, success markers) is deliberately deferred — out of scope for this review and tracked as a separate cosmetic followup. CodeRabbit-Resolved: Makefile:79:Add `require_binary` guard to new
…ptions The criterion dev-dep added in the preceding test: commit brought in ~30 transitive crates that weren't covered by cargo-vet, causing the non-required Cargo Vet (Supply Chain) CI check to fail. Imported 6 canonical public audit sources (google, mozilla, bytecode-alliance, isrg, zcash, fermyon), trusted the publishers cargo-vet suggested as already cross-trusted by those orgs (sunfishcode, BurntSushi, cuviper, alexcrichton, starkat99), then regenerated exemptions for the residual safe-to-run gaps (criterion's bench-time-only transitive tree: plotters, web-sys diff, criterion-plot, etc.). Net effect: config.toml shrinks by ~250 lines (existing hand-managed exemptions now satisfied by audits in the imported peers); imports.lock gains the locked audit references. cargo vet now reports: 83 fully audited, 5 partially audited, 98 exempted. Note: Phase 2 of the audit-driven cleanup deletes the supply-chain/ machinery entirely (per the catchphrase-agent recommendation — cargo-audit + cargo-deny already cover CVEs and licenses). This commit just unblocks the failing check on PR #36 so it ships clean.
|
@coderabbitai can you approve |
|
✅ Actions performedComments resolved. Approval is disabled; enable |
Summary
Phase 1 of the cachekit-core audit-driven cleanup. Three independent changes that unblock Phases 2 and 3:
[profile.release]for cachekit-core's own release builds (a3abff2). Cargo only honors the root package's profile, so workspace consumers (cachekit-rs, cachekit-py via PyO3, cachekit-ts via NAPI) use their own profile and override this one — butcargo benchinside this crate,cargo publishdry-runs, and standalone binary consumers without their own profile now get LTO + codegen-units=1 + panic=abort + strip + line-tables-only debug. Profile mirrorscachekit-py/rust/Cargo.toml. Binary size: 397KB → 315KB (-20.7%).Criterion bench scaffold (
761c327+1d01174). Newbenches/hot_path.rscovers byte_storage store/retrieve roundtrip and AES-GCM encrypt/decrypt roundtrip across 64B–64KB. Replaces the one-shotexamples/bench_throughput.rsstdout script. Unblocks PGO training on the real hot-path distribution and lets Phase 3 perf claims be measured statistically.make bench(full) andmake bench-quick(CI, 1s warmup, 2s measurement, 10 samples) targets added. Smoke run: AES-GCM 64KB roundtrip ≈ 1.3 GiB/s on this machine. The[[bench]]section + criterion dev-dep + locked transitive deps land cleanly.8-byte nonce seed fix (
4119465) — security: HIGH-2/HIGH-3 from the 2026-05-29 audit. PreviouslyGLOBAL_INSTANCE_COUNTER(native) andWASM_INSTANCE_COUNTER(wasm32) were seeded with only 4 random bytes shifted into the high half, leaving the lower 32 bits at zero at process start. Cross-process collision probability was reachable on Cloudflare Workers within days. Now seeded with 8 random bytes viau64::from_be_bytes. Regression testtest_instance_seed_uses_8_random_bytesasserts the low 32 bits show entropy at process start (pre-fix this test fails withlow32 = 1; post-fix passes). Wire format unchanged — nonces remain[instance_id(8)][counter(4)]= 12 bytes, only the seed entropy ofinstance_idincreases.Test plan
cargo fmt --check,cargo clippy --all-features -- -D warnings,cargo test --all-features(172+ tests),cargo deny check,cargo audit(0 vulns across 186 deps)cargo build --release -p cachekit-core -vshows-C opt-level=3 -C codegen-units=1 -C panic=abort -C strip=symbols -C debuginfo=line-tables-only(LTO operates at link time, confirmed via 20.7% binary shrinkage)cargo build -p cachekit-core --target wasm32-unknown-unknown --features encryptionsucceedsmake bench-quickruns to completion, prints throughput numbers for both bench groups across all 6 size bucketstest_instance_seed_uses_8_random_bytesfailed pre-fix (low32 = 1, the only value the old(rand_u32 << 32)seed could produce after one encryptor) and passes post-fixKnown follow-ups (out of scope for this PR)
Flagged by the code-quality review as APPROVED-WITH-NITS for
4119465. Not blocking — captured here so they don't get lost:core.rswould silently make the test useless on pre-fix code. Robust fix: factor seed generation into a pure function taking&dyn Fn() -> [u8; 8]and assert exact byte consumption — defer to a hardening follow-up.cargo vetfails locally (not in CI) because criterion brought in ~30 new transitive deps not insupply-chain/audits.toml. Phase 2 deletes the entiresupply-chain/machinery (per audit's catchphrase-agent recommendation), so this resolves itself in the next phase. CI gate is satisfied bycargo deny+cargo audit, both green.bench/bench-quicktargets break the surrounding style (no@-prefix, no$(BLUE)…$(RESET)banner, norequire_binarygate). Cosmetic — easy follow-up.Audit reference
This PR addresses items 1–3 of the cachekit-core 2026-05-29 parallel audit synthesis (security-specialist + deep-executor + catchphrase-agent). Phase 2 (deletions) and Phase 3 (hot-path) plans are at
tooling/sessions/tasks/h-cachekit-core-phase-{2-deletions,3-hot-path}.md.Summary by CodeRabbit
Security Improvements
Chores