Add policy and schema caching via cache()/cacheKey() API#350
Add policy and schema caching via cache()/cacheKey() API#350muditchaudhary merged 5 commits intocedar-policy:mainfrom
Conversation
902cceb to
5ae66b9
Compare
Adds Rust-side caching of pre-parsed policies and schemas, integrated
directly into PolicySet and Schema. When cached, BasicAuthorizationEngine
transparently uses a fast stateful authorization path that skips
re-parsing — no separate engine class or new authorization flow required.
Usage:
policySet.cache();
schema.cache();
var engine = new BasicAuthorizationEngine();
for (AuthorizationRequest req : requests) {
engine.isAuthorized(req, policySet, entities); // fast cached path
}
// Cached data is freed automatically when the object is GC'd.
API on PolicySet and Schema:
- cache(): pre-parse and cache on the Rust side. One-way, idempotent.
To use different policies/schema, create a new instance.
- cacheKey(): returns Optional<String> cache ID, empty if not cached.
Both policy set and schema must be cached for the fast path to be used.
If only one is cached, authorization falls back to the uncached path
to avoid silently skipping schema validation.
Implementation:
Rust (CedarJavaFFI):
- Shared DashMap cache for parsed PolicySets and Schemas, accessible
from any thread. Direct JNI entry points for cache removal (no JSON
dispatch overhead).
- Stateful authorization operation that looks up cached policies/schema
by ID and authorizes against them.
Java:
- PolicySet and Schema gain cache()/cacheKey() methods.
- BasicAuthorizationEngine.isAuthorized() checks for cached inputs and
dispatches to the stateful path transparently.
- GC-based cleanup via java.lang.ref.Cleaner frees Rust cache entries.
- SharedCedarInternals centralizes the Cleaner instance and JNI access.
NativeHelpers remains package-private.
Benchmarks:
- Full cross-product matrix: 4 policy sizes x 3 entity sizes, cached
and uncached. Run via: ./gradlew jmh
Signed-off-by: Chris Simmons <simmonsc@amazon.com>
5ae66b9 to
81654a8
Compare
| String input = objectWriter().writeValueAsString( | ||
| new PreparsePolicySetRequest(id, this)); | ||
| String response = SharedCedarInternals.callCedarJNI("PreparsePolicySet", input); | ||
| JsonNode node = new com.fasterxml.jackson.databind.ObjectMapper().readTree(response); |
There was a problem hiding this comment.
Nit: Replace fully-qualified inline usage of ObjectMapper with import.
import com.fasterxml.jackson.databind.ObjectMapper;
| static AUTHORIZER: Authorizer = Authorizer::new(); | ||
| } | ||
|
|
||
| static CACHED_POLICY_SETS: LazyLock<DashMap<String, PolicySet>> = LazyLock::new(DashMap::new); |
There was a problem hiding this comment.
We should have some cache eviction policy in place. Otherwise, if someone caches a lot of long-lived policy sets, we run into a risk of running out of memory. It won't be a Java OOM "graceful" error but probably a process killed error if CACHED_POLICY_SETS or CACHED_SCHEMAS grow too much. Unlikely for most use-cases but we should have it in place. Maybe give user some control over the eviction policy (e.g., TTL, maxEntries, etc.)
There was a problem hiding this comment.
Looking at adding a bounded cache sex (maxEntries). Configurable through system property, no TTL or LRU as that seems like overkill here.
| // Look up cached policy set | ||
| let policies = CACHED_POLICY_SETS | ||
| .get(&call.preparsed_policy_set_id) | ||
| .map(|r| r.clone()); |
There was a problem hiding this comment.
Maybe this could be further optimized by using Arc<PolicySet> instead of performing a clone as isAuthorized just takes a reference to the policy set?
@mark-creamer-amazon What do you think?
There was a problem hiding this comment.
Yup, I think this is correct and performing the clone on the reference is cheaper.
| } | ||
| }; | ||
|
|
||
| let principal = match principal.parse(Some("principal")) { |
There was a problem hiding this comment.
Wondering if we can just reuse cedar_policy::ffi::is_authorized_json_str or cedar_policy::ffi::is_authorized_json_str like we do for non-stateful isAuthorized call instead of doing all the request parsing by ourselves.
cedar_policy::ffi already creates a thread local authorizer. Actually, looking at the cedar_policy::ffi implementation seems like it has some version of pre-parsed policy sets and schemas methods. Maybe we can re-use it? However, from a brief look I don't see it having any eviction mechanism though.
There was a problem hiding this comment.
cedar_policy::ffi does have preparse_policy_set, preparse_schema, and stateful_is_authorized with essentially the same structure. However, their cache is thread-local (RefCell inside thread_local!), which doesn't work for JNI: the Java thread that calls cache() is typically not the same thread that later calls isAuthorized() in a server thread pool. That's why we use a global DashMap — it's cross-thread. This was earlier feedback from a customer review.
There was a problem hiding this comment.
Estimates are that a two level cache would save ~0.3% so probably not worth it..
Cache/uncache operations (preparsePolicySet, preparseSchema, removeCachedPolicySet, removeCachedSchema) no longer route through the generic callCedarJNI JSON dispatch. Each is now a direct JNI native method on PolicySet or Schema, avoiding unnecessary JSON serialization/deserialization overhead. - Delete NativeHelpers (existed only for JSON dispatch delegation) - Strip SharedCedarInternals to just the Cleaner - Remove PreparsePolicySet/PreparseSchema/RemoveCachedPolicySet/ RemoveCachedSchema from the Rust call_cedar dispatch router - Add direct JNI entry points on PolicySet and Schema classes - Fix fully-qualified annotation/import nits in PolicySet and interface.rs Signed-off-by: Chris Simmons <simmonsc@amazon.com>
Cache refuses new entries when at capacity (default 1024) rather than growing unbounded. When cache() can't store due to capacity, cacheId stays null and authorization transparently uses the uncached path. If a cached entry is evicted and the stateful authorization path returns "not found", BasicAuthorizationEngine falls back to the uncached path instead of throwing. Configurable via system properties: -Dcedar.cache.maxPolicySets=1024 -Dcedar.cache.maxSchemas=1024 Signed-off-by: Chris Simmons <simmonsc@amazon.com>
|
|
||
| // --- Cross-product: policy size x entity size --- | ||
|
|
||
| // Small policies x {medium, large} entities |
There was a problem hiding this comment.
Was wondering at first why we're leaving out the small policies x small entities case etc, then realized that yeah, they're above!
I wonder if there's an enumerating annotation we could use to have the same benchmarked method called across the parameters, similar to TestNg's DataProvider.
I see jmh has Param, which perhaps if we enumerated over an enum of sizes as input parameters, those could key into a map whose values are our various instance variables representing the differently sized policySet and entities.
| * | ||
| * @throws AuthException if the policies fail to parse during caching. | ||
| */ | ||
| public synchronized void cache() throws AuthException { |
There was a problem hiding this comment.
We should not be throwing AuthException for cache() as AuthException is meant for:
Exception thrown when errors are encountered that prevent communication with the authorization engine.
We can introduce a new exception.
| */ | ||
| public synchronized void cache() throws AuthException { | ||
| if (cacheId != null) { | ||
| return; |
There was a problem hiding this comment.
In the current state, the user might call cache() on an already cached set and as they don't get any error or response about it being they might assume that the updated set is cached. They won't realize that it is stale until a stale auth request occurs or if they read the documentation carefully. We should either allow cache updates (I would prefer this as it is better CX and offers more flexibility) or throw in an exception if it is cached already (this will make the method non-idempotent).
For either of the two options, we can enforce idempotency by storing the cached input and comparing the current cache request with the already cached input. If it is the same, we don't do anything. If it is different, we update the cache or throw exception depending on the direction we go.
| throw new AuthException("JSON Serialization Error", e); | ||
| } catch (InternalException e) { | ||
| if (e.getMessage() != null && e.getMessage().contains("cache is full")) { | ||
| return false; |
There was a problem hiding this comment.
When cache is full, it looks like the subsequent cache calls will "fail" silently as upstream cache() method does not return any error in this case which would be confusing for the user to debug.
| }; | ||
| let policies_json: String = match env.get_string(&j_policies_json) { | ||
| Ok(s) => s.into(), | ||
| Err(_) => return, |
There was a problem hiding this comment.
This will cause silent failures. We should expose these errors.
For example:
/// public string-based JSON interface to parse a schema in Cedar's cedar-readable format
#[jni_fn("com.cedarpolicy.model.schema.Schema")]
pub fn parseCedarSchemaJni<'a>(mut env: JNIEnv<'a>, _: JClass, schema_jstr: JString<'a>) -> jvalue {
match parse_cedar_schema_internal(&mut env, schema_jstr) {
Ok(v) => v.as_jni(),
Err(e) => jni_failed(&mut env, e.as_ref()),
}
}
fn parse_json_schema_internal<'a>(
env: &mut JNIEnv<'a>,
schema_jstr: JString<'a>,
) -> Result<JValueOwned<'a>> {
if schema_jstr.is_null() {
raise_npe(env)
} else {
let schema_jstring = env.get_string(&schema_jstr)?;
let schema_string = String::from(schema_jstring);
match Schema::from_json_str(&schema_string) {
Err(e) => Err(Box::new(e)),
Ok(_) => Ok(JValueGen::Object(env.new_string("success")?.into())),
}
}
}
| let policy_set_ffi: PolicySetFFI = match serde_json::from_str(&policies_json) { | ||
| Ok(ps) => ps, | ||
| Err(e) => { | ||
| let _ = env.throw_new( |
There was a problem hiding this comment.
I believe we can reuse jni_failed method here to construct the error.
| /// Direct JNI entry point to remove a cached policy set by ID. | ||
| #[jni_fn("com.cedarpolicy.model.policy.PolicySet")] | ||
| pub fn removeCachedPolicySetJni(mut env: JNIEnv<'_>, _class: JClass<'_>, j_id: JString<'_>) { | ||
| if let Ok(id_str) = env.get_string(&j_id) { |
There was a problem hiding this comment.
Silent failure similar to one mentioned above when using env.get_string(_)
| }; | ||
|
|
||
| // Authorize using cached data | ||
| let response = |
There was a problem hiding this comment.
I'm a bit skeptical of not reusing cedar_policy::ffi::is_authorized here. We use it through cedar_policy::ffi::is_authorized_json_str for our non-stateful calls. I would prefer to keep the stateful and non-stateful isAuth calls to be as same as possible when hitting cedar_policy to avoid potential drifts.
For example, here we set the ans to:
let ans = AuthorizationAnswer::Success {
response: response.into(),
warnings: vec![],
};
However, for non-stateful requests this is constructed like:
pub fn is_authorized(call: AuthorizationCall) -> AuthorizationAnswer {
match call.parse() {
WithWarnings {
t: Ok((request, policies, entities)),
warnings,
} => AuthorizationAnswer::Success {
response: AUTHORIZER.with(|authorizer| {
authorizer
.is_authorized(&request, &policies, &entities)
.into()
}),
warnings: warnings.into_iter().map(Into::into).collect(),
},
WithWarnings {
t: Err(errors),
warnings,
} => AuthorizationAnswer::Failure {
errors: errors.into_iter().map(Into::into).collect(),
warnings: warnings.into_iter().map(Into::into).collect(),
},
}
}
Although we are constructing AuthorizationAnswer::Failure in make_auth_failure to replicate this behavior but that just adds another piece of code that we'll have to keep in sync for future and introduces a potential source of inconsistency for the same API based on whether stateful or non-stateful isAuth request was made. We'll still need to use make_auth_failure for exposing errors for pre-parsed schema/PS failures when we send the parsed resources to cedar_policy::ffi::is_authorized
There was a problem hiding this comment.
Agreed we should avoid drift. The immediate gap I see is that we're hardcoding warnings: vec![] in the stateful success path, dropping any parse warnings that would normally come from entity/context parsing with a schema present.
I think the fix is to propagate warnings from our context/entity parsing (matching how the upstream call.parse() collects them), while keeping make_auth_failure for the pre-lookup errors (policy set/schema not found). Does that match what you're looking for?
On the broader point of reusing cedar_policy::ffi::is_authorized directly — the blocker is that it takes an AuthorizationCall with an unparsed PolicySet (the FFI struct), so we'd have to re-serialize our cached cedar_policy::PolicySet back to the FFI representation just to have it re-parsed, which defeats the purpose of caching. If upstream ever exposes a variant that accepts a pre-resolved PolicySet + Schema we could delegate fully. Until then, is keeping the response construction in sync (especially warnings) sufficient?
There was a problem hiding this comment.
I think the fix is to propagate warnings from our context/entity parsing (matching how the upstream call.parse() collects them), while keeping make_auth_failure for the pre-lookup errors (policy set/schema not found). Does that match what you're looking for?
Yeah, we should do this and then we can iterate to reduce the surface for potential drifts
There was a problem hiding this comment.
Talked offline. If schema and policy sets are pre-parsed, generating the warnings here would require another parse of schema. Also, looking at stateful auth in cedar_policy::ffi does not populate warnings. So, there is some precedent. We can leave it as-is in this case.
- Add CacheException for cache() failures (AuthException was semantically wrong — it's meant for authorization engine communication errors). - cache() now re-parses and updates the Rust-side entry when called again after mutation, using the same cache ID. Supports mutate-then-recache without requiring a new object. - Throw CacheException when cache is full instead of silent failure. - Pass DEFAULT_MAX_CACHED to native setter unconditionally so the Java constant is the single source of truth. - Throw InternalException on JNI string read failures in all cache native methods instead of silently returning. - Update Javadoc to document re-caching behavior. Signed-off-by: Chris Simmons <simmonsc@amazon.com>
| } | ||
|
|
||
| @Test | ||
| public void stressCacheMany() throws AuthException, CacheException { |
There was a problem hiding this comment.
Let's also test the cache full scenario
- Make setCacheMaxPolicySets public so tests can set a small limit - Test verifies CacheException is thrown when cache is full and that uncached authorization still works as fallback Signed-off-by: Chris Simmons <simmonsc@amazon.com>
| Optional<String> policySetKey = policySet.cacheKey(); | ||
| if (policySetKey.isPresent()) { | ||
| boolean hasSchema = q.schema.isPresent(); | ||
| Optional<String> schemaKey = hasSchema ? q.schema.get().cacheKey() : Optional.empty(); | ||
| if (!hasSchema || schemaKey.isPresent()) { |
There was a problem hiding this comment.
Nit: I think we could consolidate this to one if condition and predicate:
Optional<String> policySetKey = policySet.cacheKey();
Optional<String> schemaKey = q.schema.isPresent() ? q.schema.get().cacheKey() : Optional.empty();
if (policySetKey.isPresent() && (!q.schema.isPresent() || schemaKey.isPresent())) {
...
}
| // Look up cached policy set | ||
| let policies = CACHED_POLICY_SETS | ||
| .get(&call.preparsed_policy_set_id) | ||
| .map(|r| r.clone()); |
There was a problem hiding this comment.
Yup, I think this is correct and performing the clone on the reference is cheaper.
Adds Rust-side caching of pre-parsed policies and schemas, integrated directly into PolicySet and Schema. When cached, BasicAuthorizationEngine transparently uses a fast stateful authorization path that skips re-parsing — no separate engine class or new authorization flow required.
Usage:
policySet.cache();
schema.cache();
var engine = new BasicAuthorizationEngine();
for (AuthorizationRequest req : requests) {
engine.isAuthorized(req, policySet, entities); // fast cached path
}
// Cached data is freed automatically when the object is GC'd.
API on PolicySet and Schema:
Both policy set and schema must be cached for the fast path to be used. If only one is cached, authorization falls back to the uncached path to avoid silently skipping schema validation.
Implementation:
Rust (CedarJavaFFI):
Java:
Benchmarks: