Skip to content

Add policy and schema caching via cache()/cacheKey() API#350

Merged
muditchaudhary merged 5 commits intocedar-policy:mainfrom
geekphilosophy:integrated-caching
May 5, 2026
Merged

Add policy and schema caching via cache()/cacheKey() API#350
muditchaudhary merged 5 commits intocedar-policy:mainfrom
geekphilosophy:integrated-caching

Conversation

@geekphilosophy
Copy link
Copy Markdown
Contributor

Adds Rust-side caching of pre-parsed policies and schemas, integrated directly into PolicySet and Schema. When cached, BasicAuthorizationEngine transparently uses a fast stateful authorization path that skips re-parsing — no separate engine class or new authorization flow required.

Usage:

policySet.cache();
schema.cache();

var engine = new BasicAuthorizationEngine();
for (AuthorizationRequest req : requests) {
engine.isAuthorized(req, policySet, entities); // fast cached path
}
// Cached data is freed automatically when the object is GC'd.

API on PolicySet and Schema:

  • cache(): pre-parse and cache on the Rust side. One-way, idempotent. To use different policies/schema, create a new instance.
  • cacheKey(): returns Optional cache ID, empty if not cached.

Both policy set and schema must be cached for the fast path to be used. If only one is cached, authorization falls back to the uncached path to avoid silently skipping schema validation.

Implementation:

Rust (CedarJavaFFI):

  • Shared DashMap cache for parsed PolicySets and Schemas, accessible from any thread. Direct JNI entry points for cache removal (no JSON dispatch overhead).
  • Stateful authorization operation that looks up cached policies/schema by ID and authorizes against them.

Java:

  • PolicySet and Schema gain cache()/cacheKey() methods.
  • BasicAuthorizationEngine.isAuthorized() checks for cached inputs and dispatches to the stateful path transparently.
  • GC-based cleanup via java.lang.ref.Cleaner frees Rust cache entries.
  • SharedCedarInternals centralizes the Cleaner instance and JNI access. NativeHelpers remains package-private.

Benchmarks:

  • Full cross-product matrix: 4 policy sizes x 3 entity sizes, cached and uncached. Run via: ./gradlew jmh
  • Caching avoids re-parsing policies and schemas on every authorization call. The benefit scales with policy/schema complexity — from ~2x for trivial policies to over 20x for large policy sets with schemas

Adds Rust-side caching of pre-parsed policies and schemas, integrated
directly into PolicySet and Schema. When cached, BasicAuthorizationEngine
transparently uses a fast stateful authorization path that skips
re-parsing — no separate engine class or new authorization flow required.

Usage:

  policySet.cache();
  schema.cache();

  var engine = new BasicAuthorizationEngine();
  for (AuthorizationRequest req : requests) {
      engine.isAuthorized(req, policySet, entities);  // fast cached path
  }
  // Cached data is freed automatically when the object is GC'd.

API on PolicySet and Schema:
- cache(): pre-parse and cache on the Rust side. One-way, idempotent.
  To use different policies/schema, create a new instance.
- cacheKey(): returns Optional<String> cache ID, empty if not cached.

Both policy set and schema must be cached for the fast path to be used.
If only one is cached, authorization falls back to the uncached path
to avoid silently skipping schema validation.

Implementation:

Rust (CedarJavaFFI):
- Shared DashMap cache for parsed PolicySets and Schemas, accessible
  from any thread. Direct JNI entry points for cache removal (no JSON
  dispatch overhead).
- Stateful authorization operation that looks up cached policies/schema
  by ID and authorizes against them.

Java:
- PolicySet and Schema gain cache()/cacheKey() methods.
- BasicAuthorizationEngine.isAuthorized() checks for cached inputs and
  dispatches to the stateful path transparently.
- GC-based cleanup via java.lang.ref.Cleaner frees Rust cache entries.
- SharedCedarInternals centralizes the Cleaner instance and JNI access.
  NativeHelpers remains package-private.

Benchmarks:
- Full cross-product matrix: 4 policy sizes x 3 entity sizes, cached
  and uncached. Run via: ./gradlew jmh

Signed-off-by: Chris Simmons <simmonsc@amazon.com>
Comment thread CedarJava/build.gradle
String input = objectWriter().writeValueAsString(
new PreparsePolicySetRequest(id, this));
String response = SharedCedarInternals.callCedarJNI("PreparsePolicySet", input);
JsonNode node = new com.fasterxml.jackson.databind.ObjectMapper().readTree(response);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Replace fully-qualified inline usage of ObjectMapper with import.

  import com.fasterxml.jackson.databind.ObjectMapper;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressing.

Comment thread CedarJava/src/main/java/com/cedarpolicy/model/policy/PolicySet.java Outdated
Comment thread CedarJavaFFI/src/interface.rs Outdated
static AUTHORIZER: Authorizer = Authorizer::new();
}

static CACHED_POLICY_SETS: LazyLock<DashMap<String, PolicySet>> = LazyLock::new(DashMap::new);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have some cache eviction policy in place. Otherwise, if someone caches a lot of long-lived policy sets, we run into a risk of running out of memory. It won't be a Java OOM "graceful" error but probably a process killed error if CACHED_POLICY_SETS or CACHED_SCHEMAS grow too much. Unlikely for most use-cases but we should have it in place. Maybe give user some control over the eviction policy (e.g., TTL, maxEntries, etc.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at adding a bounded cache sex (maxEntries). Configurable through system property, no TTL or LRU as that seems like overkill here.

// Look up cached policy set
let policies = CACHED_POLICY_SETS
.get(&call.preparsed_policy_set_id)
.map(|r| r.clone());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be further optimized by using Arc<PolicySet> instead of performing a clone as isAuthorized just takes a reference to the policy set?

@mark-creamer-amazon What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think this is correct and performing the clone on the reference is cheaper.

Comment thread CedarJavaFFI/src/interface.rs Outdated
}
};

let principal = match principal.parse(Some("principal")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we can just reuse cedar_policy::ffi::is_authorized_json_str or cedar_policy::ffi::is_authorized_json_str like we do for non-stateful isAuthorized call instead of doing all the request parsing by ourselves.

cedar_policy::ffi already creates a thread local authorizer. Actually, looking at the cedar_policy::ffi implementation seems like it has some version of pre-parsed policy sets and schemas methods. Maybe we can re-use it? However, from a brief look I don't see it having any eviction mechanism though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cedar_policy::ffi does have preparse_policy_set, preparse_schema, and stateful_is_authorized with essentially the same structure. However, their cache is thread-local (RefCell inside thread_local!), which doesn't work for JNI: the Java thread that calls cache() is typically not the same thread that later calls isAuthorized() in a server thread pool. That's why we use a global DashMap — it's cross-thread. This was earlier feedback from a customer review.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Estimates are that a two level cache would save ~0.3% so probably not worth it..

Cache/uncache operations (preparsePolicySet, preparseSchema,
removeCachedPolicySet, removeCachedSchema) no longer route through
the generic callCedarJNI JSON dispatch. Each is now a direct JNI
native method on PolicySet or Schema, avoiding unnecessary JSON
serialization/deserialization overhead.

- Delete NativeHelpers (existed only for JSON dispatch delegation)
- Strip SharedCedarInternals to just the Cleaner
- Remove PreparsePolicySet/PreparseSchema/RemoveCachedPolicySet/
  RemoveCachedSchema from the Rust call_cedar dispatch router
- Add direct JNI entry points on PolicySet and Schema classes
- Fix fully-qualified annotation/import nits in PolicySet and
  interface.rs

Signed-off-by: Chris Simmons <simmonsc@amazon.com>
Comment thread CedarJava/src/main/java/com/cedarpolicy/model/policy/PolicySet.java
Cache refuses new entries when at capacity (default 1024) rather than
growing unbounded. When cache() can't store due to capacity, cacheId
stays null and authorization transparently uses the uncached path.

If a cached entry is evicted and the stateful authorization path
returns "not found", BasicAuthorizationEngine falls back to the
uncached path instead of throwing.

Configurable via system properties:
  -Dcedar.cache.maxPolicySets=1024
  -Dcedar.cache.maxSchemas=1024

Signed-off-by: Chris Simmons <simmonsc@amazon.com>

// --- Cross-product: policy size x entity size ---

// Small policies x {medium, large} entities
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering at first why we're leaving out the small policies x small entities case etc, then realized that yeah, they're above!

I wonder if there's an enumerating annotation we could use to have the same benchmarked method called across the parameters, similar to TestNg's DataProvider.

I see jmh has Param, which perhaps if we enumerated over an enum of sizes as input parameters, those could key into a map whose values are our various instance variables representing the differently sized policySet and entities.

Comment thread CedarJava/src/main/java/com/cedarpolicy/model/policy/PolicySet.java
*
* @throws AuthException if the policies fail to parse during caching.
*/
public synchronized void cache() throws AuthException {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not be throwing AuthException for cache() as AuthException is meant for:

Exception thrown when errors are encountered that prevent communication with the authorization engine.

We can introduce a new exception.

*/
public synchronized void cache() throws AuthException {
if (cacheId != null) {
return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current state, the user might call cache() on an already cached set and as they don't get any error or response about it being they might assume that the updated set is cached. They won't realize that it is stale until a stale auth request occurs or if they read the documentation carefully. We should either allow cache updates (I would prefer this as it is better CX and offers more flexibility) or throw in an exception if it is cached already (this will make the method non-idempotent).

For either of the two options, we can enforce idempotency by storing the cached input and comparing the current cache request with the already cached input. If it is the same, we don't do anything. If it is different, we update the cache or throw exception depending on the direction we go.

throw new AuthException("JSON Serialization Error", e);
} catch (InternalException e) {
if (e.getMessage() != null && e.getMessage().contains("cache is full")) {
return false;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When cache is full, it looks like the subsequent cache calls will "fail" silently as upstream cache() method does not return any error in this case which would be confusing for the user to debug.

Comment thread CedarJavaFFI/src/interface.rs Outdated
};
let policies_json: String = match env.get_string(&j_policies_json) {
Ok(s) => s.into(),
Err(_) => return,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will cause silent failures. We should expose these errors.

For example:

/// public string-based JSON interface to parse a schema in Cedar's cedar-readable format
#[jni_fn("com.cedarpolicy.model.schema.Schema")]
pub fn parseCedarSchemaJni<'a>(mut env: JNIEnv<'a>, _: JClass, schema_jstr: JString<'a>) -> jvalue {
    match parse_cedar_schema_internal(&mut env, schema_jstr) {
        Ok(v) => v.as_jni(),
        Err(e) => jni_failed(&mut env, e.as_ref()),
    }
}

fn parse_json_schema_internal<'a>(
    env: &mut JNIEnv<'a>,
    schema_jstr: JString<'a>,
) -> Result<JValueOwned<'a>> {
    if schema_jstr.is_null() {
        raise_npe(env)
    } else {
        let schema_jstring = env.get_string(&schema_jstr)?;
        let schema_string = String::from(schema_jstring);
        match Schema::from_json_str(&schema_string) {
            Err(e) => Err(Box::new(e)),
            Ok(_) => Ok(JValueGen::Object(env.new_string("success")?.into())),
        }
    }
}

let policy_set_ffi: PolicySetFFI = match serde_json::from_str(&policies_json) {
Ok(ps) => ps,
Err(e) => {
let _ = env.throw_new(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we can reuse jni_failed method here to construct the error.

Comment thread CedarJavaFFI/src/interface.rs Outdated
/// Direct JNI entry point to remove a cached policy set by ID.
#[jni_fn("com.cedarpolicy.model.policy.PolicySet")]
pub fn removeCachedPolicySetJni(mut env: JNIEnv<'_>, _class: JClass<'_>, j_id: JString<'_>) {
if let Ok(id_str) = env.get_string(&j_id) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent failure similar to one mentioned above when using env.get_string(_)

};

// Authorize using cached data
let response =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit skeptical of not reusing cedar_policy::ffi::is_authorized here. We use it through cedar_policy::ffi::is_authorized_json_str for our non-stateful calls. I would prefer to keep the stateful and non-stateful isAuth calls to be as same as possible when hitting cedar_policy to avoid potential drifts.

For example, here we set the ans to:

    let ans = AuthorizationAnswer::Success {
        response: response.into(),
        warnings: vec![],
    };

However, for non-stateful requests this is constructed like:

pub fn is_authorized(call: AuthorizationCall) -> AuthorizationAnswer {
    match call.parse() {
        WithWarnings {
            t: Ok((request, policies, entities)),
            warnings,
        } => AuthorizationAnswer::Success {
            response: AUTHORIZER.with(|authorizer| {
                authorizer
                    .is_authorized(&request, &policies, &entities)
                    .into()
            }),
            warnings: warnings.into_iter().map(Into::into).collect(),
        },
        WithWarnings {
            t: Err(errors),
            warnings,
        } => AuthorizationAnswer::Failure {
            errors: errors.into_iter().map(Into::into).collect(),
            warnings: warnings.into_iter().map(Into::into).collect(),
        },
    }
}

Although we are constructing AuthorizationAnswer::Failure in make_auth_failure to replicate this behavior but that just adds another piece of code that we'll have to keep in sync for future and introduces a potential source of inconsistency for the same API based on whether stateful or non-stateful isAuth request was made. We'll still need to use make_auth_failure for exposing errors for pre-parsed schema/PS failures when we send the parsed resources to cedar_policy::ffi::is_authorized

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed we should avoid drift. The immediate gap I see is that we're hardcoding warnings: vec![] in the stateful success path, dropping any parse warnings that would normally come from entity/context parsing with a schema present.

I think the fix is to propagate warnings from our context/entity parsing (matching how the upstream call.parse() collects them), while keeping make_auth_failure for the pre-lookup errors (policy set/schema not found). Does that match what you're looking for?

On the broader point of reusing cedar_policy::ffi::is_authorized directly — the blocker is that it takes an AuthorizationCall with an unparsed PolicySet (the FFI struct), so we'd have to re-serialize our cached cedar_policy::PolicySet back to the FFI representation just to have it re-parsed, which defeats the purpose of caching. If upstream ever exposes a variant that accepts a pre-resolved PolicySet + Schema we could delegate fully. Until then, is keeping the response construction in sync (especially warnings) sufficient?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix is to propagate warnings from our context/entity parsing (matching how the upstream call.parse() collects them), while keeping make_auth_failure for the pre-lookup errors (policy set/schema not found). Does that match what you're looking for?

Yeah, we should do this and then we can iterate to reduce the surface for potential drifts

Copy link
Copy Markdown
Contributor

@muditchaudhary muditchaudhary May 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline. If schema and policy sets are pre-parsed, generating the warnings here would require another parse of schema. Also, looking at stateful auth in cedar_policy::ffi does not populate warnings. So, there is some precedent. We can leave it as-is in this case.

- Add CacheException for cache() failures (AuthException was semantically
  wrong — it's meant for authorization engine communication errors).
- cache() now re-parses and updates the Rust-side entry when called again
  after mutation, using the same cache ID. Supports mutate-then-recache
  without requiring a new object.
- Throw CacheException when cache is full instead of silent failure.
- Pass DEFAULT_MAX_CACHED to native setter unconditionally so the Java
  constant is the single source of truth.
- Throw InternalException on JNI string read failures in all cache
  native methods instead of silently returning.
- Update Javadoc to document re-caching behavior.

Signed-off-by: Chris Simmons <simmonsc@amazon.com>
}

@Test
public void stressCacheMany() throws AuthException, CacheException {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also test the cache full scenario

- Make setCacheMaxPolicySets public so tests can set a small limit
- Test verifies CacheException is thrown when cache is full and that
  uncached authorization still works as fallback

Signed-off-by: Chris Simmons <simmonsc@amazon.com>
Comment on lines +74 to +78
Optional<String> policySetKey = policySet.cacheKey();
if (policySetKey.isPresent()) {
boolean hasSchema = q.schema.isPresent();
Optional<String> schemaKey = hasSchema ? q.schema.get().cacheKey() : Optional.empty();
if (!hasSchema || schemaKey.isPresent()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think we could consolidate this to one if condition and predicate:

Optional<String> policySetKey = policySet.cacheKey();
Optional<String> schemaKey = q.schema.isPresent() ? q.schema.get().cacheKey() : Optional.empty();

if (policySetKey.isPresent() && (!q.schema.isPresent() || schemaKey.isPresent())) {
    ...
}

// Look up cached policy set
let policies = CACHED_POLICY_SETS
.get(&call.preparsed_policy_set_id)
.map(|r| r.clone());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I think this is correct and performing the clone on the reference is cheaper.

@muditchaudhary muditchaudhary merged commit 0f9da9b into cedar-policy:main May 5, 2026
4 checks passed
@geekphilosophy geekphilosophy deleted the integrated-caching branch May 5, 2026 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants