feat(normalizr): Add per-field detectCycles and entityDepth for denormalization #3840
Open
jayseo5953 wants to merge 2 commits into reactive:master
Conversation
* docs: Add v0.16 migration codemod

  jscodeshift transform served from dataclient.io/codemods/v0.16.js handling path-to-regexp v8 syntax, useFetch() .resolved, and schema namespace to direct imports. Linked from the blog post migration guide.

  Made-with: Cursor

* fix: bugbot

* Fix schema codemod: aliased import skipped when name already exists in same declaration

  When a name like Entity is both directly imported and used via schema.Entity in the same import declaration, the codemod had two bugs:

  1. scopeBindings included specifiers from the same import declaration (other than the schema specifier itself), causing resolveLocal to incorrectly detect a conflict and rename schema.Entity to SchemaEntity. Fixed by skipping the entire current import declaration when building scopeBindings, since those names come from the same package and won't conflict.
  2. The existing-import dedup check compared imported names (e.g. 'Entity') rather than local names. When an alias was needed (Entity as SchemaEntity), finding 'Entity' already present caused the aliased specifier to be skipped, leaving SchemaEntity undefined. Fixed by checking existingLocals (s.local.name) against localName instead.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix: scope useFetch truthiness rewrites to enclosing function

  transformUseFetch previously collected variable names into a file-global set, then rewrote every matching identifier in any if/ternary/&& across the entire file. If two components in the same file both declared 'const data = ...' — one from useFetch() and one from an unrelated source — the non-useFetch truthiness check would be incorrectly rewritten to '!data.resolved'. Fix by walking up to the enclosing function for each useFetch() declarator and only rewriting truthiness checks within that scope.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): use optional chaining in useFetch truthiness rewrite

  The rewrite function transformed `if (promise)` into `if (!promise.resolved)`, but useFetch returns undefined when called with null args (conditional fetching). Accessing .resolved on undefined throws a TypeError at runtime. Use optional chaining (promise?.resolved) to guard against undefined.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): rewrite useFetch vars on both sides of LogicalExpression

  The LogicalExpression handler only called rewrite(p.node.left), so a useFetch variable on the right side of && or || (e.g. 'condition && promise') was never transformed. In v0.16, promise is always a truthy UsablePromise object when args are non-null, so untransformed right-side checks silently became no-ops. Now both left and right sides are rewritten, and || expressions are handled alongside &&.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): preserve schema import when bare identifier references exist

  The transformSchemaImports function was unconditionally removing the schema import specifier after rewriting schema.X member expressions, but standalone references to 'schema' as a bare identifier (destructuring, function arguments, typeof) were left unrewritten, causing ReferenceError at runtime. Now checks for remaining bare references before removing the import. If any exist, the schema specifier is kept while still adding the individual named imports for the rewritten member expressions.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix: bugbot

* fix(codemod): skip schema.Object and schema.Array in v0.16 transform

  Object and Array are type-only exports from @data-client/endpoint (export type { Array, Object }), not value exports. The codemod was transforming schema.Object/schema.Array into direct imports like 'import { Object as SchemaObject }', which would fail in value positions (e.g. new SchemaObject(...)). These members must remain accessed via the schema namespace.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* Fix transformUseFetch to only apply when useFetch is imported from @data-client

  transformUseFetch previously matched any call named useFetch without verifying it was imported from a @data-client package. Since useFetch is a common hook name exported by other libraries (e.g. use-http), running this codemod on codebases using a different useFetch would silently corrupt truthiness checks. Added an import-source guard that checks for a useFetch import from any @data-client/* package before applying the transformation, consistent with how transformSchemaImports gates on DATA_CLIENT_PACKAGES.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): skip schema.X rewrites where schema is shadowed by local binding

  transformSchemaImports was replacing ALL schema.X member expressions file-wide without checking whether 'schema' at each usage site actually referred to the import or to a local variable/parameter that shadows it. For example, a function parameter named 'schema' (e.g., function validate(schema) { return schema.isValid; }) in the same file as import { schema } from '@data-client/endpoint' would have schema.isValid incorrectly replaced with a bare isValid identifier. Add isShadowed() helper that walks up the AST from each schema.X node to check for shadowing bindings in enclosing scopes (function params, catch clauses, variable declarations, for-loop bindings). Skip the replacement when the identifier is shadowed.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): preserve boolean semantics for useFetch conditional fetching

  The rewrite function transformed:
  - `promise` → `!promise?.resolved`
  - `!promise` → `promise?.resolved`

  This inverts boolean semantics when useFetch returns undefined (null args for conditional fetching). `undefined?.resolved` evaluates to `undefined` via optional chaining, so `!undefined` is `true` — but the original `if (promise)` with `undefined` was `false`. Fix by using strict equality against `false`:
  - `promise` → `promise?.resolved === false`
  - `!promise` → `promise?.resolved !== false`

  This preserves the original semantics for all three states: in-flight promise (resolved=false), resolved promise (resolved=true), and disabled fetch (undefined).

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): detect top-level var/function/class bindings in scopeBindings

  scopeBindings only collected local names from import declarations, missing top-level variable, function, and class declarations. This caused resolveLocal to miss naming conflicts (e.g. const Union = someValue), producing invalid code with duplicate bindings. Now collects bindings from:
  - Top-level VariableDeclarations (including destructured patterns)
  - Top-level FunctionDeclarations
  - Top-level ClassDeclarations
  - Exported variants of all the above

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* fix(codemod): detect FunctionDeclaration/ClassDeclaration shadowing in isShadowed

  The isShadowed function only checked VariableDeclaration statements when scanning BlockStatement/Program nodes. This missed FunctionDeclaration and ClassDeclaration names that also create bindings in block scope. When a local function or class with the same name as the import alias (e.g. 'schema') existed in an enclosing block, isShadowed incorrectly returned false, causing the codemod to transform references to those local bindings.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* Fix transformUseFetch rewriting shadowed variables in nested callbacks

  transformUseFetch searched the entire enclosing function for truthiness checks on the useFetch variable name, but never checked whether that name was shadowed by a parameter or local variable in a nested function or callback. This caused incorrect rewrites like adding .resolved to a callback parameter that happened to share the same name. Fix: reuse the existing isShadowed() helper (already used by transformSchemaImports) with an optional stopNode parameter to limit the shadow walk to scopes between the found expression and the function where useFetch was declared.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* Guard transformPaths with @data-client import check

  transformPaths was transforming every StringLiteral with a 'path' property key in the entire file, regardless of whether the file uses @data-client. This could silently corrupt path strings from unrelated libraries (React Router, Express, etc.) like /:param? → {/:param}. Add the same DATA_CLIENT_PACKAGES import guard that transformUseFetch already uses, so transformPaths only runs on files that actually import from @data-client/endpoint or @data-client/rest.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

* Fix resolveLocal collision: verify 'Schema'+name is also free of conflicts

  resolveLocal prefixed a name with 'Schema' when it collided with a JS global or scope binding, but never verified the resulting identifier was itself free of collisions. If the user had an existing binding named e.g. SchemaUnion, the codemod would silently produce a conflicting identifier. Now the function loops, prepending '_' to the candidate until it finds an identifier that doesn't collide with JS_GLOBALS or scopeBindings.

  Co-authored-by: Nathaniel Tucker <me@ntucker.me>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
feat(normalizr): Add per-field detectCycles and entityDepth for denormalization
Adds two per-field schema options for controlling denormalization depth:
- `detectCycles: true` — stops when the same entity type appears twice
in the ancestor path. Targets cross-type bidirectional cycles precisely.
- `entityDepth: N` — relative depth limit from the field. For self-referential
relationships (parent/children) where the type repeats legitimately.
Usage:

```typescript
static schema = {
  siteOrganizations: { schema: [SiteOrganization], detectCycles: true },
  children: { schema: [Organization], entityDepth: 3 },
};
```
Depth-limited entities read from cache (returning fully resolved versions
if available) but never write to cache, preventing cache poisoning.
Works alongside existing maxEntityDepth as a global safety net.
Fixes reactive#3822
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from d914676 to d67e250.
@jayseo5953 is attempting to deploy a commit to the data-client Team on Vercel. A member of the Team first needs to authorize it.
Fixes #3822
Problem
data-client eagerly denormalizes all reachable entities depth-first. With bidirectional relationships (Department → Building → Department → Building), each hop visits a different primary key, so same-PK cycle detection never fires. The traversal recurses through the entire entity graph until the JS call stack overflows.
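To make the failure mode concrete, here is a minimal sketch of such a depth-first walk with same-PK cycle detection along the ancestor path. The store shape, `buildChain`, and `walk` are invented for illustration and are not data-client's internals; the point is that on a cross-type chain every hop lands on a different primary key, so the check never fires and recursion depth grows with the connected component:

```typescript
// Invented store shape for illustration (NOT data-client's internals): a
// depth-first walk with same-PK cycle detection along the ancestor path.
type Ref = { type: string; pk: string };
type Store = Record<string, Record<string, { refs: Ref[] }>>;

// A long alternating chain: Department:0 -> Building:0 -> Department:1 -> ...
function buildChain(n: number): Store {
  const store: Store = { Department: {}, Building: {} };
  for (let i = 0; i < n; i++) {
    store.Department[String(i)] = {
      refs: [{ type: 'Building', pk: String(i) }],
    };
    store.Building[String(i)] = {
      refs: i + 1 < n ? [{ type: 'Department', pk: String(i + 1) }] : [],
    };
  }
  return store;
}

let maxDepth = 0;
function walk(store: Store, ref: Ref, path: string[], depth: number): void {
  maxDepth = Math.max(maxDepth, depth);
  const key = `${ref.type}:${ref.pk}`;
  if (path.includes(key)) return; // same-PK check: never fires on this chain
  for (const next of store[ref.type][ref.pk].refs) {
    walk(store, next, [...path, key], depth + 1);
  }
}

walk(buildChain(1000), { type: 'Department', pk: '0' }, [], 0);
console.log(maxDepth); // 1999: depth scales with the connected component
```

With real bidirectional data the walk is bounded only by the size of the connected component, which overflows the JS call stack long before any primary key repeats on the path.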
The existing `maxEntityDepth` on Entity (PR #3823, #3834) mitigates this with a global depth counter, but it has issues: `maxEntityDepth` doesn't control nesting depth; it controls at what global depth the entity stops resolving. The docs describe it as "limits entity nesting depth," but it checks the global depth counter, not entity-specific nesting. For example:

- What a user expects: Department can nest 10 levels deep (parent → parent → ... 10 times).
- What actually happens: Department won't resolve if the global depth counter is ≥ 10, regardless of how many of those hops were Department entities. If Department is first encountered at global depth 8 (nested inside Building → Room → Document → ...), it only gets 2 more hops before the limit fires. The same Department encountered at global depth 2 gets 8 hops. The effective nesting depth is unpredictable.
Additional issues:

- `depthLimitEntity` bypasses the cache, creating duplicate shallow copies; with real production data, this produces 3.8M uncached copies for 500 entities.

This PR removes `maxEntityDepth` from Entity and replaces it with per-field controls that accurately describe their behavior. The hardcoded global `MAX_ENTITY_DEPTH` (64) remains as a safety net.

Solution
Two per-field schema options that target the actual problems:
1. `detectCycles: true`: schema-type cycle detection

Stops when the same entity type appears twice in the current ancestor path. This targets the root cause: bidirectional back-references creating infinite traversal loops.

Why this is better than a global depth counter for bidirectional cycles:

- `Department → Building` and `Building → Department` both resolve, but the cyclic loop `Department → Building → Department → Building → ...` is stopped
- `detectCycles` denormalizes 3,000 entities in 2ms vs 17,285ms with `maxEntityDepth: 12`
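The type-based check can be modeled in a few lines. This is a simplified illustration with invented store shapes, not the PR's actual code: stop before resolving an entity whose *type* already appears in the ancestor path, regardless of primary key.

```typescript
// Simplified model of detectCycles (invented shapes, not the PR's code).
type Ref = { type: string; pk: string };
type Store = Record<string, Record<string, { refs: Ref[] }>>;

// Bidirectional back-references across two departments and two buildings.
const store: Store = {
  Department: {
    d1: { refs: [{ type: 'Building', pk: 'b1' }] },
    d2: { refs: [{ type: 'Building', pk: 'b2' }] },
  },
  Building: {
    b1: { refs: [{ type: 'Department', pk: 'd2' }] },
    b2: { refs: [{ type: 'Department', pk: 'd1' }] },
  },
};

let visits = 0;
function walk(ref: Ref, ancestorTypes: string[]): void {
  if (ancestorTypes.includes(ref.type)) return; // detectCycles: type repeats
  visits++;
  for (const next of store[ref.type][ref.pk].refs) {
    walk(next, [...ancestorTypes, ref.type]);
  }
}

walk({ type: 'Department', pk: 'd1' }, []);
console.log(visits); // 2: d1 and b1 resolve; the hop back into Department is cut
```

Because the check is on the ancestor path rather than a global counter, unrelated branches of the tree can each still resolve the Department → Building hop once.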
2. `entityDepth: N`: relative depth limit for self-referential fields

For self-referential relationships (parent/children hierarchies) where the entity type repeats legitimately. `detectCycles` would stop these at 1 hop (too aggressive). `entityDepth` provides a relative counter from where the field config is set.

Unlike `maxEntityDepth`, `entityDepth` is relative: "allow N hops from this field" means the same thing regardless of where the entity sits in the overall tree.

Limitation: `entityDepth` is designed for self-referential fields on the same entity. When multiple entities in a traversal chain use `entityDepth` on cross-type relationships, the outermost one controls the subtree. Use `detectCycles` for cross-type relationships.

3. Cache-safe depth limiting
Depth-limited entities read from cache (returning fully resolved versions if available) but don't write to cache. This prevents cache poisoning where an entity first encountered deep in one subtree gets cached as a shallow copy and served to top-level consumers that need the fully resolved version.
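As a simplified model of this rule (the `cache` map, `resolve` function, and `shallow` flag here are invented for illustration; the real memoization lives in data-client's denormalize internals), a depth-limited resolution may read a cached full result but never writes its own truncated result back:

```typescript
// Reads are always allowed; depth-truncated results are never written back.
const cache = new Map<string, { key: string; shallow: boolean }>();

function resolve(key: string, depthLimited: boolean) {
  const cached = cache.get(key);
  if (cached) return cached; // read: serve a full version if one exists
  const result = { key, shallow: depthLimited }; // stands in for (truncated) denormalization
  if (!depthLimited) cache.set(key, result); // never cache shallow copies
  return result;
}

resolve('Department:d1', true); // depth-limited: nothing cached
console.log(cache.has('Department:d1')); // false
const full = resolve('Department:d1', false); // full resolution: cached
console.log(resolve('Department:d1', true) === full); // true: reads the full version
```

The asymmetry is the point: a shallow copy produced deep in one subtree can never be served to a top-level consumer, while a full copy produced anywhere benefits every later depth-limited read.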
Usage
Both options can be combined on the same field. Whichever triggers first wins.
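For illustration only (the `FieldConfig` shape mirrors the PR's per-field options, but the `shouldStop` helper is hypothetical, not the library's API), "whichever triggers first wins" can be read as two independent stop conditions checked at every hop:

```typescript
// Hypothetical helper modeling the two independent stop conditions.
interface FieldConfig {
  detectCycles?: boolean;
  entityDepth?: number; // relative: hops allowed from this field
}

function shouldStop(
  config: FieldConfig,
  entityType: string,
  ancestorTypes: string[],
  hopsFromField: number,
): boolean {
  // Cycle condition: the type already appears in the ancestor path.
  if (config.detectCycles && ancestorTypes.includes(entityType)) return true;
  // Depth condition: the relative budget from this field is spent.
  if (config.entityDepth !== undefined && hopsFromField >= config.entityDepth)
    return true;
  return false;
}

const field: FieldConfig = { detectCycles: true, entityDepth: 3 };
// Budget not yet spent, but the type repeats in the path:
console.log(shouldStop(field, 'Organization', ['Organization'], 1)); // true
// No cycle, but the relative depth budget is exhausted:
console.log(shouldStop(field, 'Site', ['Organization'], 3)); // true
// Neither condition has triggered yet:
console.log(shouldStop(field, 'Site', ['Organization'], 2)); // false
```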
Benchmark
Merged store after loading multiple pages: 47,608 entities, 19,504 in one connected component.
Results compare `maxEntityDepth: 12` against `detectCycles`.

Compatibility

- The global safety net (`MAX_ENTITY_DEPTH = 64`) remains as a fallback
- `detectCycles` and `entityDepth` fire before the global safety net check
- `{ schema: S, detectCycles?: boolean, entityDepth?: number }` resolves to `Denormalize<S>`

Future consideration
The global safety net (`MAX_ENTITY_DEPTH = 64`) could be made configurable at the application level (e.g., via DataProvider) to allow different apps to set their own fallback limit without changing the source.