-
Notifications
You must be signed in to change notification settings - Fork 152
Implement auto-registering operators for live queries #944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Implement auto-registering operators for live queries #944
Conversation
This enables tree-shaking by having each operator register its own evaluator when imported, rather than relying on a monolithic switch statement. Key changes: - Add registry.ts with registerOperator/getOperatorEvaluator APIs - Create individual operator files (eq, gt, gte, lt, lte, and, or, not, in, like, ilike, upper, lower, length, concat, coalesce, add, subtract, multiply, divide, isNull, isUndefined) - Each operator file bundles builder function + evaluator + registration - Modify evaluators.ts to use registry lookup instead of switch - Update query/index.ts to export from new operator modules - Export compileExpressionInternal for operator modules to use The pattern: importing an operator causes its file to execute, which calls registerOperator, adding it to the registry. By query compile time, all operators in use are already registered.
🦋 Changeset detectedLatest commit: 0d37df5 The changes in this PR will be included in the next version bump. This PR includes changesets to release 12 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
More templates
@tanstack/angular-db
@tanstack/db
@tanstack/db-ivm
@tanstack/electric-db-collection
@tanstack/offline-transactions
@tanstack/powersync-db-collection
@tanstack/query-db-collection
@tanstack/react-db
@tanstack/rxdb-db-collection
@tanstack/solid-db
@tanstack/svelte-db
@tanstack/trailbase-db-collection
@tanstack/vue-db
commit: |
|
Size Change: +7.24 kB (+8.03%) 🔍 Total Size: 97.3 kB
ℹ️ View Unchanged
|
|
Size Change: 0 B Total Size: 3.35 kB ℹ️ View Unchanged
|
Export the registry API so users can create and register their own custom operators. Also adds a changeset for this patch.
- Export registerOperator, EvaluatorFactory, and CompiledExpression types - Add comprehensive tests for custom operator registration - Tests cover between, startsWith, isEmpty, modulo operators - Demonstrates full pattern: builder function + evaluator + registration
Extend the auto-registration pattern to aggregates (sum, count, avg, min, max): - Create aggregate-registry.ts with registerAggregate/getAggregateConfig - Split each aggregate into its own module with builder + auto-registration - Update group-by.ts to use registry lookup instead of switch statement - Export registerAggregate and types from public API - Add tests demonstrating custom aggregate registration This enables tree-shaking for aggregates and allows users to register custom aggregates like 'product', 'variance', etc.
- Remove eager imports from evaluators.ts and group-by.ts - Make functions.ts re-export from operator/aggregate modules - Add shared types.ts for type helpers preserving nullability - Update test files to import operators for direct IR testing Now operators/aggregates are only loaded when user imports them, enabling true tree-shaking. The compiler no longer pre-loads all evaluators - they register when their builder functions are imported.
Replace global registry pattern with embedded factories for true tree-shaking: - Func nodes now carry their evaluator factory directly - Aggregate nodes now carry their config (factory + valueTransform) directly - Remove registry.ts and aggregate-registry.ts files - Update all operators to pass factory as 3rd argument to Func - Update all aggregates to pass config as 3rd argument to Aggregate - Update internal code (optimizer, predicate-utils, expressions) to preserve factories when transforming Func nodes - Add array overloads to and() and or() for internal usage - Update tests to use builder functions instead of creating IR directly This design eliminates the need for side-effect imports and ensures only imported operators/aggregates are bundled.
3440d6d to
57b0eeb
Compare
…operators-016i3ACKCCt2WbkPRApMpqju # Conflicts: # packages/db/src/query/builder/functions.ts # packages/db/src/query/compiler/evaluators.ts # packages/db/src/query/compiler/group-by.ts # packages/db/src/query/index.ts # packages/db/src/query/ir.ts # packages/db/src/query/optimizer.ts # packages/db/src/query/predicate-utils.ts # packages/db/tests/collection-change-events.test.ts # packages/db/tests/query/compiler/basic.test.ts # packages/db/tests/query/compiler/evaluators.test.ts # packages/db/tests/query/compiler/select.test.ts
|
Reviewed with the help of @kevin-dp and Opus 4.5: I think the proposed (below) PR 944 Review: Comprehensive AnalysisSummaryThis PR refactors the operator and aggregate system to embed evaluator factories directly in the IR nodes ( What's Good ✅1. Self-Contained IR Nodes
2. Factory Preservation in IR Transforms
3. 3-Valued Logic Consistency 4. New Aggregates 5. Tests Issues to Address 🔧1. Duplicated
|
| Bundler | Pattern | Tree-shakes unused exports? |
|---|---|---|
| Rollup | export const gt = defineOperator(...) |
✅ Yes |
| Rollup | export function gt(...) { ... } |
✅ Yes |
| esbuild | export const gt = defineOperator(...) |
❌ No |
| esbuild | export function gt(...) { ... } |
✅ Yes |
| esbuild | export const gt = /*#__PURE__*/ (()=>...)() |
✅ Yes |
Analysis
- Rollup (used by Vite production builds, SvelteKit, etc.) correctly tree-shakes both patterns.
- esbuild (used by Vite dev builds) is conservative about top-level function calls like
defineOperator(...). It cannot prove they're side-effect-free, so it keeps them. - The
export functionpattern works universally because there's no top-level computation.
Recommendation
Built-in operators should use export function style (which the current PR already does). This ensures reliable tree-shaking across all bundlers.
Do NOT use defineOperator for built-in operators internally. While it works with Rollup, it can cause issues with esbuild-based bundlers. Reserve defineOperator as a public API for users creating custom operators—they'll typically define only the operators they need, so tree-shaking of their custom operators is less critical.
Design Alternatives Considered
Class-Based Approach
We explored making Func abstract and subclassable:
abstract class Operator<T> extends BaseExpression<T> {
abstract readonly name: string
abstract evaluate(compiledArgs: Array<CompiledExpression>): CompiledExpression
}
class GtOperator extends Operator<boolean> {
readonly name = 'gt'
evaluate([a, b]: Array<CompiledExpression>) {
return (data) => { /* ... */ }
}
}
export const gt = (left: any, right: any) =>
new GtOperator([toExpression(left), toExpression(right)])Pros:
- Each operator is a distinct type (
instanceof GtOperator) - More traditional OOP pattern
- Could have convenience base classes (
BinaryComparison,UnaryTransform, etc.)
Cons:
- More boilerplate per operator (class + export)
- Requires maintaining parallel class hierarchy
- Less idiomatic for a functional query builder API
Decision: Stay with the functional approach. It's more aligned with modern JS patterns, results in less code, and the defineOperator helper provides a clean public API without requiring users to understand class hierarchies.
Checklist for Merge
- Extract duplicated
isUnknownto shared module - Use shared types from
types.tsinstead of local definitions - Update outdated "auto-registers" comment in
functions.ts - Consider adding factory generator helpers (
comparison,booleanOp,transform,numeric) - Add
defineOperatoranddefineAggregateto public API - Document the custom operator/aggregate pattern in README or docs
- Ensure built-ins use
export functionpattern (already done ✅)
Verdict
Approve with minor changes. The core architecture is solid. The factory-on-node approach is the right design. The suggestions above are improvements to reduce duplication and formalize the public API, but the PR is fundamentally correct and can be merged after addressing the duplication issues.
|
@KyleAMathews I had my Cursor session open after the review, and have asked it to try implementing these changes. Will open in a bit with a PR stacked on this, you can decide if you want to fold it in. |
KyleAMathews
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review: PR #944
Overall Assessment: Approve ✅
This is a beautifully architected refactoring that brings the benefits of tree-shaking and extensibility to TanStack DB operators. Like the Articles of Faith encapsulating core principles, each operator file now encapsulates its complete implementation - builder function and evaluator factory together.
Summary
The PR refactors operators and aggregates from a monolithic switch statement to self-contained modules that embed their evaluator factories directly in IR nodes. This enables:
- True tree-shaking (only import what you use)
- Custom operators without modifying core code
- Co-located builder + evaluator code
Architecture Highlights
Before:
// evaluators.ts - 300+ line switch statement
switch (func.name) {
case 'eq': { /* evaluator logic */ }
case 'gt': { /* evaluator logic */ }
// ... dozens more
}After:
// packages/db/src/query/builder/operators/eq.ts
const eqEvaluatorFactory: EvaluatorFactory = (compiledArgs, isSingleRow) => {
// Evaluator logic lives with builder
}
export function eq(left, right) {
return new Func('eq', [...], eqEvaluatorFactory)
}Strengths
-
Significant bundle size reduction: Users who only use
eqandanddon't pay forlike,ilike,concat, etc. -
Excellent extensibility story: Custom operators are now trivial:
function between(value, min, max) { return new Func('between', [...], betweenFactory) }
-
New aggregates: Added
collect,minStr,maxStr- useful additions especially for timestamp handling. -
Co-location: Each operator file contains everything needed to understand that operator.
-
Comprehensive tests:
custom-operators.test.tsandcustom-aggregates.test.tsdemonstrate the extensibility patterns.
Technical Notes
-
valueTransformoptions: The aggregate configs supportnumeric,numericOrDate, andrawtransforms - good flexibility. -
Three-valued logic preserved: All operators maintain SQL-style null handling (
null && true === null, etc.) -
Backward compatibility: Barrel exports maintain the existing import paths.
Minor Suggestions
-
types.ts:67-119: The
ExtractTypeandAggregateReturnTypeutility types are well-designed. Consider adding JSDoc explaining the type inference logic. -
Documentation addition: The live-queries.md addition for
minStr,maxStr,collectis good. Consider adding a "Custom Operators" guide showing the extensibility pattern. -
Consider a validation pattern: For custom operators, it might be helpful to have a
validateFactoryhelper that checks the factory signature at development time.
Question
For the custom operator/aggregate patterns, is there a recommended way to type-check that the factory matches the expected signature at compile time? The EvaluatorFactory type ensures runtime compatibility, but compile-time validation would be even better.
Excellent refactoring that makes the codebase more modular and tree-shakable! 🌳✨
Summary
Refactors query operators and aggregates to embed their evaluator factories directly in IR nodes, eliminating the global registry. This enables true tree-shaking (only import what you use) and allows users to create custom operators/aggregates without modifying core code.
Root Cause
The previous architecture used a global registry in
evaluators.tswith a massive switch statement to look up operator implementations by name. This had several problems:Approach
Moved each operator's evaluator factory into its own file alongside its builder function. The factory is now embedded directly in the
FuncorAggregateIR node at construction time.Before:
After:
The IR nodes now carry their own factory:
Key Invariants
Func/Aggregatenode must have its factory attached at constructionNon-goals
Trade-offs
Alternative considered: Keep registry but make it dynamically populated
Alternative considered: Pass factory name as generic parameter
Chosen approach embeds the factory reference directly in the IR node, which:
Verification
pnpm test pnpm run lint:checkKey test files demonstrating the patterns:
packages/db/tests/query/compiler/custom-operators.test.ts- Custombetween,startsWithoperatorspackages/db/tests/query/compiler/custom-aggregates.test.ts- Customproduct,varianceaggregatesFiles Changed
New operator files (
packages/db/src/query/builder/operators/):eq.ts,gt.ts,gte.ts,lt.ts,lte.ts- Comparison operatorsand.ts,or.ts,not.ts- Boolean operatorsadd.ts,subtract.ts,multiply.ts,divide.ts- Math operatorslike.ts,ilike.ts,upper.ts,lower.ts,length.ts,concat.ts,coalesce.ts- String operatorsin.ts,isNull.ts,isUndefined.ts- Utility operatorstypes.ts,index.ts- Shared types and barrel exportNew aggregate files (
packages/db/src/query/builder/aggregates/):sum.ts,count.ts,avg.ts,min.ts,max.ts- Standard aggregatescollect.ts,minStr.ts,maxStr.ts- New aggregatesindex.ts- Barrel exportModified core files:
ir.ts- AddedEvaluatorFactory,AggregateConfigtypes and optional factory fields to IR nodesevaluators.ts- Removed 300+ lines of switch cases, now delegates to embedded factoriesfunctions.ts- Simplified to import from operator modulesgroup-by.ts- Updated to use embedded aggregate configs🤖 Generated with Claude Code