Conversation

@KyleAMathews
Collaborator

@KyleAMathews KyleAMathews commented Dec 1, 2025

Summary

Refactors query operators and aggregates to embed their evaluator factories directly in IR nodes, eliminating the global registry. This enables true tree-shaking (only import what you use) and allows users to create custom operators/aggregates without modifying core code.

Root Cause

The previous architecture used a global registry in evaluators.ts with a massive switch statement to look up operator implementations by name. This had several problems:

  • No tree-shaking: All operators were bundled regardless of usage
  • No extensibility: Custom operators required modifying the core switch statement
  • Indirect coupling: IR nodes only stored string names, requiring runtime lookups

Approach

Moved each operator's evaluator factory into its own file alongside its builder function. The factory is now embedded directly in the Func or Aggregate IR node at construction time.

Before:

// evaluators.ts had 300+ lines of switch cases
switch (func.name) {
  case 'eq': { /* evaluator logic */ }
  case 'gt': { /* evaluator logic */ }
  // ... dozens more
}

After:

// packages/db/src/query/builder/operators/eq.ts
function eqEvaluatorFactory(compiledArgs, isSingleRow) {
  // Evaluator logic lives with builder
}

export function eq(left, right) {
  return new Func('eq', [...], eqEvaluatorFactory)
}

The IR nodes now carry their own factory:

export class Func<T = any> {
  constructor(
    public name: string,
    public args: Array<BasicExpression>,
    public factory?: EvaluatorFactory, // NEW: self-contained
  ) {}
}
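
To make the delegation concrete, here is a minimal sketch of how a compile step might call the factory carried by a node instead of switching on its name. Everything except the `Func` shape and the `(compiledArgs, isSingleRow)` factory signature is an assumption: `LiteralNode`, `FieldRef`, `compileExpr`, and the simplified type aliases are hypothetical stand-ins, not the code in this PR.

```typescript
// Simplified stand-ins for the IR types; the real definitions live in ir.ts.
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// Hypothetical leaf nodes, reduced to a constant and a single-field reference.
class LiteralNode {
  constructor(public value: unknown) {}
}
class FieldRef {
  constructor(public field: string) {}
}
class Func {
  constructor(
    public name: string,
    public args: Array<LiteralNode | FieldRef | Func>,
    public factory?: EvaluatorFactory,
  ) {}
}

// Hypothetical compile step: no switch on func.name, the node supplies its own factory.
function compileExpr(
  expr: LiteralNode | FieldRef | Func,
  isSingleRow = true,
): CompiledExpression {
  if (expr instanceof LiteralNode) {
    const constant = expr.value
    return () => constant
  }
  if (expr instanceof FieldRef) {
    const field = expr.field
    return (data) => data[field]
  }
  if (!expr.factory) {
    throw new Error(`Func "${expr.name}" has no evaluator factory`)
  }
  const compiledArgs = expr.args.map((arg) => compileExpr(arg, isSingleRow))
  return expr.factory(compiledArgs, isSingleRow)
}

// eq with SQL-style null handling: an unknown operand yields null.
const eqFactory: EvaluatorFactory = ([left, right]) => (data) => {
  const a = left!(data)
  const b = right!(data)
  if (a === null || a === undefined || b === null || b === undefined) return null
  return a === b
}

const isAdmin = compileExpr(
  new Func('eq', [new FieldRef('role'), new LiteralNode('admin')], eqFactory),
)
```

In this sketch, `isAdmin({ role: 'admin' })` evaluates to `true`, a row without a `role` field evaluates to `null`, and compiling a `Func` that has no factory throws.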

Key Invariants

  1. Factory is present: Every Func/Aggregate node must have its factory attached at construction
  2. Same semantics: All operators maintain identical 3-valued logic (null handling) as before
  3. Self-contained: Each operator file exports only its builder function; the evaluator factory stays private to the module and travels with the IR node the builder creates
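
For readers unfamiliar with the 3-valued logic mentioned in invariant 2, the following is a small sketch of the SQL-style truth rules operating on plain values rather than compiled expressions. The names `Tri`, `and3`, `or3`, and `not3` are hypothetical and exist only for this illustration; `isUnknown` mirrors the helper of the same name in the operator files but is written here as a type guard.

```typescript
// null stands for UNKNOWN; undefined inputs are treated the same way.
type Tri = boolean | null

function isUnknown(value: unknown): value is null | undefined {
  return value === null || value === undefined
}

function and3(...operands: Array<Tri | undefined>): Tri {
  let sawUnknown = false
  for (const operand of operands) {
    if (operand === false) return false // a definite false decides AND
    if (isUnknown(operand)) sawUnknown = true
  }
  return sawUnknown ? null : true
}

function or3(...operands: Array<Tri | undefined>): Tri {
  let sawUnknown = false
  for (const operand of operands) {
    if (operand === true) return true // a definite true decides OR
    if (isUnknown(operand)) sawUnknown = true
  }
  return sawUnknown ? null : false
}

function not3(operand: Tri | undefined): Tri {
  return isUnknown(operand) ? null : !operand
}
```

Under these rules `and3(true, null)` is `null` while `and3(false, null)` is `false`, and `or3(true, null)` is `true` while `or3(false, null)` is `null`.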

Non-goals

  • Did not change the query compilation pipeline itself
  • Did not modify how aggregates participate in GROUP BY evaluation
  • Kept backward compatibility for existing queries using the barrel exports

Trade-offs

Alternative considered: Keep registry but make it dynamically populated

  • Rejected because it still requires side-effect imports and doesn't enable custom operators easily

Alternative considered: Pass factory name as generic parameter

  • Rejected because TypeScript erases generics at runtime

Chosen approach embeds the factory reference directly in the IR node, which:

  • Enables true tree-shaking (bundler sees direct imports)
  • Makes custom operators trivial (just pass your factory to the constructor)
  • Keeps all related code co-located (builder + evaluator in one file)
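
As an illustration of the "just pass your factory to the constructor" point, here is a rough sketch of a custom `startsWith` operator. It assumes a simplified `Func` whose arguments are already accessor functions; in the real IR the arguments are `BasicExpression` nodes, and the `field` and `literal` helpers below are hypothetical.

```typescript
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// ASSUMPTION: args are accessor functions here, standing in for BasicExpression nodes.
class Func {
  constructor(
    public name: string,
    public args: Array<CompiledExpression>,
    public factory?: EvaluatorFactory,
  ) {}
}

// Evaluator lives next to the builder and is not exported on its own.
const startsWithFactory: EvaluatorFactory = ([text, prefix]) => (data) => {
  const t = text!(data)
  const p = prefix!(data)
  if (t === null || t === undefined || p === null || p === undefined) return null
  return String(t).startsWith(String(p))
}

// Builder: the only thing a user-defined operator module needs to expose.
function startsWith(text: CompiledExpression, prefix: CompiledExpression): Func {
  return new Func('startsWith', [text, prefix], startsWithFactory)
}

// Hypothetical helpers for building accessor arguments.
const field = (key: string): CompiledExpression => (data) => data[key]
const literal = (value: unknown): CompiledExpression => () => value

const emailNode = startsWith(field('email'), literal('admin@'))
const isAdminEmail = emailNode.factory!(emailNode.args, true)
```

No core file is touched: the node created by `startsWith` already carries everything needed to evaluate it.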

Verification

pnpm test
pnpm run lint:check

Key test files demonstrating the patterns:

  • packages/db/tests/query/compiler/custom-operators.test.ts - Custom between, startsWith operators
  • packages/db/tests/query/compiler/custom-aggregates.test.ts - Custom product, variance aggregates

Files Changed

New operator files (packages/db/src/query/builder/operators/):

  • eq.ts, gt.ts, gte.ts, lt.ts, lte.ts - Comparison operators
  • and.ts, or.ts, not.ts - Boolean operators
  • add.ts, subtract.ts, multiply.ts, divide.ts - Math operators
  • like.ts, ilike.ts, upper.ts, lower.ts, length.ts, concat.ts, coalesce.ts - String operators
  • in.ts, isNull.ts, isUndefined.ts - Utility operators
  • types.ts, index.ts - Shared types and barrel export

New aggregate files (packages/db/src/query/builder/aggregates/):

  • sum.ts, count.ts, avg.ts, min.ts, max.ts - Standard aggregates
  • collect.ts, minStr.ts, maxStr.ts - New aggregates
  • index.ts - Barrel export

Modified core files:

  • ir.ts - Added EvaluatorFactory, AggregateConfig types and optional factory fields to IR nodes
  • evaluators.ts - Removed 300+ lines of switch cases, now delegates to embedded factories
  • functions.ts - Simplified to import from operator modules
  • group-by.ts - Updated to use embedded aggregate configs

🤖 Generated with Claude Code

This enables tree-shaking by having each operator register its own
evaluator when imported, rather than relying on a monolithic switch
statement.

Key changes:
- Add registry.ts with registerOperator/getOperatorEvaluator APIs
- Create individual operator files (eq, gt, gte, lt, lte, and, or, not,
  in, like, ilike, upper, lower, length, concat, coalesce, add,
  subtract, multiply, divide, isNull, isUndefined)
- Each operator file bundles builder function + evaluator + registration
- Modify evaluators.ts to use registry lookup instead of switch
- Update query/index.ts to export from new operator modules
- Export compileExpressionInternal for operator modules to use

The pattern: importing an operator causes its file to execute, which
calls registerOperator, adding it to the registry. By query compile
time, all operators in use are already registered.
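
The auto-registration pattern described in this commit message can be sketched roughly as follows. Only the names `registerOperator` and `getOperatorEvaluator` come from the message; their signatures, the `Map`-based registry, and the simplified types are assumptions made for illustration.

```typescript
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// Module-level registry: shared mutable state that operator modules write to on import.
const operatorRegistry = new Map<string, EvaluatorFactory>()

function registerOperator(operatorName: string, factory: EvaluatorFactory): void {
  operatorRegistry.set(operatorName, factory)
}

function getOperatorEvaluator(operatorName: string): EvaluatorFactory {
  const factory = operatorRegistry.get(operatorName)
  if (!factory) throw new Error(`Unknown operator: ${operatorName}`)
  return factory
}

// What an operator module such as gt.ts would run as an import side effect.
registerOperator('gt', ([left, right]) => (data) => {
  const a = left!(data)
  const b = right!(data)
  if (a === null || a === undefined || b === null || b === undefined) return null
  return (a as number) > (b as number)
})

// The compiler only knows the operator's name and has to look the factory up.
const ageOverLimit = getOperatorEvaluator('gt')(
  [(data) => data['age'], () => 18],
  true,
)
```

The lookup fails at compile time of the query for any operator whose module was never imported, which is why this design depends on side-effect imports.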
@changeset-bot

changeset-bot bot commented Dec 1, 2025

🦋 Changeset detected

Latest commit: 0d37df5

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
Name Type
@tanstack/db Patch
@tanstack/angular-db Patch
@tanstack/electric-db-collection Patch
@tanstack/offline-transactions Patch
@tanstack/powersync-db-collection Patch
@tanstack/query-db-collection Patch
@tanstack/react-db Patch
@tanstack/rxdb-db-collection Patch
@tanstack/solid-db Patch
@tanstack/svelte-db Patch
@tanstack/trailbase-db-collection Patch
@tanstack/vue-db Patch

@KyleAMathews KyleAMathews marked this pull request as draft December 1, 2025 23:23
@pkg-pr-new

pkg-pr-new bot commented Dec 1, 2025

More templates

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@944

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@944

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@944

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@944

@tanstack/offline-transactions

npm i https://pkg.pr.new/@tanstack/offline-transactions@944

@tanstack/powersync-db-collection

npm i https://pkg.pr.new/@tanstack/powersync-db-collection@944

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@944

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@944

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@944

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@944

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@944

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@944

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@944

commit: 0d37df5

@github-actions
Contributor

github-actions bot commented Dec 1, 2025

Size Change: +7.24 kB (+8.03%) 🔍

Total Size: 97.3 kB

Filename Size Change
./packages/db/dist/esm/collection/subscription.js 3.64 kB +20 B (+0.55%)
./packages/db/dist/esm/index.js 2.96 kB +269 B (+9.99%) ⚠️
./packages/db/dist/esm/query/builder/aggregates/avg.js 251 B +251 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/collect.js 246 B +246 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/count.js 244 B +244 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/max.js 256 B +256 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/maxStr.js 274 B +274 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/min.js 255 B +255 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/minStr.js 273 B +273 B (new file) 🆕
./packages/db/dist/esm/query/builder/aggregates/sum.js 251 B +251 B (new file) 🆕
./packages/db/dist/esm/query/builder/functions.js 308 B -425 B (-57.98%) 🏆
./packages/db/dist/esm/query/builder/operators/add.js 247 B +247 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/and.js 336 B +336 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/coalesce.js 248 B +248 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/concat.js 290 B +290 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/divide.js 259 B +259 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/eq.js 307 B +307 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/gt.js 263 B +263 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/gte.js 266 B +266 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/ilike.js 287 B +287 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/in.js 279 B +279 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/isNull.js 211 B +211 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/isUndefined.js 222 B +222 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/length.js 258 B +258 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/like.js 431 B +431 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/lower.js 250 B +250 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/lt.js 262 B +262 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/lte.js 265 B +265 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/multiply.js 250 B +250 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/not.js 234 B +234 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/or.js 332 B +332 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/subtract.js 249 B +249 B (new file) 🆕
./packages/db/dist/esm/query/builder/operators/upper.js 249 B +249 B (new file) 🆕
./packages/db/dist/esm/query/compiler/evaluators.js 591 B -763 B (-56.35%) 🏆
./packages/db/dist/esm/query/compiler/expressions.js 437 B +7 B (+1.63%)
./packages/db/dist/esm/query/compiler/group-by.js 1.76 kB -38 B (-2.11%)
./packages/db/dist/esm/query/compiler/joins.js 2.01 kB +4 B (+0.2%)
./packages/db/dist/esm/query/ir.js 691 B +18 B (+2.67%)
./packages/db/dist/esm/query/optimizer.js 2.57 kB +14 B (+0.55%)
./packages/db/dist/esm/query/predicate-utils.js 3.03 kB +62 B (+2.09%)
./packages/db/dist/esm/utils/cursor.js 481 B +24 B (+5.25%) 🔍
ℹ️ View Unchanged
Filename Size
./packages/db/dist/esm/collection/change-events.js 1.39 kB
./packages/db/dist/esm/collection/changes.js 1.17 kB
./packages/db/dist/esm/collection/events.js 388 B
./packages/db/dist/esm/collection/index.js 3.32 kB
./packages/db/dist/esm/collection/indexes.js 1.1 kB
./packages/db/dist/esm/collection/lifecycle.js 1.67 kB
./packages/db/dist/esm/collection/mutations.js 2.34 kB
./packages/db/dist/esm/collection/state.js 3.46 kB
./packages/db/dist/esm/collection/sync.js 2.38 kB
./packages/db/dist/esm/deferred.js 207 B
./packages/db/dist/esm/errors.js 4.49 kB
./packages/db/dist/esm/event-emitter.js 748 B
./packages/db/dist/esm/indexes/auto-index.js 742 B
./packages/db/dist/esm/indexes/base-index.js 766 B
./packages/db/dist/esm/indexes/btree-index.js 1.93 kB
./packages/db/dist/esm/indexes/lazy-index.js 1.1 kB
./packages/db/dist/esm/indexes/reverse-index.js 513 B
./packages/db/dist/esm/local-only.js 837 B
./packages/db/dist/esm/local-storage.js 2.1 kB
./packages/db/dist/esm/optimistic-action.js 359 B
./packages/db/dist/esm/paced-mutations.js 496 B
./packages/db/dist/esm/proxy.js 3.75 kB
./packages/db/dist/esm/query/builder/index.js 4.01 kB
./packages/db/dist/esm/query/builder/ref-proxy.js 917 B
./packages/db/dist/esm/query/compiler/index.js 1.96 kB
./packages/db/dist/esm/query/compiler/order-by.js 1.46 kB
./packages/db/dist/esm/query/compiler/select.js 1.07 kB
./packages/db/dist/esm/query/expression-helpers.js 1.43 kB
./packages/db/dist/esm/query/live-query-collection.js 360 B
./packages/db/dist/esm/query/live/collection-config-builder.js 5.33 kB
./packages/db/dist/esm/query/live/collection-registry.js 264 B
./packages/db/dist/esm/query/live/collection-subscriber.js 1.9 kB
./packages/db/dist/esm/query/live/internal.js 130 B
./packages/db/dist/esm/query/subset-dedupe.js 921 B
./packages/db/dist/esm/scheduler.js 1.3 kB
./packages/db/dist/esm/SortedMap.js 1.3 kB
./packages/db/dist/esm/strategies/debounceStrategy.js 247 B
./packages/db/dist/esm/strategies/queueStrategy.js 428 B
./packages/db/dist/esm/strategies/throttleStrategy.js 246 B
./packages/db/dist/esm/transactions.js 2.9 kB
./packages/db/dist/esm/utils.js 924 B
./packages/db/dist/esm/utils/browser-polyfills.js 304 B
./packages/db/dist/esm/utils/btree.js 5.61 kB
./packages/db/dist/esm/utils/comparison.js 852 B
./packages/db/dist/esm/utils/index-optimization.js 1.51 kB
./packages/db/dist/esm/utils/type-guards.js 157 B

compressed-size-action::db-package-size

@github-actions
Contributor

github-actions bot commented Dec 1, 2025

Size Change: 0 B

Total Size: 3.35 kB

ℹ️ View Unchanged
Filename Size
./packages/react-db/dist/esm/index.js 225 B
./packages/react-db/dist/esm/useLiveInfiniteQuery.js 1.17 kB
./packages/react-db/dist/esm/useLiveQuery.js 1.12 kB
./packages/react-db/dist/esm/useLiveSuspenseQuery.js 431 B
./packages/react-db/dist/esm/usePacedMutations.js 401 B

compressed-size-action::react-db-package-size

Export the registry API so users can create and register their own
custom operators. Also adds a changeset for this patch.
- Export registerOperator, EvaluatorFactory, and CompiledExpression types
- Add comprehensive tests for custom operator registration
- Tests cover between, startsWith, isEmpty, modulo operators
- Demonstrates full pattern: builder function + evaluator + registration
Extend the auto-registration pattern to aggregates (sum, count, avg, min, max):

- Create aggregate-registry.ts with registerAggregate/getAggregateConfig
- Split each aggregate into its own module with builder + auto-registration
- Update group-by.ts to use registry lookup instead of switch statement
- Export registerAggregate and types from public API
- Add tests demonstrating custom aggregate registration

This enables tree-shaking for aggregates and allows users to register
custom aggregates like 'product', 'variance', etc.
- Remove eager imports from evaluators.ts and group-by.ts
- Make functions.ts re-export from operator/aggregate modules
- Add shared types.ts for type helpers preserving nullability
- Update test files to import operators for direct IR testing

Now operators/aggregates are only loaded when user imports them,
enabling true tree-shaking. The compiler no longer pre-loads all
evaluators - they register when their builder functions are imported.
@KyleAMathews KyleAMathews marked this pull request as ready for review December 2, 2025 15:30
Replace global registry pattern with embedded factories for true tree-shaking:
- Func nodes now carry their evaluator factory directly
- Aggregate nodes now carry their config (factory + valueTransform) directly
- Remove registry.ts and aggregate-registry.ts files
- Update all operators to pass factory as 3rd argument to Func
- Update all aggregates to pass config as 3rd argument to Aggregate
- Update internal code (optimizer, predicate-utils, expressions) to
  preserve factories when transforming Func nodes
- Add array overloads to and() and or() for internal usage
- Update tests to use builder functions instead of creating IR directly

This design eliminates the need for side-effect imports and ensures
only imported operators/aggregates are bundled.
@KyleAMathews KyleAMathews force-pushed the claude/auto-register-operators-016i3ACKCCt2WbkPRApMpqju branch from 3440d6d to 57b0eeb Compare December 3, 2025 14:52
KyleAMathews and others added 3 commits January 7, 2026 08:35
…operators-016i3ACKCCt2WbkPRApMpqju

# Conflicts:
#	packages/db/src/query/builder/functions.ts
#	packages/db/src/query/compiler/evaluators.ts
#	packages/db/src/query/compiler/group-by.ts
#	packages/db/src/query/index.ts
#	packages/db/src/query/ir.ts
#	packages/db/src/query/optimizer.ts
#	packages/db/src/query/predicate-utils.ts
#	packages/db/tests/collection-change-events.test.ts
#	packages/db/tests/query/compiler/basic.test.ts
#	packages/db/tests/query/compiler/evaluators.test.ts
#	packages/db/tests/query/compiler/select.test.ts
@samwillis
Collaborator

Reviewed with the help of @kevin-dp and Opus 4.5:

I think the proposed (below) defineOperator and defineAggregate should be the documented public API. We can then always change the internal details of how the IR is built from an operator without leaking them through the public API.


PR 944 Review: Comprehensive Analysis

Summary

This PR refactors the operator and aggregate system to embed evaluator factories directly in the IR nodes (Func.factory, Aggregate.config). This eliminates the need for a global registry, making nodes self-contained and enabling tree-shaking. The core architecture is sound, but there are opportunities to reduce duplication and formalize the public API for custom operators.


What's Good ✅

1. Self-Contained IR Nodes
The key insight—storing the evaluator factory on the Func node and the aggregate config on the Aggregate node—is correct. This means:

  • No global registry with side effects at import time
  • Each node carries everything needed for compilation
  • IR transformations can preserve behavior by copying the factory/config

2. Factory Preservation in IR Transforms
The PR correctly updates places where IR nodes are reconstructed to preserve the factory:

  • normalizeExpressionPaths() - preserves whereClause.factory
  • replaceAggregatesByRefs() - preserves funcExpr.factory
  • combineWithAnd() - uses andBuilder() instead of raw new Func()
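
The point about preserving factories can be sketched with a small transform that rebuilds `Func` nodes. The `stripAlias` function and the `PathRef` class are hypothetical and much simpler than `normalizeExpressionPaths()` or `replaceAggregatesByRefs()`; the sketch only shows the step of passing the existing factory to the rebuilt node.

```typescript
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// Hypothetical, simplified IR: a reference is a path of property names.
class PathRef {
  constructor(public path: Array<string>) {}
}
class Func {
  constructor(
    public name: string,
    public args: Array<PathRef | Func>,
    public factory?: EvaluatorFactory,
  ) {}
}

// Strip a leading alias from every reference, rebuilding Func nodes on the way.
function stripAlias(expr: PathRef | Func, alias: string): PathRef | Func {
  if (expr instanceof PathRef) {
    return expr.path[0] === alias ? new PathRef(expr.path.slice(1)) : expr
  }
  const rewrittenArgs = expr.args.map((arg) => stripAlias(arg, alias))
  // Omitting the third argument here would silently drop the evaluator.
  return new Func(expr.name, rewrittenArgs, expr.factory)
}

const gtFactory: EvaluatorFactory = ([left, right]) => (data) =>
  (left!(data) as number) > (right!(data) as number)

const original = new Func(
  'gt',
  [new PathRef(['user', 'age']), new PathRef(['user', 'minAge'])],
  gtFactory,
)
const rewritten = stripAlias(original, 'user')
```

After the transform, `rewritten` is a new `Func` whose references no longer start with `user` and whose `factory` is the same function object as on `original`.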

3. 3-Valued Logic Consistency
Operators properly implement SQL-like 3-valued logic (true/false/null) throughout.

4. New Aggregates
collect, minStr, maxStr are useful additions with proper documentation.

5. Tests
The custom operator and aggregate tests (custom-operators.test.ts, custom-aggregates.test.ts) demonstrate the pattern clearly.


Issues to Address 🔧

1. Duplicated isUnknown Helper Function

The isUnknown function is duplicated across 10+ files:

  • and.ts, or.ts, not.ts, in.ts, ilike.ts, eq.ts, gt.ts, gte.ts, lt.ts, lte.ts, like.ts

function isUnknown(value: any): boolean {
  return value === null || value === undefined
}

Recommendation: Extract to a shared module (e.g., operators/shared.ts or operators/utils.ts).

2. Duplicated Type Definitions

Several files define local types that already exist in types.ts:

  • isNull.ts, isUndefined.ts, coalesce.ts, concat.ts, not.ts, multiply.ts, subtract.ts, divide.ts

// Duplicated in multiple files
type ExpressionLike = BasicExpression | any

Recommendation: Import from ./types.js instead of redefining locally.

3. Inconsistent BinaryNumericReturnType

Files multiply.ts, subtract.ts, divide.ts define their own simplified BinaryNumericReturnType:

type BinaryNumericReturnType<_T1, _T2> = BasicExpression<number | undefined | null>

But types.ts has a more sophisticated version that properly preserves nullability. Use the shared type.

4. Outdated Comment in functions.ts

// Re-export all operators from their individual modules
// Each module auto-registers its evaluator when imported  <-- This is outdated

There's no "auto-registration" anymore—each node is self-contained. Update the comment.


Proposed Enhancements 📝

After extensive discussion, we propose two additions to this PR:

Proposal 1: Factory Generator Helpers

Create higher-order functions that generate evaluator factories for common patterns. This eliminates duplication while keeping the architecture intact.

New file: operators/factories.ts

import type { CompiledExpression, EvaluatorFactory } from '../../ir.js'

/** Check if value is null/undefined (UNKNOWN in 3-valued logic) */
export const isUnknown = (v: any): boolean => v === null || v === undefined

/**
 * Creates a factory for binary comparison operators (eq, gt, lt, gte, lte)
 * Handles 3-valued logic automatically.
 */
export function comparison(
  compare: (a: any, b: any) => boolean,
): EvaluatorFactory {
  return ([argA, argB]) => (data) => {
    const a = argA!(data)
    const b = argB!(data)
    if (isUnknown(a) || isUnknown(b)) return null
    return compare(a, b)
  }
}

/**
 * Creates a factory for variadic boolean operators (and, or)
 */
export function booleanOp(config: {
  shortCircuit: boolean  // false for AND, true for OR
  default: boolean       // true for AND, false for OR
}): EvaluatorFactory {
  return (args) => (data) => {
    let hasUnknown = false
    for (const arg of args) {
      const result = arg(data)
      if (result === config.shortCircuit) return config.shortCircuit
      if (isUnknown(result)) hasUnknown = true
    }
    return hasUnknown ? null : config.default
  }
}

/**
 * Creates a factory for unary transforms (upper, lower, not, isNull, etc.)
 */
export function transform<R>(fn: (value: any) => R): EvaluatorFactory {
  return ([arg]) => (data) => fn(arg!(data))
}

/**
 * Creates a factory for binary numeric operators (add, subtract, multiply, divide)
 */
export function numeric(
  operation: (a: number, b: number) => number | null,
  defaultValue: number = 0,
): EvaluatorFactory {
  return ([argA, argB]) => (data) => {
    const a = argA!(data) ?? defaultValue
    const b = argB!(data) ?? defaultValue
    return operation(a, b)
  }
}

Simplified built-in operators (using helpers internally):

// gt.ts - reduced from ~75 lines to ~10 lines
import { Func } from '../../ir.js'
import { toExpression } from '../ref-proxy.js'
import { comparison } from './factories.js'
import type { BasicExpression } from '../../ir.js'
import type { ComparisonOperand } from './types.js'

const gtFactory = comparison((a, b) => a > b)

export function gt<T>(left: ComparisonOperand<T>, right: ComparisonOperand<T>): BasicExpression<boolean> {
  return new Func(`gt`, [toExpression(left), toExpression(right)], gtFactory)
}

// and.ts - reduced from ~85 lines to ~15 lines
import { Func } from '../../ir.js'
import { toExpression } from '../ref-proxy.js'
import { booleanOp } from './factories.js'
import type { BasicExpression } from '../../ir.js'
import type { ExpressionLike } from './types.js'

const andFactory = booleanOp({ shortCircuit: false, default: true })

export function and(...args: Array<ExpressionLike>): BasicExpression<boolean>
export function and(args: Array<ExpressionLike>): BasicExpression<boolean>
export function and(...args: any[]): BasicExpression<boolean> {
  const exprs = args.length === 1 && Array.isArray(args[0]) ? args[0] : args
  return new Func(`and`, exprs.map(toExpression), andFactory)
}

Proposal 2: Public defineOperator API

Provide a clean public API for users to create custom operators, abstracting the internal Func constructor.

// Exported from @tanstack/db
export interface OperatorConfig {
  name: string
  evaluate: EvaluatorFactory
}

/**
 * Define a custom operator.
 * 
 * @example
 * ```typescript
 * import { defineOperator, isUnknown } from '@tanstack/db'
 * 
 * const between = defineOperator<boolean>({
 *   name: 'between',
 *   evaluate: ([value, min, max]) => (data) => {
 *     const v = value!(data)
 *     if (isUnknown(v)) return null
 *     return v >= min!(data) && v <= max!(data)
 *   }
 * })
 * 
 * // Usage
 * query.where(({ user }) => between(user.age, 18, 65))
 * ```
 */
export function defineOperator<T = any>(
  config: OperatorConfig,
): (...args: Array<any>) => Func<T> {
  const { name, evaluate } = config
  return (...args) => new Func(name, args.map(toExpression), evaluate)
}
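
A self-contained way to try the proposed helper is sketched below. It assumes stand-ins for the pieces that live elsewhere in the package: `Func` takes accessor functions as arguments, and `toExpression` here simply wraps non-function values as constants, which is not how the real `toExpression` works. Unlike the JSDoc example, this `between` also returns null when a bound is unknown.

```typescript
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// ASSUMPTION: args are accessor functions, standing in for BasicExpression nodes.
class Func {
  constructor(
    public name: string,
    public args: Array<CompiledExpression>,
    public factory?: EvaluatorFactory,
  ) {}
}

// Stand-in for toExpression: functions are kept, anything else becomes a constant.
const toExpression = (arg: unknown): CompiledExpression =>
  typeof arg === 'function' ? (arg as CompiledExpression) : () => arg

const isUnknown = (v: unknown): boolean => v === null || v === undefined

function defineOperator(config: { name: string; evaluate: EvaluatorFactory }) {
  return (...args: Array<unknown>): Func =>
    new Func(config.name, args.map(toExpression), config.evaluate)
}

const between = defineOperator({
  name: 'between',
  evaluate: ([value, min, max]) => (data) => {
    const v = value!(data)
    const lo = min!(data)
    const hi = max!(data)
    if (isUnknown(v) || isUnknown(lo) || isUnknown(hi)) return null
    return (v as number) >= (lo as number) && (v as number) <= (hi as number)
  },
})

const ageField: CompiledExpression = (data) => data['age']
const workingAgeNode = between(ageField, 18, 65)
const isWorkingAge = workingAgeNode.factory!(workingAgeNode.args, true)
```

The caller never sees `new Func(...)`, which is the property the review wants from a public API: the node construction can change later without breaking user-defined operators.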

Similarly for aggregates:

export function defineAggregate<T = any>(
  config: AggregateConfig & { name: string },
): (arg: any) => Aggregate<T> {
  const { name, ...aggregateConfig } = config
  return (arg) => new Aggregate(name, [toExpression(arg)], aggregateConfig)
}
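
A runnable sketch of how `defineAggregate` might be used for a custom `product` aggregate follows. The PR only states that an aggregate config pairs a factory with a `valueTransform`; the reducer shape used for `factory` below, the `SketchAggregateConfig` name, and the simplified `Aggregate` class are all assumptions invented for this example.

```typescript
type RowData = Record<string, unknown>
type ValueExtractor = (row: RowData) => unknown

// ASSUMPTION: the factory shape is invented; the real AggregateConfig may differ.
interface SketchAggregateConfig {
  factory: (extract: ValueExtractor) => (rows: Array<RowData>) => unknown
  valueTransform: 'numeric' | 'numericOrDate' | 'raw'
}

// Simplified stand-in for the Aggregate IR node: it carries its own config.
class Aggregate {
  constructor(
    public name: string,
    public args: Array<ValueExtractor>,
    public config?: SketchAggregateConfig,
  ) {}
}

function defineAggregate(config: SketchAggregateConfig & { name: string }) {
  const { name: aggregateName, ...aggregateConfig } = config
  return (arg: ValueExtractor): Aggregate =>
    new Aggregate(aggregateName, [arg], aggregateConfig)
}

// Multiplies the extracted values; missing values are treated as 1 in this sketch.
const product = defineAggregate({
  name: 'product',
  valueTransform: 'numeric',
  factory: (extract) => (rows) =>
    rows.reduce((acc, row) => acc * Number(extract(row) ?? 1), 1),
})

const productOfFactor = product((row) => row['factor'])
const runProduct = productOfFactor.config!.factory(productOfFactor.args[0]!)
```

With this sketch, running the aggregate over rows with factors 2, 3, and 4 yields 24, and an empty input yields 1.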

Tree-Shaking Analysis 🌳

We ran extensive tests to verify tree-shaking behavior across bundlers.

Test Setup

Created a file with multiple operators in the same module:

export const gt = defineOperator({ name: 'gt', evaluate: comparison((a, b) => a > b) })
export const lt = defineOperator({ name: 'lt', evaluate: comparison((a, b) => a < b) })
export const eq = defineOperator({ name: 'eq', evaluate: comparison((a, b) => a === b) })
// etc.

Then imported only gt:

import { gt } from './operators.js'
console.log(gt(1, 2))

Results

| Bundler | Pattern | Tree-shakes unused exports? |
| --- | --- | --- |
| Rollup | export const gt = defineOperator(...) | ✅ Yes |
| Rollup | export function gt(...) { ... } | ✅ Yes |
| esbuild | export const gt = defineOperator(...) | ❌ No |
| esbuild | export function gt(...) { ... } | ✅ Yes |
| esbuild | export const gt = /*#__PURE__*/ (()=>...)() | ✅ Yes |

Analysis

  • Rollup (used by Vite production builds, SvelteKit, etc.) correctly tree-shakes both patterns.
  • esbuild (used by Vite dev builds) is conservative about top-level function calls like defineOperator(...). It cannot prove they're side-effect-free, so it keeps them.
  • The export function pattern works universally because there's no top-level computation.
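
For the annotated variant in the last table row, a sketch of what the module might look like is shown below. Both Rollup and esbuild treat a `/*#__PURE__*/` comment before a call expression as a hint that the call can be dropped when its result is unused. The `defineOperator`, `comparison`, `Func`, and `toExpression` definitions here are simplified stand-ins, and the `export` keyword is left off so the snippet runs as a plain script.

```typescript
type RowData = Record<string, unknown>
type CompiledExpression = (data: RowData) => unknown
type EvaluatorFactory = (
  compiledArgs: Array<CompiledExpression>,
  isSingleRow: boolean,
) => CompiledExpression

// Simplified stand-ins for the proposed helpers.
class Func {
  constructor(
    public name: string,
    public args: Array<CompiledExpression>,
    public factory?: EvaluatorFactory,
  ) {}
}

const toExpression = (arg: unknown): CompiledExpression =>
  typeof arg === 'function' ? (arg as CompiledExpression) : () => arg

function comparison(compare: (a: number, b: number) => boolean): EvaluatorFactory {
  return ([left, right]) => (data) => {
    const a = left!(data)
    const b = right!(data)
    if (a === null || a === undefined || b === null || b === undefined) return null
    return compare(a as number, b as number)
  }
}

function defineOperator(config: { name: string; evaluate: EvaluatorFactory }) {
  return (...args: Array<unknown>): Func =>
    new Func(config.name, args.map(toExpression), config.evaluate)
}

// In a real module these would be `export const`. Wrapping the calls in an
// annotated, immediately invoked function keeps all top-level work inside one
// expression that the bundler is told it may remove if the binding is unused.
const gt = /*#__PURE__*/ (() =>
  defineOperator({ name: 'gt', evaluate: comparison((a, b) => a > b) }))()
const lt = /*#__PURE__*/ (() =>
  defineOperator({ name: 'lt', evaluate: comparison((a, b) => a < b) }))()
```

The wrapper matters because annotating only the outer `defineOperator(...)` call would still leave the nested `comparison(...)` call as a top-level expression the bundler has to assume is impure.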

Recommendation

Built-in operators should use export function style (which the current PR already does). This ensures reliable tree-shaking across all bundlers.

Do NOT use defineOperator for built-in operators internally. While it works with Rollup, it can cause issues with esbuild-based bundlers. Reserve defineOperator as a public API for users creating custom operators—they'll typically define only the operators they need, so tree-shaking of their custom operators is less critical.


Design Alternatives Considered

Class-Based Approach

We explored making Func abstract and subclassable:

abstract class Operator<T> extends BaseExpression<T> {
  abstract readonly name: string
  abstract evaluate(compiledArgs: Array<CompiledExpression>): CompiledExpression
}

class GtOperator extends Operator<boolean> {
  readonly name = 'gt'
  evaluate([a, b]: Array<CompiledExpression>) {
    return (data) => { /* ... */ }
  }
}

export const gt = (left: any, right: any) => 
  new GtOperator([toExpression(left), toExpression(right)])

Pros:

  • Each operator is a distinct type (instanceof GtOperator)
  • More traditional OOP pattern
  • Could have convenience base classes (BinaryComparison, UnaryTransform, etc.)

Cons:

  • More boilerplate per operator (class + export)
  • Requires maintaining parallel class hierarchy
  • Less idiomatic for a functional query builder API

Decision: Stay with the functional approach. It's more aligned with modern JS patterns, results in less code, and the defineOperator helper provides a clean public API without requiring users to understand class hierarchies.


Checklist for Merge

  • Extract duplicated isUnknown to shared module
  • Use shared types from types.ts instead of local definitions
  • Update outdated "auto-registers" comment in functions.ts
  • Consider adding factory generator helpers (comparison, booleanOp, transform, numeric)
  • Add defineOperator and defineAggregate to public API
  • Document the custom operator/aggregate pattern in README or docs
  • Ensure built-ins use export function pattern (already done ✅)

Verdict

Approve with minor changes. The core architecture is solid. The factory-on-node approach is the right design. The suggestions above are improvements to reduce duplication and formalize the public API, but the PR is fundamentally correct and can be merged after addressing the duplication issues.

@samwillis
Collaborator

@KyleAMathews I had my Cursor session open after the review and have asked it to try implementing these changes. I will open a PR stacked on this one in a bit; you can decide whether to fold it in.

Collaborator Author

@KyleAMathews KyleAMathews left a comment

Code Review: PR #944

Overall Assessment: Approve

This is a beautifully architected refactoring that brings the benefits of tree-shaking and extensibility to TanStack DB operators. Like the Articles of Faith encapsulating core principles, each operator file now encapsulates its complete implementation - builder function and evaluator factory together.

Summary

The PR refactors operators and aggregates from a monolithic switch statement to self-contained modules that embed their evaluator factories directly in IR nodes. This enables:

  • True tree-shaking (only import what you use)
  • Custom operators without modifying core code
  • Co-located builder + evaluator code

Architecture Highlights

Before:

// evaluators.ts - 300+ line switch statement
switch (func.name) {
  case 'eq': { /* evaluator logic */ }
  case 'gt': { /* evaluator logic */ }
  // ... dozens more
}

After:

// packages/db/src/query/builder/operators/eq.ts
const eqEvaluatorFactory: EvaluatorFactory = (compiledArgs, isSingleRow) => {
  // Evaluator logic lives with builder
}

export function eq(left, right) {
  return new Func('eq', [...], eqEvaluatorFactory)
}

Strengths

  1. Significant bundle size reduction: Users who only use eq and and don't pay for like, ilike, concat, etc.

  2. Excellent extensibility story: Custom operators are now trivial:

    function between(value, min, max) {
      return new Func('between', [...], betweenFactory)
    }

  3. New aggregates: Added collect, minStr, maxStr - useful additions especially for timestamp handling.

  4. Co-location: Each operator file contains everything needed to understand that operator.

  5. Comprehensive tests: custom-operators.test.ts and custom-aggregates.test.ts demonstrate the extensibility patterns.

Technical Notes

  1. valueTransform options: The aggregate configs support numeric, numericOrDate, and raw transforms - good flexibility.

  2. Three-valued logic preserved: All operators maintain SQL-style null handling (NULL AND TRUE evaluates to NULL, etc.)

  3. Backward compatibility: Barrel exports maintain the existing import paths.

Minor Suggestions

  1. types.ts:67-119: The ExtractType and AggregateReturnType utility types are well-designed. Consider adding JSDoc explaining the type inference logic.

  2. Documentation addition: The live-queries.md addition for minStr, maxStr, collect is good. Consider adding a "Custom Operators" guide showing the extensibility pattern.

  3. Consider a validation pattern: For custom operators, it might be helpful to have a validateFactory helper that checks the factory signature at development time.

Question

For the custom operator/aggregate patterns, is there a recommended way to type-check that the factory matches the expected signature at compile time? The EvaluatorFactory type ensures runtime compatibility, but compile-time validation would be even better.

Excellent refactoring that makes the codebase more modular and tree-shakable! 🌳✨

Labels

None yet

Projects

Status: Ready for review

5 participants