fix(extension-arktype-json, sql-runtime): correctness gaps from ADR 208 landing#418
fix(extension-arktype-json, sql-runtime): correctness gaps from ADR 208 landing#418
Conversation
📝 WalkthroughWalkthroughPreserve JSON-schema validation error codes on encode/decode paths; widen arktype-json codec to accept pre-parsed JsonValue, add a metadata-only emit codec instance to the runtime, enforce runtime type-params round-trip, mark some parameterized codecs as encode-params-independent, and document partial no-emit typing for parameterized columns. ChangesArktype JSON, runtime codec registry, and JSON-schema error pass-through
Sequence DiagramsequenceDiagram
participant Client as Client
participant Runtime as Runtime
participant Codec as Codec
participant DB as PostgresDB
Client->>Runtime: encodeParam(value)
Runtime->>Codec: codec.encode(wireValue)
alt codec throws RUNTIME.JSON_SCHEMA_VALIDATION_FAILED
Codec-->>Runtime: throw { code: RUNTIME.JSON_SCHEMA_VALIDATION_FAILED }
Runtime-->>Client: rethrow unchanged
else codec throws pre-wrapped RUNTIME.ENCODE_FAILED
Codec-->>Runtime: throw { code: RUNTIME.ENCODE_FAILED, details }
Runtime-->>Client: rethrow unchanged
else success
Codec-->>Runtime: encodedParam
Runtime->>DB: send param
DB-->>Runtime: stored
Runtime-->>Client: success
end
Client->>Runtime: decodeRow(row)
Runtime->>Codec: codec.decode(wire)
alt codec throws RUNTIME.JSON_SCHEMA_VALIDATION_FAILED
Codec-->>Runtime: throw { code: RUNTIME.JSON_SCHEMA_VALIDATION_FAILED }
Runtime-->>Client: rethrow unchanged
else codec throws pre-wrapped RUNTIME.DECODE_FAILED
Codec-->>Runtime: throw { code: RUNTIME.DECODE_FAILED, details }
Runtime-->>Client: rethrow unchanged
else success
Codec-->>Runtime: decodedValue
Runtime-->>Client: return value
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
@prisma-next/mongo-runtime
@prisma-next/family-mongo
@prisma-next/sql-runtime
@prisma-next/family-sql
@prisma-next/extension-arktype-json
@prisma-next/middleware-telemetry
@prisma-next/mongo
@prisma-next/extension-paradedb
@prisma-next/extension-pgvector
@prisma-next/postgres
@prisma-next/sql-orm-client
@prisma-next/sqlite
@prisma-next/target-mongo
@prisma-next/adapter-mongo
@prisma-next/driver-mongo
@prisma-next/contract
@prisma-next/utils
@prisma-next/config
@prisma-next/errors
@prisma-next/framework-components
@prisma-next/operations
@prisma-next/ts-render
@prisma-next/contract-authoring
@prisma-next/ids
@prisma-next/psl-parser
@prisma-next/psl-printer
@prisma-next/cli
@prisma-next/emitter
@prisma-next/migration-tools
prisma-next
@prisma-next/vite-plugin-contract-emit
@prisma-next/mongo-codec
@prisma-next/mongo-contract
@prisma-next/mongo-value
@prisma-next/mongo-contract-psl
@prisma-next/mongo-contract-ts
@prisma-next/mongo-emitter
@prisma-next/mongo-schema-ir
@prisma-next/mongo-query-ast
@prisma-next/mongo-orm
@prisma-next/mongo-query-builder
@prisma-next/mongo-lowering
@prisma-next/mongo-wire
@prisma-next/sql-contract
@prisma-next/sql-errors
@prisma-next/sql-operations
@prisma-next/sql-schema-ir
@prisma-next/sql-contract-psl
@prisma-next/sql-contract-ts
@prisma-next/sql-contract-emitter
@prisma-next/sql-lane-query-builder
@prisma-next/sql-relational-core
@prisma-next/sql-builder
@prisma-next/target-postgres
@prisma-next/target-sqlite
@prisma-next/adapter-postgres
@prisma-next/adapter-sqlite
@prisma-next/driver-postgres
@prisma-next/driver-sqlite
commit: |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/architecture docs/adrs/ADR 208 - Higher-order codecs for parameterized types.md (1)
116-120: ⚡ Quick winTrim the resolver mechanics from the ADR.
This section is drifting into implementation flow (
typeReftraversal,CodecInstanceContextapplication,Jsextraction). The ADR reads cleaner if it stays at the architectural contract level and leaves the step-by-step resolver behavior to the subsystem docs.
As per coding guidelines: ADRs in prisma/prisma-next should document architectural design decisions and the key constraints/assumptions only; avoid implementation algorithms and step-by-step derivation/migration rules.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/architecture` docs/adrs/ADR 208 - Higher-order codecs for parameterized types.md around lines 116 - 120, Remove the low-level resolver steps from the ADR and keep it at the architectural-contract level: delete or reword the paragraph that describes following typeRef through storage.types, synthetically applying CodecInstanceContext, and extracting the Js parameter; instead state that FieldOutputType<Definition, Model, Field> will be able to resolve parameterized column types via the ColumnTypeDescriptor's authoring-only type slot (type: (ctx: CodecInstanceContext) => Codec<…, Js>) and that for non-parameterized columns it will fall back to CodecTypes[codecId]['output'], while nullability is preserved—leave the step-by-step traversal and application details to subsystem documentation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/architecture` docs/adrs/ADR 208 - Higher-order codecs for parameterized
types.md:
- Around line 116-120: Remove the low-level resolver steps from the ADR and keep
it at the architectural-contract level: delete or reword the paragraph that
describes following typeRef through storage.types, synthetically applying
CodecInstanceContext, and extracting the Js parameter; instead state that
FieldOutputType<Definition, Model, Field> will be able to resolve parameterized
column types via the ColumnTypeDescriptor's authoring-only type slot (type:
(ctx: CodecInstanceContext) => Codec<…, Js>) and that for non-parameterized
columns it will fall back to CodecTypes[codecId]['output'], while nullability is
preserved—leave the step-by-step traversal and application details to subsystem
documentation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 98367176-41fe-4e0e-9a95-53b519c6db44
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (11)
docs/architecture docs/adrs/ADR 208 - Higher-order codecs for parameterized types.mddocs/architecture docs/subsystems/9. No-Emit Workflow.mdpackages/2-sql/5-runtime/src/codecs/decoding.tspackages/2-sql/5-runtime/test/json-schema-validation.test.tspackages/3-extensions/arktype-json/README.mdpackages/3-extensions/arktype-json/package.jsonpackages/3-extensions/arktype-json/src/core/arktype-json-codec.tspackages/3-extensions/arktype-json/src/exports/runtime.tspackages/3-extensions/arktype-json/test/arktype-json-codec.test-d.tspackages/3-extensions/arktype-json/test/arktype-json-codec.test.tspackages/3-extensions/arktype-json/test/extension-descriptors.test.ts
…amily The previous fix typed `arktypeJsonEmitCodec` against the SQL `Codec` extension solely to gain the optional `meta` field. That coupled the family-agnostic `codecInstances` slot to the SQL family — a future non-SQL arktype variant (Mongo, etc.) would need a parallel stub or a refactor of which interface owns `meta`. Drop the SQL `Codec` import. Type the stub as the framework `Codec` intersected with a co-located `SqlNativeTypeMeta` structural shape that mirrors what the SQL renderer reads. The renderer's existing `as SqlCodec | undefined` cast at the lookup boundary consumes `meta` structurally regardless of the source declaration, so behavior is unchanged. Refs #418 self-review (subideal item #1).
…ugh codec.encode Symmetric to the decode-side guard added in 0574f04. Per-library JSON- with-schema codecs validate inside `encode` (same shape as their `decode` validation per ADR 208 § Case J) and throw the stable code directly. Without this rethrow, `wrapEncodeFailure` re-wraps it as `RUNTIME.ENCODE_FAILED` and the documented stable-code contract breaks on the write side too. Adds a regression test mirroring the decode-side one: a stub codec whose `encode` throws `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`, asserted to pass through `encodeParam` unchanged. Refs #418 self-review (encode-side gap surfaced while writing the e2e schema-rejection test).
The `Embedding.profile` column has been in the e2e fixture since the
arktype-json landing but no test ever wrote or read it — which is why
the runtime cast lookup gap and the JSON.parse decode mismatch slipped
through. Add direct coverage:
1. Happy round-trip — `Embedding.create({ profile: { ... } })` followed
by a where-eq read. Exercises the full encode path (schema validate
→ JSON.stringify → SQL renderer cast lookup → driver write) and the
full decode path (driver read with pre-parsed jsonb → schema validate
→ ORM result).
2. Schema rejection on write — confirms `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`
surfaces unchanged through the encode-side rethrow guard.
Decode-side rejection is already covered by the unit test in
sql-runtime/test/json-schema-validation.test.ts; not duplicated here.
Refs #418 self-review (subideal item #2).
…k fires The README's Compatibility section explained the round-trip invariant and the version pin but didn't say when the runtime defense actually runs. Add one sentence: the factory runs at execution-context construction time (typically `runtime.connect()`), so a stale-but-shape- valid contract.json fails fast at startup rather than on first query or — worse — by silently returning wrong types. Resolves the self-review concern about hard-fail timing: there is no earlier hook to use without architectural changes (the round-trip check requires materializing the schema, which neither paramsSchema validation nor `prisma-next verify` does today). Refs #418 self-review (subideal item #3).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/2-sql/5-runtime/src/codecs/encoding.ts`:
- Around line 126-140: The catch block in codec.encode currently only preserves
RUNTIME.JSON_SCHEMA_VALIDATION_FAILED; update it to also pass-through existing
RUNTIME.ENCODE_FAILED envelopes so they are not re-wrapped. Concretely, in the
catch for codec.encode, check if isRuntimeError(error) and if error.code is
either 'RUNTIME.JSON_SCHEMA_VALIDATION_FAILED' or 'RUNTIME.ENCODE_FAILED' then
rethrow the error; otherwise call wrapEncodeFailure(error, metadata, paramIndex,
codec.id). This uses the existing symbols codec.encode, isRuntimeError, and
wrapEncodeFailure to locate and modify the logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 2514a56f-f827-4262-9205-60a0ad1599aa
📒 Files selected for processing (5)
packages/2-sql/5-runtime/src/codecs/encoding.tspackages/2-sql/5-runtime/test/json-schema-validation.test.tspackages/3-extensions/arktype-json/README.mdpackages/3-extensions/arktype-json/src/core/arktype-json-codec.tstest/e2e/framework/test/arktype-json.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/3-extensions/arktype-json/README.md
…r flag The runtime registry rejects `forCodecId(codecId)` lookups when multiple distinct resolved instances share a parameterized codec id (e.g. two `vector(N)` columns of different lengths, or two `arktypeJson(...)` columns with different schemas). Today's encode dispatch carries only the codec id (no `(table, column)` ref — TML-2357 § AC-5), so without the rejection a DSL-param could silently bind to the wrong instance. But for codecs whose `encode` is structurally equivalent across all resolved instances (pgvector, arktype-json with schema-independent encode), the rejection is over-conservative: any registered instance encodes correctly, so picking one is safe. Without an opt-out, contracts with multiple parameterized columns of the same codec id break at encode time before `ParamRef.refs` lands. Add `encodeIsParamsIndependent?: boolean` to `CodecDescriptor`. The runtime registry skips the ambiguity-marker for descriptors that opt in. Decode dispatch still uses `forColumn(table, column)` to get the instance-specific schema, so the per-column type check remains correct. The flag becomes vestigial when AC-5 lands. Adds two unit tests in `contract-codec-registry.test.ts`: the default ambiguity rejection (proves the guard still fires) and the `encodeIsParamsIndependent` opt-out path (proves multi-instance contracts work end-to-end at the registry level). Refs Codex review of #418 (F01 enabling change).
…rate pre-parsed JSON primitives Two correctness gaps Codex's review of #418 surfaced: F01 — encode is now params-dependent. Pre-existing on origin/main (`8edf6cde0`): `serializeToJsonSafe` validated against the per-instance schema. With multiple `arktypeJson(...)` columns the resolved codecs differ by reference, the registry's ambiguity guard fires under any codec-id-only encode lookup, and ordinary creates/updates/filters throw `RUNTIME.TYPE_PARAMS_INVALID` before encoding can run. Drop the encode-side `validateSchema(json)` call. Validation runs on decode only, matching the JSON-validator philosophy: payloads can come from any source (this writer, a previous schema version, a manual SQL `INSERT`), so validate when reading. Pair with the new `encodeIsParamsIndependent` descriptor flag (previous commit) — `arktypeJsonCodec` opts in. F02 — pre-parsed JSON string primitives fail decode. The `pg` driver returns `jsonb` cells as already-parsed JS values by default, including primitives. For an arktype `type('string')` schema, a stored `"alice"` arrives as the bare JS string `alice` and `JSON.parse(wire)` throws SyntaxError before the schema gets a chance. Replace the binary `typeof === 'string' ? JSON.parse : passthrough` branch with a `parseWireValue` helper that tries `JSON.parse` first and falls back to the original wire on parse failure. Covers raw-JSON-text drivers, pre-parsed objects/arrays/numbers/bools/null, and pre-parsed JSON string primitives. Documented edge case: a stored JSON string `"123"` arrives pre-parsed as `123`, parses to the number `123` — collision matches what `pgJsonbCodec` does today and is intrinsic to the driver-mode ambiguity. Test changes: - arktype-json-codec.test.ts: encode-validation expectations updated to the round-trip-only contract (existing tests where the runtime check was pinning encode-side schema rejection are reframed as encode normalization tests). Three new F02 tests: pre-parsed string primitive happy path, raw-JSON-text wire still parses, pre-parsed primitive fails schema validation. - e2e arktype-json.test.ts: schema-rejection test rewritten — write with a schema-violating payload commits, the subsequent read throws `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`. Mirrors the new validate-on- read contract. Refs Codex review of #418 (F01, F02).
Three sections still reflected the pre-PR-418 state of the world: - "Type fidelity end-to-end" claimed `vector(1536)` and `arktypeJson(schema)` resolve in the no-emit path. They don't — the resolver wiring is tracked under TML-2357 and the partial- status block earlier in the ADR already says so. Update the consequence to match. - "Encode-side `forCodecId` legacy fallback" said arktype-json's encode is `JSON.stringify` (true) and that the fallback works because encode is per-instance-stateless w.r.t. params. The PR-418 review surfaced that this required keeping encode schema-independent, which is now an explicit design choice. Add the `encodeIsParamsIndependent` flag mechanism that lets multi- column contracts work pre-AC-5. - "Per-library JSON extensions" section said the descriptor's factory "validates internally in `decode`". Make the asymmetry explicit: validation runs on decode only; encode is `JSON.stringify` with no schema check, matching the validate-on-read philosophy. - "Resolves" bullet for TML-2229 claimed both no-emit AND emit paths resolve correctly. Only emit ships in #402; no-emit waits for TML-2357. Split the bullet to reflect the partial state. These updates close Codex's F03. The previously-added partial-status block at § "No-emit type resolution" stays — this commit just brings the rest of the ADR into agreement with it. Refs Codex review of #418 (F03).
…r descriptor Pgvector's wire format `[v1,v2,...]` is dimension-independent. Today the parameterized factory dodges the registry's reference-based ambiguity check by returning `sharedVectorCodec` for every params, so two `vector(N)` columns of different lengths happen to share one codec instance. That keeps multi-vector contracts working without the new `encodeIsParamsIndependent` opt-out — but only by accident of the factory being a constant. The comment on `vectorFactory` already foreshadows a refactor that would close over `params` (e.g. capping wire length to declared dimension). Such a refactor would produce reference-distinct instances and trip the registry's ambiguity guard, breaking multi-vector contracts silently. Declare `encodeIsParamsIndependent: true` on the parameterized vector descriptor to pin the encode-equivalence guarantee at the descriptor level. Future implementation changes either preserve the invariant or remove the flag — the registry will tell you which. Adds a regression test in `manifest.test.ts` asserting the flag is declared. Refs Codex review of #418 (encodeIsParamsIndependent applied beyond arktype-json).
Postgres `pg` returns `jsonb` cells as already-parsed JS values by default. The arktype-json codec was hardcoded to `JSON.parse(wire)` and would SyntaxError on every read where the driver pre-parsed. Mirror `pgJsonbCodec`'s `string | JsonValue` tolerance so selects against arktypeJson columns work end-to-end. Widens `ArktypeJsonCodec`'s `TWire` from `string` to `string | JsonValue` to match the runtime tolerance. Adds two test cases covering both wire shapes for `decode`. Refs #402 review (P1 #2).
… lookup The arktype-json runtime descriptor did not expose `types.codecTypes.codecInstances`, so the postgres adapter's `extractCodecLookup` (called at execution time, not control time) had no entry for `arktype/json@1`. The first query touching an arktypeJson column failed in `renderTypedParam` with "assembled codec lookup has no entry" before encode/decode dispatch ever ran. pgvector's runtime descriptor already follows the correct pattern. Two changes: 1. `arktypeJsonRuntimeDescriptor.types.codecTypes.codecInstances` surfaces `arktypeJsonEmitCodec` so the runtime cast lookup resolves. 2. `arktypeJsonEmitCodec` now carries `meta.db.sql.postgres.nativeType = 'jsonb'`. The framework `Codec` type doesn't expose `meta`; switch the declaration to the SQL `Codec` extension which does. Without this, `renderTypedParam` would emit `$N` instead of `$N::jsonb` — and `jsonb` is deliberately excluded from `POSTGRES_INFERRABLE_NATIVE_TYPES`, so the cast is required for filter expressions. Renames the codec from "emit-only shim" to "metadata-only stub" to reflect that it now serves both emit-path renderOutputType lookup AND runtime cast policy. Encode/decode are still sentinels that throw if invoked — runtime materialization goes through the unified descriptor map's `factory(params)(ctx)`. Adds three regression assertions in `extension-descriptors.test.ts` covering: presence in `codecInstances`, resolution through `extractCodecLookup`, and the `jsonb` native-type metadata field. Refs #402 review (P1 #1).
…ugh codec.decode ADR 208 § Case J says per-library JSON-with-schema codecs (e.g. `arktype/json@1`) validate inside `decode` and surface the stable `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED` code unchanged. The runtime did not honor that contract: `decodeField` caught all `codec.decode` errors and rewrapped them as `RUNTIME.DECODE_FAILED`. The rethrow guard for the schema-validation code only existed for the legacy post-decode `validateJsonValue` path. Add the same guard to the codec-decode catch arm. Adds a regression test in `json-schema-validation.test.ts` that registers a stub codec whose `decode` throws `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED` and asserts the code passes through `decodeRow` unchanged. Refs #402 review (P2 #3).
…-implemented ADR 208 § "No-emit type resolution" and the No-Emit Workflow subsystem described `FieldOutputType` as following `typeRef` and applying the column's authoring-time `type` slot to produce `Vector<1536>` and arktype- schema-shaped types in no-emit mode. The resolver isn't wired up: `@prisma-next/sql-contract-ts`'s `FieldOutputType` resolves through `CodecTypesFromDefinition[codecId]['output']` only, and the arktype-json codec-types map declares the `arktype/json@1` `output` as `unknown`. So authors who skip `pnpm emit` get codec-base types, not the parameterized refinements that emit produces. Mark the resolver as not-yet-implemented in the ADR, the no-emit subsystem, and the arktype-json README. Reference TML-2357 for tracked follow-up. The eventual target shape stays in the docs as the design intent — what changed is the status, not the design. Refs #402 review (P2 #4).
…ent round-trip stability The codec's emit-path output is derived from `schema.expression`. Runtime rehydration via `ark.schema(typeParams.jsonIr)` reproduces that string only as long as arktype's IR-to-expression rendering doesn't change. A minor or major arktype bump can therefore stale-render existing `contract.json` blobs without a corresponding re-emit. Pin the dependency to `~2.1.29` (patch range only) and add a "Compatibility" section to the README explaining the invariant and what consumers should do when they bump arktype outside this range. Refs #402 review (other concerns — arktype version pinning).
Previously, when the rehydrated schema's `expression` didn't match `typeParams.expression`, the factory emitted a `console.warn` and kept going. The branch was wrapped in `c8 ignore`, which made it untested. Two problems with the warning approach. First, by the time the runtime materializes the codec, the bad `contract.d.ts` has already been emitted and shipped — a runtime warning fires after the wrong types are in consumer code. Second, the expression-vs-rehydrated round-trip is exactly the invariant the package's arktype version pin protects. A silent warning is the wrong response when the invariant breaks. Throw `RUNTIME.TYPE_PARAMS_INVALID` instead, with a message that points the caller at re-running `pnpm emit`. Removes the `c8 ignore` so the branch is testable; adds two tests (one for divergence, one for the matching happy path). Refs #402 review (other concerns — expression divergence handler).
…amily The previous fix typed `arktypeJsonEmitCodec` against the SQL `Codec` extension solely to gain the optional `meta` field. That coupled the family-agnostic `codecInstances` slot to the SQL family — a future non-SQL arktype variant (Mongo, etc.) would need a parallel stub or a refactor of which interface owns `meta`. Drop the SQL `Codec` import. Type the stub as the framework `Codec` intersected with a co-located `SqlNativeTypeMeta` structural shape that mirrors what the SQL renderer reads. The renderer's existing `as SqlCodec | undefined` cast at the lookup boundary consumes `meta` structurally regardless of the source declaration, so behavior is unchanged. Refs #418 self-review (subideal item #1).
…ugh codec.encode Symmetric to the decode-side guard added in 0574f04. Per-library JSON- with-schema codecs validate inside `encode` (same shape as their `decode` validation per ADR 208 § Case J) and throw the stable code directly. Without this rethrow, `wrapEncodeFailure` re-wraps it as `RUNTIME.ENCODE_FAILED` and the documented stable-code contract breaks on the write side too. Adds a regression test mirroring the decode-side one: a stub codec whose `encode` throws `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`, asserted to pass through `encodeParam` unchanged. Refs #418 self-review (encode-side gap surfaced while writing the e2e schema-rejection test).
The `Embedding.profile` column has been in the e2e fixture since the
arktype-json landing but no test ever wrote or read it — which is why
the runtime cast lookup gap and the JSON.parse decode mismatch slipped
through. Add direct coverage:
1. Happy round-trip — `Embedding.create({ profile: { ... } })` followed
by a where-eq read. Exercises the full encode path (schema validate
→ JSON.stringify → SQL renderer cast lookup → driver write) and the
full decode path (driver read with pre-parsed jsonb → schema validate
→ ORM result).
2. Schema rejection on write — confirms `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`
surfaces unchanged through the encode-side rethrow guard.
Decode-side rejection is already covered by the unit test in
sql-runtime/test/json-schema-validation.test.ts; not duplicated here.
Refs #418 self-review (subideal item #2).
…k fires The README's Compatibility section explained the round-trip invariant and the version pin but didn't say when the runtime defense actually runs. Add one sentence: the factory runs at execution-context construction time (typically `runtime.connect()`), so a stale-but-shape- valid contract.json fails fast at startup rather than on first query or — worse — by silently returning wrong types. Resolves the self-review concern about hard-fail timing: there is no earlier hook to use without architectural changes (the round-trip check requires materializing the schema, which neither paramsSchema validation nor `prisma-next verify` does today). Refs #418 self-review (subideal item #3).
…r flag The runtime registry rejects `forCodecId(codecId)` lookups when multiple distinct resolved instances share a parameterized codec id (e.g. two `vector(N)` columns of different lengths, or two `arktypeJson(...)` columns with different schemas). Today's encode dispatch carries only the codec id (no `(table, column)` ref — TML-2357 § AC-5), so without the rejection a DSL-param could silently bind to the wrong instance. But for codecs whose `encode` is structurally equivalent across all resolved instances (pgvector, arktype-json with schema-independent encode), the rejection is over-conservative: any registered instance encodes correctly, so picking one is safe. Without an opt-out, contracts with multiple parameterized columns of the same codec id break at encode time before `ParamRef.refs` lands. Add `encodeIsParamsIndependent?: boolean` to `CodecDescriptor`. The runtime registry skips the ambiguity-marker for descriptors that opt in. Decode dispatch still uses `forColumn(table, column)` to get the instance-specific schema, so the per-column type check remains correct. The flag becomes vestigial when AC-5 lands. Adds two unit tests in `contract-codec-registry.test.ts`: the default ambiguity rejection (proves the guard still fires) and the `encodeIsParamsIndependent` opt-out path (proves multi-instance contracts work end-to-end at the registry level). Refs Codex review of #418 (F01 enabling change).
…rate pre-parsed JSON primitives Two correctness gaps Codex's review of #418 surfaced: F01 — encode is now params-dependent. Pre-existing on origin/main (`8edf6cde0`): `serializeToJsonSafe` validated against the per-instance schema. With multiple `arktypeJson(...)` columns the resolved codecs differ by reference, the registry's ambiguity guard fires under any codec-id-only encode lookup, and ordinary creates/updates/filters throw `RUNTIME.TYPE_PARAMS_INVALID` before encoding can run. Drop the encode-side `validateSchema(json)` call. Validation runs on decode only, matching the JSON-validator philosophy: payloads can come from any source (this writer, a previous schema version, a manual SQL `INSERT`), so validate when reading. Pair with the new `encodeIsParamsIndependent` descriptor flag (previous commit) — `arktypeJsonCodec` opts in. F02 — pre-parsed JSON string primitives fail decode. The `pg` driver returns `jsonb` cells as already-parsed JS values by default, including primitives. For an arktype `type('string')` schema, a stored `"alice"` arrives as the bare JS string `alice` and `JSON.parse(wire)` throws SyntaxError before the schema gets a chance. Replace the binary `typeof === 'string' ? JSON.parse : passthrough` branch with a `parseWireValue` helper that tries `JSON.parse` first and falls back to the original wire on parse failure. Covers raw-JSON-text drivers, pre-parsed objects/arrays/numbers/bools/null, and pre-parsed JSON string primitives. Documented edge case: a stored JSON string `"123"` arrives pre-parsed as `123`, parses to the number `123` — collision matches what `pgJsonbCodec` does today and is intrinsic to the driver-mode ambiguity. Test changes: - arktype-json-codec.test.ts: encode-validation expectations updated to the round-trip-only contract (existing tests where the runtime check was pinning encode-side schema rejection are reframed as encode normalization tests). Three new F02 tests: pre-parsed string primitive happy path, raw-JSON-text wire still parses, pre-parsed primitive fails schema validation. - e2e arktype-json.test.ts: schema-rejection test rewritten — write with a schema-violating payload commits, the subsequent read throws `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`. Mirrors the new validate-on- read contract. Refs Codex review of #418 (F01, F02).
Three sections still reflected the pre-PR-418 state of the world: - "Type fidelity end-to-end" claimed `vector(1536)` and `arktypeJson(schema)` resolve in the no-emit path. They don't — the resolver wiring is tracked under TML-2357 and the partial- status block earlier in the ADR already says so. Update the consequence to match. - "Encode-side `forCodecId` legacy fallback" said arktype-json's encode is `JSON.stringify` (true) and that the fallback works because encode is per-instance-stateless w.r.t. params. The PR-418 review surfaced that this required keeping encode schema-independent, which is now an explicit design choice. Add the `encodeIsParamsIndependent` flag mechanism that lets multi- column contracts work pre-AC-5. - "Per-library JSON extensions" section said the descriptor's factory "validates internally in `decode`". Make the asymmetry explicit: validation runs on decode only; encode is `JSON.stringify` with no schema check, matching the validate-on-read philosophy. - "Resolves" bullet for TML-2229 claimed both no-emit AND emit paths resolve correctly. Only emit ships in #402; no-emit waits for TML-2357. Split the bullet to reflect the partial state. These updates close Codex's F03. The previously-added partial-status block at § "No-emit type resolution" stays — this commit just brings the rest of the ADR into agreement with it. Refs Codex review of #418 (F03).
…r descriptor Pgvector's wire format `[v1,v2,...]` is dimension-independent. Today the parameterized factory dodges the registry's reference-based ambiguity check by returning `sharedVectorCodec` for every params, so two `vector(N)` columns of different lengths happen to share one codec instance. That keeps multi-vector contracts working without the new `encodeIsParamsIndependent` opt-out — but only by accident of the factory being a constant. The comment on `vectorFactory` already foreshadows a refactor that would close over `params` (e.g. capping wire length to declared dimension). Such a refactor would produce reference-distinct instances and trip the registry's ambiguity guard, breaking multi-vector contracts silently. Declare `encodeIsParamsIndependent: true` on the parameterized vector descriptor to pin the encode-equivalence guarantee at the descriptor level. Future implementation changes either preserve the invariant or remove the flag — the registry will tell you which. Adds a regression test in `manifest.test.ts` asserting the flag is declared. Refs Codex review of #418 (encodeIsParamsIndependent applied beyond arktype-json).
e5ba9f5 to
192a4ed
Compare
The "writes succeed, reads enforce" test failed in CI: I missed that `db.orm.Embedding.create` returns a decoded row via `RETURNING`, so schema enforcement happens during create itself rather than on a subsequent read. There's no single ORM call that cleanly demonstrates the validate-on-read contract end-to-end without raw-SQL bypass infrastructure the fixture doesn't expose. Drop the failing case. Decode-side schema rejection is already covered at: - arktype-json unit tests (decode rejects pre-parsed payloads that violate the schema, decode rejects pre-parsed primitives that violate the schema) - sql-runtime json-schema-validation tests (runtime preserves RUNTIME.JSON_SCHEMA_VALIDATION_FAILED thrown from codec.decode without rewrapping) The happy-path e2e round-trip stays as the single end-to-end smoke test; it exercises the full encode → cast lookup → driver write → driver read (pre-parsed jsonb) → decode pipeline that originally shipped broken in #402.
…gh unchanged `encodeParams` and `decodeRow` already documented "no double wrap" behavior for codec-stamped runtime envelopes, but the catch blocks inside `encodeParamValue` and `decodeField` only special-cased `RUNTIME.JSON_SCHEMA_VALIDATION_FAILED`. Anything else flowed through `wrapEncodeFailure` / `wrapDecodeFailure`, including envelopes a codec body had already stamped — dropping the author-supplied `details`/`cause` contract and producing a misleading double-wrap. Extend both guards to also pass `RUNTIME.ENCODE_FAILED` / `RUNTIME.DECODE_FAILED` through unchanged. The code now matches the existing doc. Adds two regression tests pinning the pass-through: a codec whose `encode` throws a stamped `RUNTIME.ENCODE_FAILED` (with custom `details.codecAuthorContext`) reaches `encodeParam` unchanged, and the symmetric case for `decodeRow`. Refs PR #418 review (CodeRabbit r3181136080).
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e/framework/test/arktype-json.test.ts (1)
2-2: ⚡ Quick winPrefer
patheovernode:pathfor path manipulation.
dirnameandresolveshould be imported frompathe(drop-in replacement with identical API) rather thannode:path. Note:node:urlon line 3 stays —fileURLToPathhas nopatheequivalent.♻️ Proposed change
import { readFile } from 'node:fs/promises'; -import { dirname, resolve } from 'node:path'; +import { dirname, resolve } from 'pathe'; import { fileURLToPath } from 'node:url';As per coding guidelines: "
**/*.{ts,tsx,mts,cts}: Usepatheinstead ofnode:pathfor path manipulation in TypeScript files... especially when code reads files/fixtures from disk."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/framework/test/arktype-json.test.ts` at line 2, Replace imports of dirname and resolve from 'node:path' with imports from 'pathe': update the import statement that currently pulls dirname and resolve from 'node:path' so it instead imports those symbols from 'pathe' (leave fileURLToPath from 'node:url' unchanged); ensure any usages of dirname and resolve remain the same since pathe is a drop-in replacement.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e/framework/test/arktype-json.test.ts`:
- Line 2: Replace imports of dirname and resolve from 'node:path' with imports
from 'pathe': update the import statement that currently pulls dirname and
resolve from 'node:path' so it instead imports those symbols from 'pathe' (leave
fileURLToPath from 'node:url' unchanged); ensure any usages of dirname and
resolve remain the same since pathe is a drop-in replacement.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 6cbb9930-0b82-4bda-95c7-86a13d6d26fe
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (18)
docs/architecture docs/adrs/ADR 208 - Higher-order codecs for parameterized types.mddocs/architecture docs/subsystems/9. No-Emit Workflow.mdpackages/1-framework/1-core/framework-components/src/shared/codec-types.tspackages/2-sql/5-runtime/src/codecs/decoding.tspackages/2-sql/5-runtime/src/codecs/encoding.tspackages/2-sql/5-runtime/src/sql-context.tspackages/2-sql/5-runtime/test/contract-codec-registry.test.tspackages/2-sql/5-runtime/test/json-schema-validation.test.tspackages/3-extensions/arktype-json/README.mdpackages/3-extensions/arktype-json/package.jsonpackages/3-extensions/arktype-json/src/core/arktype-json-codec.tspackages/3-extensions/arktype-json/src/exports/runtime.tspackages/3-extensions/arktype-json/test/arktype-json-codec.test-d.tspackages/3-extensions/arktype-json/test/arktype-json-codec.test.tspackages/3-extensions/arktype-json/test/extension-descriptors.test.tspackages/3-extensions/pgvector/src/exports/runtime.tspackages/3-extensions/pgvector/test/manifest.test.tstest/e2e/framework/test/arktype-json.test.ts
✅ Files skipped from review due to trivial changes (7)
- packages/3-extensions/pgvector/src/exports/runtime.ts
- packages/3-extensions/pgvector/test/manifest.test.ts
- packages/3-extensions/arktype-json/package.json
- packages/1-framework/1-core/framework-components/src/shared/codec-types.ts
- packages/3-extensions/arktype-json/README.md
- packages/3-extensions/arktype-json/test/arktype-json-codec.test.ts
- docs/architecture docs/subsystems/9. No-Emit Workflow.md
🚧 Files skipped from review as they are similar to previous changes (6)
- packages/2-sql/5-runtime/test/contract-codec-registry.test.ts
- packages/3-extensions/arktype-json/test/arktype-json-codec.test-d.ts
- packages/3-extensions/arktype-json/test/extension-descriptors.test.ts
- packages/3-extensions/arktype-json/src/core/arktype-json-codec.ts
- packages/2-sql/5-runtime/src/sql-context.ts
- packages/2-sql/5-runtime/src/codecs/encoding.ts
Focused follow-up to #402. This PR fixes the arktype-json runtime path that ADR 208 exposed: Postgres writes need runtime cast metadata, Postgres reads need to tolerate driver-preparsed
jsonb, codec errors need to preserve stable runtime envelopes, and the docs need to say what no-emit can actually type today.Changes
stringtostring | JsonValueand normalized both raw JSON text and driver-preparsed JSONB values on decode. The previousJSON.parse(wire)assumption was wrong becausepgreturnsjsonbas JavaScript values by default. Decode anddecodeJsonare now the schema-validation boundary; encode intentionally staysJSON.stringifyonly, because schema-validating on encode would make the codec parameter-dependent while params still carry onlycodecId.types.codecTypes.codecInstances, and gave itmeta.db.sql.postgres.nativeType = 'jsonb'. The old "emit-only" shape was incomplete: the Postgres renderer reads codec metadata at runtime to decide whether$N::jsonbcasts are required, so the first arktype-json write or filter could fail before codec dispatch even ran.CodecDescriptor.encodeIsParamsIndependentand taught the runtime codec registry to keep rejecting ambiguous multi-instanceforCodecId(codecId)lookups unless the descriptor declares that encode output is equivalent across all resolved parameter sets. The old ambiguity guard was right for parameter-dependent encoders, but too coarse forarktype-jsonandpgvector; both now declare the invariant explicitly, while decode still usesforColumnfor per-column schemas.RUNTIME.JSON_SCHEMA_VALIDATION_FAILEDand already-stampedRUNTIME.ENCODE_FAILED/RUNTIME.DECODE_FAILEDenvelopes. Wrapping every codec error was wrong because it hid stable schema-failure codes and double-wrapped errors whose details were already authored by the codec.RUNTIME.TYPE_PARAMS_INVALIDand pinnedarktypeto~2.1.29. A warning was too late: ifark.schema(jsonIr).expressiondiverges from serializedtypeParams.expression,contract.d.tsmay already be stale relative to the runtime schema. Failing at codec materialization is the more honest behavior.vector(1536)becomesnumber[],arktypeJson(schema)becomesunknown); precise schema-shaped output comes from emittedcontract.d.tsuntil TML-2357 lands.jsonbcast metadata, pre-parsed JSONB objects and string primitives, expression mismatch hard-fail, params-independent encode fallback, runtime error pass-through, and theEmbedding.profileORM round trip through Postgres.Why
The previous implementation mixed up control-plane and runtime responsibilities. It carried enough metadata for emission, but not enough metadata for Postgres lowering; it documented the intended no-emit resolver, but the resolver still uses codec base outputs; and it assumed the driver returned JSONB as strings, even though the default
pgbehavior is the opposite.This version keeps those boundaries explicit. Encode stays parameter-independent until
ParamRef.refscan identify the target column. Decode validates the schema because stored JSON can come from this writer, another service, manual SQL, or an older schema. Runtime metadata lives where the renderer actually reads it. The docs now distinguish shipped behavior from the remaining TML-2357 work.