Thoughts on Assembling a Zero-Copy Cross-Language Object Model

This issue places the inlined accessor pattern proposed in #626 in its broader context, exploring how it fits together with other proposals currently in flight to complete a _zero-copy shared-interface cross-language object-model_.

### The cross-language object-model problem

A producer holds objects in some language-specific representation — Java instances, OCaml records, JS hidden-class objects, Lua tables, Rust structs. A consumer in a possibly different language wants to access state on those objects. The Canonical ABI today copies values across the boundary, which is correct under shared-nothing but expensive when the values are non-trivial. Two situations make the copy especially costly:

1. **Producer's layout is fixed by source semantics.** Examples are JS hidden classes, Lua tables, and Python instances. The compiler can't conform the at-rest representation to whatever shape the Canonical ABI expects without changing the language's runtime model.
2. **Producer and consumer have different layouts even when each could control its own.** A Java producer and an OCaml consumer both compile to Wasm GC, both pick at-rest representations idiomatic for their language, and the representations don't structurally match.

### Layering two complementary strategies

The proposals currently in flight cover this through two strategies organized along the layout-aligned vs. layout-mismatched axis:

**Typed reference passing** (for the layout-aligned case): data is exposed as a `record` or `list` plus associated operations as free functions in the same interface, and #525's lowering passes the data as a typed `(ref $T)` reference. Field access on the consumer side compiles directly to `struct.get` and `struct.set`, with no method calls and no inlining required. (The current pre-proposal form is primarily scoped to records and lists with structural Wasm GC correspondence; variants, strings, and nested mutable cases remain partly open.)

**Accessor-based passing** (for the layout-mismatched case): data and associated operations are bundled into a `resource` with methods on it. The consumer accesses everything through method calls on an externref. With monomorphic dispatch and inlining, accessor calls reduce to direct memory operations on the producer's at-rest representation. Isolation is maintained through access and ownership contracts, plus static or dynamic bounds checks for indexed access.

The two strategies aren't competing — they answer different questions, and the choice between them combines two orthogonal dimensions: a modeling decision (separate record + free functions vs. nominal resource bundling state and methods) and a layout decision (whether the producer's at-rest representation aligns with the structural representation the consumer expects). The two strategies also sit at different levels of abstraction: typed reference passing is a concrete lowering for the layout-aligned sub-case, while accessor-based passing is a general lowering whose accessor bodies can wrap arbitrary internal representations — including a typed reference passed via #525 where applicable. Typed reference passing achieves zero-copy when both modeling and layout fit; accessor-based passing achieves zero-copy whenever the resource model fits, regardless of layout alignment, because the layout is hidden behind the accessor interface.

In practice, resource-with-accessors is the right choice when either dimension favors it: when the producer's representation is fixed by source semantics (JS hidden classes, Lua tables, Python instances), when layouts differ between sides and can't be aligned, or when the producer's interface design naturally bundles state with operations and benefits from encapsulation.

### Four layers slicing the concerns

Looking at it, there are four layers, each answering one question and each composable independently:

| Layer | Question | Mechanism |
|-------|----------|-----------|
| **1. WIT type semantics** | What is the value? | `record`, `list`, `variant`, `resource` — defining identity, opacity, allowed operations |
| **2. Read/write permissions** | Which operations are exposed? | Producer's choice of accessors; for resources, the methods declared in WIT |
| **3. Ownership annotations** | Who holds rights at the boundary? | `borrow<T>`, `own<T>`, by-value default — specifying lifetime semantics |
| **4. Layout-mapping mechanism** | How to bridge different representations? | Either typed reference passing for layout-aligned, or accessor calls for layout-mismatched |

The interesting observation: each layer has one or two clear answers among existing/in-flight proposals, and they compose freely. A `record` with `borrow<T>` and #525-style typed reference passing could be one valid combination. A `resource` with `borrow<T>` and accessor-based passing is another. The architecture seems already to be there — just not written down as a connected picture.

### Concrete example

The same logical `image` expressed under both strategies — note the symmetry: both define data plus associated operations, but organize them differently.

```wit
// Strategy 1 (layout-aligned): record + free functions, passed by typed reference
record image {
    width: u32,
    height: u32,
    pixels: list<u32>,
}

pixel-at: func(img: image, x: u32, y: u32) -> u32;
blend: func(a: image, b: image) -> image;
resize: func(img: image, new-width: u32, new-height: u32) -> image;

// Strategy 2 (layout-mismatched): resource bundling state and operations
// (using proposed WIT sugar for field-like accessors on resources;
//  desugars to explicit getter/setter methods in current WIT)
resource image {
    constructor(width: u32, height: u32);

    readonly width: u32;
    readonly height: u32;

    pixel-at: func(x: u32, y: u32) -> u32;
    set-pixel-at: func(x: u32, y: u32, value: u32);
    blit: func(dst-x: u32, dst-y: u32, src: borrow<image>, src-x: u32, src-y: u32, w: u32, h: u32);
    blend-with: func(other: borrow<image>) -> own<image>;
}
```

In Strategy 1, field access on an `image` reference compiles to direct `struct.get` / `array.get` operations under #525's lowering; free functions like `blend` or `resize` operate on these references and return new ones, consistent with value-type semantics. 

In Strategy 2, each accessor is a method call that, with inlining, reduces to the same direct memory operations — with static or dynamic bounds checks for indexed access (`pixel-at`, `set-pixel-at`). What that method body contains is the producer's choice: an automatically generated trivial getter or setter, or a hand-written method performing validation, conversion, or lazy computation — consumers see only the declared interface in either case. The `readonly` modifier on `width` and `height` desugars to a getter only; ownership annotations on parameters (`borrow<image>`, `own<image>`) follow the existing WIT semantics for resources. The `blit` method is an example of a bulk operation — see the next section.

Both forms describe the same logical `image` but with different modeling commitments — record + free functions favors immutable transformations and value semantics; resource favors identity and in-place mutation behind an interface. The lowering choice (Layer 4) is then determined by whether the producer's at-rest representation can be brought into structural alignment with the consumer (Strategy 1), or whether it's fixed or differs and accessors are needed (Strategy 2).
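
As a rough illustration of the two modeling commitments, here is a minimal sketch in ordinary Rust. It is an analogy only, not Wasm GC or Canonical ABI code. All names (`ImageRecord`, `ImageResource`, `ColumnMajorImage`, `sum_pixels`) are hypothetical, and the column-major layout is an assumed stand-in for a producer representation that doesn't match the record shape.

```rust
// Hypothetical analogy: none of these names come from WIT, #525, or #626.

/// Strategy 1: layout-aligned data with visible fields plus free functions.
pub struct ImageRecord {
    pub width: u32,
    pub height: u32,
    pub pixels: Vec<u32>, // row-major: index = y * width + x
}

/// Free function over the record; the consumer reads the fields directly.
pub fn pixel_at(img: &ImageRecord, x: u32, y: u32) -> Option<u32> {
    if x >= img.width || y >= img.height {
        return None;
    }
    let index = y as usize * img.width as usize + x as usize;
    img.pixels.get(index).copied()
}

/// Strategy 2: the consumer sees only accessors; the layout stays private.
pub trait ImageResource {
    fn width(&self) -> u32; // getter only, in the spirit of `readonly width`
    fn height(&self) -> u32;
    fn pixel_at(&self, x: u32, y: u32) -> Option<u32>;
    fn set_pixel_at(&mut self, x: u32, y: u32, value: u32) -> bool;
}

/// Assumed producer layout: column-major, so it does not match the record.
pub struct ColumnMajorImage {
    width: u32,
    height: u32,
    columns: Vec<u32>, // column-major: index = x * height + y
}

impl ColumnMajorImage {
    pub fn new(width: u32, height: u32) -> Self {
        let len = width as usize * height as usize;
        Self { width, height, columns: vec![0; len] }
    }

    /// Dynamic bounds check shared by the indexed accessors.
    fn index(&self, x: u32, y: u32) -> Option<usize> {
        if x < self.width && y < self.height {
            Some(x as usize * self.height as usize + y as usize)
        } else {
            None
        }
    }
}

impl ImageResource for ColumnMajorImage {
    #[inline]
    fn width(&self) -> u32 {
        self.width
    }
    #[inline]
    fn height(&self) -> u32 {
        self.height
    }
    #[inline]
    fn pixel_at(&self, x: u32, y: u32) -> Option<u32> {
        self.index(x, y).map(|i| self.columns[i])
    }
    #[inline]
    fn set_pixel_at(&mut self, x: u32, y: u32, value: u32) -> bool {
        match self.index(x, y) {
            Some(i) => {
                self.columns[i] = value;
                true
            }
            None => false,
        }
    }
}

/// Consumer written against the accessor interface only (static dispatch).
pub fn sum_pixels<I: ImageResource>(img: &I) -> u64 {
    let mut total = 0u64;
    for y in 0..img.height() {
        for x in 0..img.width() {
            total += u64::from(img.pixel_at(x, y).unwrap_or(0));
        }
    }
    total
}
```

The consumer of `ImageRecord` depends on the row-major layout, while `sum_pixels` works unchanged for any layout the producer picks. That is the trade the two strategies make.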

### The layout-mismatched path

The accessor pattern works mechanically today. But the zero-cost property (that an accessor call reduces to a direct `struct.get`) depends on guaranteed inlining. [Wasmtime's recently-added function inliner](https://bytecodealliance.org/articles/inliner) provides this on the engine side. Browser engines and embedded baseline tiers don't reliably inline cross-instance calls, so on those targets the pattern still works but its zero-cost property can't be relied on.

#626 fills this gap on the toolchain side: at link time, component-linking tools can inline accessor calls into the merged module while preserving inter-module isolation through multi-memory partitioning. Together with engine-side inlining where available, this would make accessor inlining dependable across the target spectrum, which the layout-mismatched strategy needs in order to be portable across diverse runtimes.

### Bulk operations

Per-element accessor calls are efficient for scalar access (a single field read or write, an individual indexed lookup). For bulk operations on many elements (image filters traversing all pixels, string scanning, list aggregations), per-element access accumulates overhead: where a call is not inlined, every element pays for a cross-component call, and even after inlining, bounds checks repeat per iteration and vectorization opportunities can be lost. The accessor-based strategy addresses this through three complementary paths:

1. **Explicit bulk methods on the resource** — the producer exposes memcpy-style or batch operations directly on the resource (the `blit` method in the example above is this kind). With inlining, the body can become a single `memory.copy` or a tight optimized loop. This works today within existing WIT semantics; the producer chooses which bulk operations to expose.

2. **Region borrows** — a typed reference to a bounded memory region that the consumer operates on with normal Wasm bulk operations (`memory.copy`, SIMD load/store) after a single bounds check at borrow acquisition. This is the direction of @lukewagner's lazy-lowering proposal #383 and of #568 (mappableref) — a native primitive for region/buffer borrows that handles bulk-region semantics directly.

3. **Optimizer-recovery from per-element loops** — recognizing sequential accessor patterns and lowering them to vectorized or bulk operations. With aggressive inlining (Wasmtime's inliner, toolchain-side merge+wasm-opt) this is in principle possible at the engine level, though dependent on optimizer sophistication. The same link-time module-rewriting mechanism described above could implement this as a dedicated pass — pattern-matching accessor loops and emitting bulk operations deterministically rather than relying on engine heuristics.

The three paths fit different situations: bulk methods when the producer can anticipate which bulk operations matter; region borrows when the consumer needs flexibility on a producer-exposed buffer; optimizer-recovery when neither side anticipated the pattern but the access is sequential.
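
To make the difference between the paths concrete, here is a minimal Rust sketch on a plain in-memory image. Everything in it is an assumption for illustration: the `Image` type, the `row_mut` region accessor, and `fill_row_per_element` are hypothetical names, and a Rust slice only loosely stands in for a region borrow as discussed in #383 and #568.

```rust
// Hypothetical sketch of the three bulk paths; not an API from any proposal.

pub struct Image {
    width: usize,
    height: usize,
    pixels: Vec<u32>, // row-major
}

impl Image {
    pub fn new(width: usize, height: usize) -> Self {
        Self { width, height, pixels: vec![0; width * height] }
    }

    /// Per-element accessor: one bounds check per call.
    pub fn pixel_at(&self, x: usize, y: usize) -> Option<u32> {
        if x < self.width && y < self.height {
            Some(self.pixels[y * self.width + x])
        } else {
            None
        }
    }

    /// Per-element accessor: one bounds check per call.
    pub fn set_pixel_at(&mut self, x: usize, y: usize, value: u32) -> bool {
        if x < self.width && y < self.height {
            self.pixels[y * self.width + x] = value;
            true
        } else {
            false
        }
    }

    /// Path 1: explicit bulk method. Both rectangles are validated once,
    /// then each row is moved with a single slice copy.
    pub fn blit(
        &mut self,
        dst_x: usize,
        dst_y: usize,
        src: &Image,
        src_x: usize,
        src_y: usize,
        w: usize,
        h: usize,
    ) -> bool {
        let fits = |x: usize, y: usize, img_w: usize, img_h: usize| {
            x.checked_add(w).map_or(false, |end| end <= img_w)
                && y.checked_add(h).map_or(false, |end| end <= img_h)
        };
        if !fits(dst_x, dst_y, self.width, self.height)
            || !fits(src_x, src_y, src.width, src.height)
        {
            return false;
        }
        for row in 0..h {
            let d = (dst_y + row) * self.width + dst_x;
            let s = (src_y + row) * src.width + src_x;
            self.pixels[d..d + w].copy_from_slice(&src.pixels[s..s + w]);
        }
        true
    }

    /// Path 2: region borrow. One bounds check when the row is borrowed;
    /// the consumer then uses ordinary slice operations on the region.
    pub fn row_mut(&mut self, y: usize) -> Option<&mut [u32]> {
        if y >= self.height {
            return None;
        }
        let start = y * self.width;
        let end = start + self.width;
        Some(&mut self.pixels[start..end])
    }
}

/// Path 3 starting point: the per-element loop that an optimizer or a
/// link-time pass would have to recognize and turn into a bulk operation.
pub fn fill_row_per_element(img: &mut Image, y: usize, value: u32) {
    for x in 0..img.width {
        img.set_pixel_at(x, y, value);
    }
}
```

For example, `fill_row_per_element(&mut img, 0, 7)` and `img.row_mut(0).unwrap().fill(7)` leave row 0 in the same state. The first form checks bounds on every element, while the second checks once at borrow time.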

### Open question

The picture above seems coherent: two complementary lowering strategies along the layout-aligned vs. layout-mismatched axis, sitting at different levels of abstraction — typed reference passing as the concrete fast path for layout-aligned cases, accessor-based passing as the general lowering that covers the rest and can wrap typed references where applicable. The four-layer separation makes the composition explicit.

The question worth surfacing: does this architecture match how the perspectives represented in the CG see it? If the picture is correct, a non-normative note in the spec or the Component Model Explainer would give compiler engineers targeting WIT for new languages a clear map. Currently the recipes live in pieces across the `resource` documentation, #525, #626, and #568, and there's no single place that says "here is the cross-language object-model architecture, here is which strategy applies to your language's situation."
