Commit 0d8e3a8

lukewagner and scott-wagner
authored and committed
Refine reentrance check to handle import/export forwarding
1 parent d336ed6 commit 0d8e3a8

4 files changed

+206
-65
lines changed

design/mvp/CanonicalABI.md

Lines changed: 132 additions & 51 deletions
@@ -130,8 +130,11 @@ class Store:
130130
def __init__(self):
131131
self.pending = []
132132

133-
def invoke(self, f: FuncInst, caller, on_start, on_resolve) -> Call:
134-
return f(caller, on_start, on_resolve)
133+
def invoke(self, f: FuncInst, caller: Optional[Supertask], on_start, on_resolve) -> Call:
134+
host_caller = Supertask()
135+
host_caller.inst = None
136+
host_caller.supertask = caller
137+
return f(host_caller, on_start, on_resolve)
135138

136139
def tick(self):
137140
random.shuffle(self.pending)
@@ -167,7 +170,7 @@ OnStart = Callable[[], list[any]]
167170
OnResolve = Callable[[Optional[list[any]]], None]
168171

169172
class Supertask:
170-
inst: ComponentInstance
173+
inst: Optional[ComponentInstance]
171174
supertask: Optional[Supertask]
172175

173176
class Call:
@@ -190,6 +193,14 @@ However, as described in the [concurrency explainer], an async call's
190193
(currently) that the caller can know or do about it (hence there are
191194
currently no other methods on `Call`).
192195

196+
The optional `Supertask.inst` field either points to the `ComponentInstance`
197+
containing the supertask or, if `None`, indicates that the supertask is a host
198+
function. Because `Store.invoke` unconditionally appends a host `Supertask`,
199+
every callstack is rooted by a host `Supertask`. There is no prohibition on
200+
component-to-host-to-component calls (as long as the recursive call condition
201+
checked by `call_is_recursive` is satisfied) and thus host `Supertask`s may
202+
also appear anywhere else in the callstack.
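
As a minimal sketch of the rooting behavior described above (not the reference code itself), the following models `Store.invoke` as a free function `invoke` and uses plain strings in place of `ComponentInstance` objects. The helpers `stack_insts` and `component_b` are hypothetical names introduced only for this illustration.

```python
class Supertask:
    # Simplified stand-in: `inst` is None for a host frame.
    inst = None
    supertask = None

def invoke(f, caller):
    # Mirrors the patched Store.invoke: always push a host frame first.
    host_caller = Supertask()
    host_caller.inst = None
    host_caller.supertask = caller
    return f(host_caller)

def stack_insts(caller):
    # Hypothetical helper: list the `inst` of each frame, innermost first.
    out = []
    while caller is not None:
        out.append(caller.inst)
        caller = caller.supertask
    return out

def component_b(caller):
    # Hypothetical component frame (the string "B" stands in for an
    # instance) that calls back out through the host.
    me = Supertask()
    me.inst = "B"
    me.supertask = caller
    return invoke(stack_insts, me)

direct = invoke(stack_insts, None)
nested = invoke(component_b, None)
```

Under these assumptions, `direct` holds a single host frame, while `nested` has a host frame at both ends with the component frame in between, which is the component-to-host-to-component shape the paragraph permits.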
203+
193204

194205
## Supporting definitions
195206

@@ -280,24 +291,124 @@ behavior and enforce invariants.
280291
```python
281292
class ComponentInstance:
282293
store: Store
294+
parent: Optional[ComponentInstance]
283295
table: Table
284296
may_leave: bool
285297
backpressure: int
286298
exclusive: bool
287299
num_waiting_to_enter: int
288300

289-
def __init__(self, store):
301+
def __init__(self, store, parent = None):
302+
assert(parent is None or parent.store is store)
290303
self.store = store
304+
self.parent = parent
291305
self.table = Table()
292306
self.may_leave = True
293307
self.backpressure = 0
294308
self.exclusive = False
295309
self.num_waiting_to_enter = 0
296310
```
297311
Components are always instantiated in the context of a `Store` which is saved
298-
immutably in the `store` field. The other fields are described below as they
299-
are used.
300-
312+
immutably in the `store` field.
313+
314+
If a component is instantiated by an `instantiate` expression in a "parent"
315+
component, the parent's `ComponentInstance` is immutably saved in the `parent`
316+
field of the child's `ComponentInstance`. If instead a component is
317+
instantiated directly by the host, the `parent` field is `None`. Thus, the set
318+
of component instances in a store forms a forest rooted by the component
319+
instances that were instantiated directly by the host.
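
The forest structure can be sketched as follows, assuming a stripped-down `ComponentInstance` that keeps only the `store` and `parent` fields. The `root_of` helper is a hypothetical name for this illustration and is not part of the reference definitions.

```python
class Store:
    pass

class ComponentInstance:
    # Simplified stand-in with only the two fields relevant to the forest.
    def __init__(self, store, parent=None):
        # A child must live in the same store as its parent.
        assert parent is None or parent.store is store
        self.store = store
        self.parent = parent

def root_of(inst):
    # Hypothetical helper: follow `parent` links up to the host-instantiated root.
    while inst.parent is not None:
        inst = inst.parent
    return inst

store = Store()
root_a = ComponentInstance(store)            # instantiated directly by the host
child = ComponentInstance(store, root_a)     # instantiated by root_a
grandchild = ComponentInstance(store, child)
root_b = ComponentInstance(store)            # a second, unrelated root
```

Here `root_a` and `root_b` are the roots of two separate trees in the same store, and every other instance reaches exactly one root by following `parent`.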
320+
321+
How the host instantiates and invokes root components is up to the host and not
322+
specified by the Component Model. Exports of previously-instantiated root
323+
components *may* be supplied as the imports of subsequently-instantiated root
324+
components. Due to the ordered nature of instantiation, root components cannot
325+
directly import each other's exports in a cyclic manner. However, the host *may*
326+
perform cyclic component-to-host-to-component calls, in the same way that a
327+
parent component can use `call_indirect` and a table of mutable `funcref`s to
328+
make cyclic child-to-parent-to-child calls.
329+
330+
Because a child component is fully encapsulated by its parent component (with
331+
all child imports specified by the parent's `instantiate` expression and access
332+
to all child exports controlled by the parent through its private instance index
333+
space), the host does not have direct control over how a child component is
334+
instantiated or invoked. However, if a child's ancestors transitively forward
335+
the root component's host-supplied imports to the child, direct child-to-host
336+
calls are possible. Symmetrically, if a child's ancestors transitively
337+
re-export the child's exports from the root component, direct host-to-child
338+
calls are possible. Consequently, direct calls between child components of
339+
distinct parent components are also possible.
340+
341+
As mentioned above, cyclic calls between components are made possible by
342+
indirecting through a parent component or the host. However, for the time
343+
being, a "recursive" call in which a single component instance is entered
344+
multiple times on the same `Supertask` callstack is well-defined to trap upon
345+
attempted reentry. There are several reasons for this trapping behavior:
346+
* automatic [backpressure] would otherwise deadlock in unpredictable and
347+
surprising ways;
348+
* by default, most code does not expect [recursive reentrance] and will break
349+
in subtle and potentially security-sensitive ways if allowed;
350+
* to properly handle recursive reentrance, an extra ABI parameter is required
351+
to link recursive calls on the same stack and this requires opting in via
352+
some [TBD](Concurrency.md#TODO) function effect type or canonical ABI option.
353+
354+
The `call_is_recursive` predicate is used by `canon_lift` and
355+
`canon_resource_drop` (defined below) to detect recursive reentrance and
356+
subsequently trap. The supporting `ancestors` function enumerates all
357+
transitive parents of a node, *including the node itself*, in a Python `set`,
358+
thereby allowing set-wise union (`|`), intersection (`&`) and difference (`-`).
359+
```python
360+
def call_is_recursive(caller: Supertask, callee_inst: ComponentInstance):
361+
callee_insts = { callee_inst } | (ancestors(callee_inst) - ancestors(caller.inst))
362+
while caller is not None:
363+
if callee_insts & ancestors(caller.inst):
364+
return True
365+
caller = caller.supertask
366+
return False
367+
368+
def ancestors(inst: Optional[ComponentInstance]) -> set[ComponentInstance]:
369+
s = set()
370+
while inst is not None:
371+
s.add(inst)
372+
inst = inst.parent
373+
return s
374+
```
375+
The `callee_insts` set contains all the component instances being freshly
376+
entered by the call, always including the `callee_inst` itself. The subsequent
377+
loop then tests whether *any* of the `callee_insts` is already on the stack.
378+
This set-wise definition considers cases like the following to be recursive:
379+
```
380+
+-------+
381+
| A |<-.
382+
| +---+ | |
383+
--->| B |----'
384+
| +---+ |
385+
+-------+
386+
```
387+
At the point when recursively calling back into `A`, `callee_inst` is `A`
388+
and `caller` points to the following stack:
389+
```
390+
caller --> |inst=None| --supertask--> |inst=B| --supertask--> |inst=None| --supertask--> None
391+
```
392+
While `A` does not appear as the `inst` of any `Supertask` on this stack,
393+
`callee_insts` is `{ A }` and `ancestors(B)` is `{ B, A }`, so the second iteration
394+
of the loop sees a non-empty intersection and correctly determines that `A` is
395+
being reentered.
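
The walkthrough above can be reproduced with a small self-contained sketch. The `ComponentInstance` and `Supertask` classes here are simplified stand-ins that keep only the fields the predicate reads (the `name` field and the constructor signatures are assumptions for this illustration); `ancestors` and `call_is_recursive` follow the definitions in this diff.

```python
class ComponentInstance:
    # Simplified stand-in: only the `parent` link matters to the predicate.
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

class Supertask:
    def __init__(self, inst, supertask):
        self.inst = inst            # None marks a host frame
        self.supertask = supertask

def ancestors(inst):
    s = set()
    while inst is not None:
        s.add(inst)
        inst = inst.parent
    return s

def call_is_recursive(caller, callee_inst):
    callee_insts = {callee_inst} | (ancestors(callee_inst) - ancestors(caller.inst))
    while caller is not None:
        if callee_insts & ancestors(caller.inst):
            return True
        caller = caller.supertask
    return False

A = ComponentInstance("A")
B = ComponentInstance("B", parent=A)

# Build the stack from the diagram: host -> B -> host.
root = Supertask(None, None)
in_b = Supertask(B, root)
host_again = Supertask(None, in_b)

reenters_a = call_is_recursive(host_again, A)   # calling back into A
first_entry = call_is_recursive(root, B)        # the initial call into B
```

In this sketch `reenters_a` is `True` because the second frame's `ancestors(B)` contains `A`, while `first_entry` is `False` because the only frame on the stack is a host frame with no ancestors.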
396+
397+
An optimizing implementation can avoid the overhead of sets and loops in
398+
several ways:
399+
* In the quite-common case that a component does not contain *both* core module
400+
instances *and* component instances, inter-component recursion is not possible
401+
and can thus be statically eliminated from the generated inter-component
402+
trampolines.
403+
* If the runtime imposes a modest per-store upper-bound on the number of
404+
component instances, like 64, then an `i64` can be used to represent the
405+
`set[ComponentInstance]`, assigning each component instance a bit. Then,
406+
the `i64` representing the transitive union of all `supertask`'s
407+
`ancestors(inst)` sets can be propagated from caller to callee, allowing the
408+
`while` loop to be replaced by a single bitwise-and of the callee's
409+
`i64` with the transitive callers' `i64`.
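
The bitmask optimization in the second bullet might look roughly like the sketch below. It assumes each instance is assigned a bit position below 64 at instantiation and precomputes its ancestor mask. The names `Inst`, `callee_mask`, `is_recursive`, and `push` are hypothetical and do not appear in the reference definitions.

```python
class Inst:
    # Hypothetical instance record carrying precomputed bitmasks.
    def __init__(self, bit, parent=None):
        assert 0 <= bit < 64
        self.self_mask = 1 << bit
        # Mask of this instance plus all of its transitive parents.
        self.ancestors_mask = self.self_mask | (parent.ancestors_mask if parent else 0)

def callee_mask(caller_inst, callee_inst):
    # Bitwise analogue of:
    #   { callee_inst } | (ancestors(callee_inst) - ancestors(caller.inst))
    caller_anc = caller_inst.ancestors_mask if caller_inst else 0
    return callee_inst.self_mask | (callee_inst.ancestors_mask & ~caller_anc)

def is_recursive(on_stack_mask, caller_inst, callee_inst):
    # A single bitwise-and replaces the walk over the supertask chain.
    return (callee_mask(caller_inst, callee_inst) & on_stack_mask) != 0

def push(on_stack_mask, inst):
    # Mask to hand to the callee: union in the new frame's ancestors.
    return on_stack_mask | (inst.ancestors_mask if inst else 0)

A = Inst(0)
B = Inst(1, parent=A)

host_mask = push(0, None)             # host frame contributes no bits
in_b_mask = push(host_mask, B)        # B's frame marks both B and A
host_again_mask = push(in_b_mask, None)

reenters_a = is_recursive(host_again_mask, None, A)
first_entry = is_recursive(host_mask, None, B)
parent_calls_child = is_recursive(push(host_mask, A), A, B)
```

The mask passed from caller to callee is the union of `ancestors_mask` over the caller and all of its supertasks, so it is computed incrementally at each call rather than by walking the stack.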
410+
411+
The other fields of `ComponentInstance` are described below as they are used.
301412

302413
#### Table State
303414

@@ -804,7 +915,7 @@ class Task(Call, Supertask):
804915
opts: CanonicalOptions
805916
inst: ComponentInstance
806917
ft: FuncType
807-
supertask: Optional[Task]
918+
supertask: Supertask
808919
on_resolve: OnResolve
809920
num_borrows: int
810921
threads: list[Thread]
@@ -838,37 +949,6 @@ called (by the `Task.return_` and `Task.cancel` methods, defined below).
838949
assert(self.num_borrows == 0)
839950
```
840951

841-
The `Task.trap_if_on_the_stack` method checks for unintended reentrance,
842-
enforcing a [component invariant]. This guard uses the `Supertask` defined by
843-
the [Embedding](#embedding) interface to walk up the async call tree defined as
844-
part of [structured concurrency]. The async call tree is necessary to
845-
distinguish between the deadlock-hazardous kind of reentrance (where the new
846-
task is a transitive subtask of a task already running in the same component
847-
instance) and the normal kind of async reentrance (where the new task is just a
848-
sibling of any existing tasks running in the component instance). Note that, in
849-
the [future](Concurrency.md#TODO), there will be a way for a function to opt in
850-
(via function type attribute) to the hazardous kind of reentrance, which will
851-
nuance this test.
852-
```python
853-
def trap_if_on_the_stack(self, inst):
854-
c = self.supertask
855-
while c is not None:
856-
trap_if(c.inst is inst)
857-
c = c.supertask
858-
```
859-
An optimizing implementation can avoid the O(n) loop in `trap_if_on_the_stack`
860-
in several ways:
861-
* Reentrance by a child component can (often) be statically ruled out when the
862-
parent component doesn't both lift and lower the child's imports and exports
863-
(i.e., "donut wrapping").
864-
* Reentrance of the root component by the host can either be asserted not to
865-
happen or be tracked in a per-root-component-instance flag.
866-
* When a potentially-reenterable child component only lifts and lowers
867-
synchronously, reentrance can be tracked in a per-component-instance flag.
868-
* For the remaining cases, the live instances on the stack can be maintained in
869-
a packed bit-vector (assigning each potentially-reenterable async component
870-
instance a static bit position) that is passed by copy from caller to callee.
871-
872952
The `Task.needs_exclusive` predicate returns whether the Canonical ABI options
873953
indicate that the core wasm being executed does not expect to be reentered
874954
(e.g., because the code is using a single global linear memory shadow stack).
@@ -3161,8 +3241,8 @@ Based on this, `canon_lift` is defined in chunks as follows, starting with how
31613241
a `lift`ed function starts executing:
31623242
```python
31633243
def canon_lift(opts, inst, ft, callee, caller, on_start, on_resolve) -> Call:
3244+
trap_if(call_is_recursive(caller, inst))
31643245
task = Task(opts, inst, ft, caller, on_resolve)
3165-
task.trap_if_on_the_stack(inst)
31663246
def thread_func(thread):
31673247
if not task.enter(thread):
31683248
return
@@ -3176,16 +3256,16 @@ def canon_lift(opts, inst, ft, callee, caller, on_start, on_resolve) -> Call:
31763256
flat_ft = flatten_functype(opts, ft, 'lift')
31773257
assert(types_match_values(flat_ft.params, flat_args))
31783258
```
3179-
Each call starts by immediately checking for unexpected reentrance using
3180-
`Task.trap_if_on_the_stack`.
3259+
Each lifted function call starts by immediately trapping on recursive
3260+
reentrance (as defined by `call_is_recursive` above).
31813261

31823262
The `thread_func` is immediately called from a new `Thread` created and resumed
3183-
at the end of `canon_lift` and so control flow proceeds directly from the
3184-
`trap_if_on_stack` to the `enter`. `Task.enter` (defined above) suspends the
3185-
newly-created `Thread` if there is backpressure until the backpressure is
3186-
resolved. If the caller cancels the new `Task` while the `Task` is still
3187-
waiting to `enter`, the call is aborted before the arguments are lowered (which
3188-
means that owned-handle arguments are not transferred).
3263+
at the end of `canon_lift` and so control flow proceeds directly to the `enter`.
3264+
`Task.enter` (defined above) suspends the newly-created `Thread` if there is
3265+
backpressure until the backpressure is resolved. If the caller cancels the new
3266+
`Task` while the `Task` is still waiting to `enter`, the call is aborted before
3267+
the arguments are lowered (which means that owned-handle arguments are not
3268+
transferred).
31893269

31903270
Once the backpressure gate is cleared, the `Thread` is added to the callee's
31913271
component instance's table (storing the index for later retrieval by the
@@ -3570,7 +3650,7 @@ def canon_resource_drop(rt, thread, i):
35703650
callee = partial(canon_lift, callee_opts, rt.impl, ft, rt.dtor)
35713651
[] = canon_lower(caller_opts, ft, callee, thread, [h.rep])
35723652
else:
3573-
thread.task.trap_if_on_the_stack(rt.impl)
3653+
trap_if(call_is_recursive(thread.task, rt.impl))
35743654
else:
35753655
h.borrow_scope.num_borrows -= 1
35763656
return []
@@ -3587,9 +3667,9 @@ reentrance guard of `Task.enter`, an exception is made when the resource type's
35873667
implementation-instance is the same as the current instance (which is
35883668
statically known for any given `canon resource.drop`).
35893669

3590-
When a destructor isn't present, the rules still perform a reentrance check
3670+
When a destructor isn't present, there is still a trap on recursive reentrance
35913671
since this is the caller's responsibility and the presence or absence of a
3592-
destructor is an encapsualted implementation detail of the resource type.
3672+
destructor is an encapsulated implementation detail of the resource type.
35933673

35943674

35953675
### `canon resource.rep`
@@ -4807,6 +4887,7 @@ def canon_thread_available_parallelism():
48074887
[Concurrency Explainer]: Concurrency.md
48084888
[Suspended]: Concurrency#thread-built-ins
48094889
[Structured Concurrency]: Concurrency.md#subtasks-and-supertasks
4890+
[Recursive Reentrance]: Concurrency.md#subtasks-and-supertasks
48104891
[Backpressure]: Concurrency.md#backpressure
48114892
[Current Thread]: Concurrency.md#current-thread-and-task
48124893
[Current Task]: Concurrency.md#current-thread-and-task

design/mvp/Explainer.md

Lines changed: 1 addition & 1 deletion
@@ -2870,7 +2870,7 @@ three runtime invariants:
28702870
component instance.
28712871
2. The Component Model disallows reentrance by trapping if a callee's
28722872
component-instance is already on the stack when the call starts.
2873-
(For details, see [`trap_if_on_the_stack`](CanonicalABI.md#task-state)
2873+
(For details, see [`call_is_recursive`](CanonicalABI.md#component-instance-state)
28742874
in the Canonical ABI explainer.) This default prevents obscure
28752875
composition-time bugs and also enables more-efficient non-reentrant
28762876
runtime glue code. This rule will be relaxed by an opt-in

design/mvp/canonical-abi/definitions.py

Lines changed: 28 additions & 13 deletions
@@ -189,8 +189,11 @@ class Store:
189189
def __init__(self):
190190
self.pending = []
191191

192-
def invoke(self, f: FuncInst, caller, on_start, on_resolve) -> Call:
193-
return f(caller, on_start, on_resolve)
192+
def invoke(self, f: FuncInst, caller: Optional[Supertask], on_start, on_resolve) -> Call:
193+
host_caller = Supertask()
194+
host_caller.inst = None
195+
host_caller.supertask = caller
196+
return f(host_caller, on_start, on_resolve)
194197

195198
def tick(self):
196199
random.shuffle(self.pending)
@@ -205,7 +208,7 @@ def tick(self):
205208
OnResolve = Callable[[Optional[list[any]]], None]
206209

207210
class Supertask:
208-
inst: ComponentInstance
211+
inst: Optional[ComponentInstance]
209212
supertask: Optional[Supertask]
210213

211214
class Call:
@@ -252,20 +255,38 @@ class CanonicalOptions(LiftLowerOptions):
252255

253256
class ComponentInstance:
254257
store: Store
258+
parent: Optional[ComponentInstance]
255259
table: Table
256260
may_leave: bool
257261
backpressure: int
258262
exclusive: bool
259263
num_waiting_to_enter: int
260264

261-
def __init__(self, store):
265+
def __init__(self, store, parent = None):
266+
assert(parent is None or parent.store is store)
262267
self.store = store
268+
self.parent = parent
263269
self.table = Table()
264270
self.may_leave = True
265271
self.backpressure = 0
266272
self.exclusive = False
267273
self.num_waiting_to_enter = 0
268274

275+
def call_is_recursive(caller: Supertask, callee_inst: ComponentInstance):
276+
callee_insts = { callee_inst } | (ancestors(callee_inst) - ancestors(caller.inst))
277+
while caller is not None:
278+
if callee_insts & ancestors(caller.inst):
279+
return True
280+
caller = caller.supertask
281+
return False
282+
283+
def ancestors(inst: Optional[ComponentInstance]) -> set[ComponentInstance]:
284+
s = set()
285+
while inst is not None:
286+
s.add(inst)
287+
inst = inst.parent
288+
return s
289+
269290
#### Table State
270291

271292
class Table:
@@ -534,7 +555,7 @@ class State(Enum):
534555
opts: CanonicalOptions
535556
inst: ComponentInstance
536557
ft: FuncType
537-
supertask: Optional[Task]
558+
supertask: Supertask
538559
on_resolve: OnResolve
539560
num_borrows: int
540561
threads: list[Thread]
@@ -560,12 +581,6 @@ def thread_stop(self, thread):
560581
trap_if(self.state != Task.State.RESOLVED)
561582
assert(self.num_borrows == 0)
562583

563-
def trap_if_on_the_stack(self, inst):
564-
c = self.supertask
565-
while c is not None:
566-
trap_if(c.inst is inst)
567-
c = c.supertask
568-
569584
def needs_exclusive(self):
570585
return not self.opts.async_ or self.opts.callback
571586

@@ -1984,8 +1999,8 @@ def lower_flat_values(cx, max_flat, vs, ts, out_param = None):
19841999
### `canon lift`
19852000

19862001
def canon_lift(opts, inst, ft, callee, caller, on_start, on_resolve) -> Call:
2002+
trap_if(call_is_recursive(caller, inst))
19872003
task = Task(opts, inst, ft, caller, on_resolve)
1988-
task.trap_if_on_the_stack(inst)
19892004
def thread_func(thread):
19902005
if not task.enter(thread):
19912006
return
@@ -2167,7 +2182,7 @@ def canon_resource_drop(rt, thread, i):
21672182
callee = partial(canon_lift, callee_opts, rt.impl, ft, rt.dtor)
21682183
[] = canon_lower(caller_opts, ft, callee, thread, [h.rep])
21692184
else:
2170-
thread.task.trap_if_on_the_stack(rt.impl)
2185+
trap_if(call_is_recursive(thread.task, rt.impl))
21712186
else:
21722187
h.borrow_scope.num_borrows -= 1
21732188
return []
