Add replace_with_shallow to LoroDoc #896

canadaduane · 2026-01-10T05:05:52Z

This PR adds replace_with_shallow, a way to modify a LoroDoc and "trim in place", discarding old operations up to a Frontier.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4f4c994051


chatgpt-codex-connector · 2026-01-10T05:11:23Z

crates/loro-wasm/src/lib.rs

+    /// Replace the current document state with a shallow snapshot at the given frontiers.
+    ///
+    /// This method trims the history in place, preserving subscriptions and configuration.
+    #[wasm_bindgen(js_name = "replaceWithShallow")]
+    pub fn replace_with_shallow(&self, frontiers: Vec<JsID>) -> JsResult<()> {


Flush pending WASM events for replaceWithShallow

This new WASM-exposed method mutates document state and can enqueue subscription callbacks, but it isn’t added to the JS-side decorateMethods allowlist that calls callPendingEvents() in a finally block. With active subscriptions, calling replaceWithShallow can therefore trigger [LORO_INTERNAL_ERROR] Event not called and drop callbacks until another decorated method is invoked. Please add the JS name (replaceWithShallow) to the allowlist in crates/loro-wasm/index.ts to preserve the pending-event invariant.


chatgpt-codex-connector · 2026-01-10T05:11:23Z

crates/loro-internal/src/loro.rs

+            let mut state = self.state.lock().unwrap();
+            let mut new_state = temp_doc.state.lock().unwrap();
+
+            std::mem::swap(&mut state.frontiers, &mut new_state.frontiers);
+            std::mem::swap(&mut state.store, &mut new_state.store);


Keep ContainerStore arena aligned after replace_with_shallow

replace_with_shallow swaps state.store with the temporary document’s store but leaves state.arena untouched, after already swapping inner arena contents between self.arena and new_arena. Since ContainerStore owns its own SharedArena, the doc now has two different arena instances (state.arena vs state.store.arena). Subsequent container registrations and lookups can update only one arena, causing container index mismatches and incorrect resolution after replace_with_shallow (e.g., creating new containers or resolving paths). Consider swapping/updating the store’s arena to match state.arena or rebuilding the store in-place.
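The hazard described above can be reproduced with a toy model. This is only a sketch: `ToyArena`, `ToyStore`, `ToyState`, and both swap functions are hypothetical stand-ins, not Loro's actual types, and the model ignores the earlier swap of inner arena contents. It shows only that swapping the store alone leaves the state holding two different arena instances, and that re-pointing the state's handle is one possible way to realign them.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Toy stand-in for a shared arena: maps container names to indices.
#[derive(Clone, Default)]
struct ToyArena(Arc<Mutex<HashMap<String, usize>>>);

impl ToyArena {
    /// Registers a container and returns its index (the existing one if known).
    fn register(&self, name: &str) -> usize {
        let mut map = self.0.lock().unwrap();
        let next = map.len();
        *map.entry(name.to_string()).or_insert(next)
    }

    /// Looks up the index of a container, if it was registered in this arena.
    fn lookup(&self, name: &str) -> Option<usize> {
        self.0.lock().unwrap().get(name).copied()
    }

    /// True when both handles point at the same underlying arena.
    fn same_as(&self, other: &ToyArena) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

/// Toy store that owns its own arena handle.
struct ToyStore {
    arena: ToyArena,
}

/// Toy document state holding an arena handle plus a store.
struct ToyState {
    arena: ToyArena,
    store: ToyStore,
}

impl ToyState {
    fn new() -> Self {
        let arena = ToyArena::default();
        ToyState { arena: arena.clone(), store: ToyStore { arena } }
    }
}

/// Swaps only the store and leaves `state.arena` behind (the reported hazard).
fn swap_store_only(state: &mut ToyState, temp: &mut ToyState) {
    std::mem::swap(&mut state.store, &mut temp.store);
}

/// Swaps the store, then re-points the state's arena handle at the store's arena.
fn swap_store_and_realign(state: &mut ToyState, temp: &mut ToyState) {
    std::mem::swap(&mut state.store, &mut temp.store);
    state.arena = state.store.arena.clone();
}
```

After `swap_store_only`, a container registered through `state.arena` is invisible to `state.store.arena`, which is the kind of index mismatch the review warns about. After `swap_store_and_realign`, both handles resolve the same container to the same index.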

Useful? React with 👍 / 👎.

canadaduane

Added a couple of explanatory comments.

canadaduane · 2026-01-10T05:11:53Z

crates/loro/tests/loro_rust_test.rs

 }

 #[test]
+#[serial_test::serial]


I added this because the memory test was failing when run in parallel, due to memory allocated by parallel tests.

canadaduane · 2026-01-10T05:13:58Z

crates/loro-internal/src/loro.rs

+    ///   and unreliable for same-thread lock detection
+    /// - Internal callers in `shallow_snapshot.rs` legitimately call this without
+    ///   holding the txn lock directly, but are protected at the public API level
+    ///   by `with_barrier()` in methods like `export()`


Please note: this comment needs review. It was generated by Claude, and I believe it is correct, but without expertise in this library I am not 100% sure it is correct. The reason for removing the assert!(self.txn.is_locked()) assertion below is that _checkout_without_emitting does not need to be called within a lock if it is during replace_with_shallow.

zxch3n · 2026-01-10T05:31:00Z

crates/loro-internal/src/loro.rs

        self._apply_diff(diff, &mut Default::default(), false)
    }

+    pub fn replace_with_shallow(&self, frontiers: &Frontiers) -> LoroResult<()> {


My idea for this part of the implementation is that internally we can first directly call the ExposedShallowSnapshot API to export a snapshot, and then import it into a new document. The new document can be created directly via the FromSnapshot method.

The subsequent steps would include:

1. Migrate metadata and state:
   (a) Migrate subscriptions, events, and similar components from the old document to the new one.
   (b) Assign the Peer ID to the new document.
   (c) Migrate all relevant state to the new document.

2. Replace pointers:
   (a) After the state migration is complete, directly replace the old document’s internal pointer with the pointer from the new document.
   (b) From that point on, all state references reuse the new document, and the old internal logic of the original document can be entirely discarded.

The benefits of this approach are:

- Stronger overall correctness guarantees: we don’t need to introduce extensive internal logic changes.
- Reduced maintenance burden: if we modify too much internal logic, every future optimization would require re-examining this area to ensure it hasn’t been broken by new changes.
- Built on public APIs: most of the logic in this approach is based on public interfaces.

At the moment, the only real risk lies in the step mentioned above that migrates existing events. To handle this, we would need an additional API, something like replace_doc_inplace. As long as we can guarantee the correctness of this function, we can guarantee the correctness of the entire feature.

Overall, this approach should be more stable and easier to use, and it also gives us a function that we can reuse in the future.

Correctness is the highest priority, so I will take your advice. But I worry a little bit about speed--is creating a new document going to be as fast as mem::swap on the arena?

An additional quirk to address: my understanding of the Subscription closures is that they include the ContainerIdx (NOT the ContainerID) and therefore the indexes in the Arena have to line up perfectly if we want to keep the original Subscriptions the user has available. Unless there's a way around this?

canadaduane · 2026-01-10T15:38:10Z

@zxch3n Can you check my knowledge and assumptions here (based on convo with Claude)? I've tried to summarize these as concisely as possible. I think if we're on the same page with these two, I can implement the change per your suggestions.

1. The "Shared Handle" Architecture

Fact: LoroDoc is a lightweight handle wrapping an Arc<LoroDocInner>.
Insight: You cannot update the "global" document state by replacing the inner pointer in self. Doing so only updates the current handle, leaving all other clones (e.g., in UI components, background threads, or WASM bindings) pointing to the stale state.
Constraint: To implement a global update like replace_with_shallow, you must mutate the contents of the shared LoroDocInner in-place. Since LoroDocInner fields are immutable Arcs to mutable structures (e.g., Arc<LoroMutex<OpLog>>), this necessitates a "Swap" strategy where we lock and replace the internal data of each component.
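To make the "shared handle" point concrete, here is a minimal sketch using only the standard library. `ToyDoc` and `ToyInner` are hypothetical stand-ins for the handle and its inner struct, not Loro's real types. It contrasts replacing one handle's pointer with swapping the contents behind the shared lock.

```rust
use std::sync::{Arc, Mutex};

/// Toy inner document: the field is an immutable Arc to a mutable structure.
struct ToyInner {
    state: Arc<Mutex<Vec<String>>>,
}

/// Toy handle: cloning it shares the same inner value.
#[derive(Clone)]
struct ToyDoc {
    inner: Arc<ToyInner>,
}

impl ToyDoc {
    fn new(items: &[&str]) -> Self {
        let state: Vec<String> = items.iter().map(|s| s.to_string()).collect();
        ToyDoc {
            inner: Arc::new(ToyInner { state: Arc::new(Mutex::new(state)) }),
        }
    }

    /// Returns a copy of the current contents.
    fn items(&self) -> Vec<String> {
        self.inner.state.lock().unwrap().clone()
    }

    /// Replaces only this handle's pointer; other clones keep the old inner.
    fn replace_pointer(&mut self, other: ToyDoc) {
        self.inner = other.inner;
    }

    /// Swaps the contents behind the shared lock; every clone observes the change.
    /// `other` must be a different document, otherwise the second lock would deadlock.
    fn swap_contents(&self, other: &ToyDoc) {
        let mut mine = self.inner.state.lock().unwrap();
        let mut theirs = other.inner.state.lock().unwrap();
        std::mem::swap(&mut *mine, &mut *theirs);
    }
}
```

With `replace_pointer`, a clone taken earlier still reads the old contents. With `swap_contents`, every clone reads the new contents, which is why an in-place swap of each component is needed for a global update.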

2. The Subscription Handle Paradox

Fact: Subscription handles held by users contain a closure that captures the specific ContainerIdx and subscriber_id generated at subscription time.
Insight: If you create a new document state (even via import), it will generate new indices and IDs. Simply migrating subscriptions to the new state breaks existing handles because they will try to unsubscribe using the old IDs, which don't exist in the new set.
Constraint: To preserve handle validity, we must preserve both ContainerIdx and subscriber_id.

ContainerIdx: Preserved by pre-populating the new SharedArena with the old container mappings before import.

subscriber_id: Preserved by manually migrating subscriptions into the new SubscriberSet using a custom insert_with_id method.
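The subscriber_id half of this constraint can be sketched as follows. Everything here is an assumption made for illustration: `ToySubscriberSet`, `subscribe`, `migrate`, and `replace_in_place` are hypothetical, and `insert_with_id` only mirrors the name proposed above rather than an existing Loro method. The sketch compares a migration that keeps ids with one that hands out fresh ids.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

type Callback = Box<dyn Fn(&str) + Send>;

/// Toy subscriber set keyed by a monotonically increasing id.
#[derive(Default)]
struct ToySubscriberSet {
    next_id: usize,
    subs: BTreeMap<usize, Callback>,
}

impl ToySubscriberSet {
    /// Inserts a callback under a freshly allocated id.
    fn insert(&mut self, cb: Callback) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        self.subs.insert(id, cb);
        id
    }

    /// Hypothetical helper: inserts under a caller-chosen id and keeps
    /// `next_id` ahead of it so later inserts cannot collide.
    fn insert_with_id(&mut self, id: usize, cb: Callback) {
        self.subs.insert(id, cb);
        self.next_id = self.next_id.max(id + 1);
    }

    fn remove(&mut self, id: usize) -> bool {
        self.subs.remove(&id).is_some()
    }

    fn len(&self) -> usize {
        self.subs.len()
    }

    /// Calls every registered callback with `msg`.
    fn notify(&self, msg: &str) {
        for cb in self.subs.values() {
            cb(msg);
        }
    }
}

/// Subscribes and returns an unsubscribe closure that captures the id.
fn subscribe(set: &Arc<Mutex<ToySubscriberSet>>, cb: Callback) -> impl FnOnce() -> bool {
    let id = set.lock().unwrap().insert(cb);
    let set = Arc::clone(set);
    move || {
        let removed = set.lock().unwrap().remove(id);
        removed
    }
}

/// Moves all callbacks into a fresh set, optionally preserving their ids.
fn migrate(old: &mut ToySubscriberSet, preserve_ids: bool) -> ToySubscriberSet {
    let mut fresh = ToySubscriberSet::default();
    for (id, cb) in std::mem::take(&mut old.subs) {
        if preserve_ids {
            fresh.insert_with_id(id, cb);
        } else {
            fresh.insert(cb);
        }
    }
    fresh
}

/// Replaces the shared set's contents with a migrated copy.
fn replace_in_place(shared: &Arc<Mutex<ToySubscriberSet>>, preserve_ids: bool) {
    let mut guard = shared.lock().unwrap();
    let fresh = migrate(&mut guard, preserve_ids);
    *guard = fresh;
}
```

In this model, a handle created before the replacement still unsubscribes correctly when ids are preserved. When ids are reassigned, the same handle either finds nothing to remove or removes a different subscriber's callback.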

zxch3n · 2026-01-10T15:54:15Z

@canadaduane Although we’re using IDX here, we can convert an IDX back to an ID via the Arena. Then, when importing on the other side into a new Doc, we can convert the ID back into an IDX again.
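A rough sketch of that round trip, under the assumption that an arena is essentially a two-way mapping between ids and indices. `ToyArena`, `ToyContainerId`, `ToyContainerIdx`, and `remap` are hypothetical names for illustration, not Loro's actual arena API.

```rust
use std::collections::HashMap;

/// Toy stable container identifier (survives across documents).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ToyContainerId(String);

/// Toy arena-local index (only meaningful inside one arena).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyContainerIdx(usize);

/// Toy arena: a two-way mapping between ids and indices.
#[derive(Default)]
struct ToyArena {
    ids: Vec<ToyContainerId>,
    index_of: HashMap<ToyContainerId, usize>,
}

impl ToyArena {
    /// Returns the index for `id`, allocating a new one if needed.
    fn register(&mut self, id: &ToyContainerId) -> ToyContainerIdx {
        if let Some(&i) = self.index_of.get(id) {
            return ToyContainerIdx(i);
        }
        let i = self.ids.len();
        self.ids.push(id.clone());
        self.index_of.insert(id.clone(), i);
        ToyContainerIdx(i)
    }

    /// Converts an index back into its stable id, if the index is known.
    fn idx_to_id(&self, idx: ToyContainerIdx) -> Option<&ToyContainerId> {
        self.ids.get(idx.0)
    }
}

/// Translates an index valid in `old` into the index of the same container in `new`.
fn remap(old: &ToyArena, new: &mut ToyArena, idx: ToyContainerIdx) -> Option<ToyContainerIdx> {
    let id = old.idx_to_id(idx)?;
    Some(new.register(id))
}
```

In this model the numeric index may differ between the two arenas, but the id recovered from either side is the same, so anything keyed by index can be re-keyed during migration instead of requiring the indices to line up.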

I don’t think we need to worry too much about performance at this stage. We should prioritize correctness first. Once we actually hit performance bottlenecks, we can profile the system and then identify which parts truly need optimization. This keeps the overall logic simpler and helps us avoid premature optimization.

The first insight here is quite good — it really does require replacing things one by one.

However, there’s one part that may get tricky: how should we handle Subscriptions when they are later unsubscribed? I haven’t thought this through deeply yet. The tricky part is that the returned unsubscribe closure is controlled by the user. We may need to embed some kind of identifier inside the closure so it can be mapped or migrated into the subscription space of the new doc.

Add replace_with_shallow to LoroDoc

4f4c994
