-
-
Notifications
You must be signed in to change notification settings - Fork 119
Add replace_with_shallow to LoroDoc #896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4f4c994051
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /// Replace the current document state with a shallow snapshot at the given frontiers. | ||
| /// | ||
| /// This method trims the history in place, preserving subscriptions and configuration. | ||
| #[wasm_bindgen(js_name = "replaceWithShallow")] | ||
| pub fn replace_with_shallow(&self, frontiers: Vec<JsID>) -> JsResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flush pending WASM events for replaceWithShallow
This new WASM-exposed method mutates document state and can enqueue subscription callbacks, but it isn’t added to the JS-side decorateMethods allowlist that calls callPendingEvents() in a finally block. With active subscriptions, calling replaceWithShallow can therefore trigger [LORO_INTERNAL_ERROR] Event not called and drop callbacks until another decorated method is invoked. Please add the JS name (replaceWithShallow) to the allowlist in crates/loro-wasm/index.ts to preserve the pending-event invariant.
Useful? React with 👍 / 👎.
| let mut state = self.state.lock().unwrap(); | ||
| let mut new_state = temp_doc.state.lock().unwrap(); | ||
|
|
||
| std::mem::swap(&mut state.frontiers, &mut new_state.frontiers); | ||
| std::mem::swap(&mut state.store, &mut new_state.store); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep ContainerStore arena aligned after replace_with_shallow
replace_with_shallow swaps state.store with the temporary document’s store but leaves state.arena untouched, after already swapping inner arena contents between self.arena and new_arena. Since ContainerStore owns its own SharedArena, the doc now has two different arena instances (state.arena vs state.store.arena). Subsequent container registrations and lookups can update only one arena, causing container index mismatches and incorrect resolution after replace_with_shallow (e.g., creating new containers or resolving paths). Consider swapping/updating the store’s arena to match state.arena or rebuilding the store in-place.
Useful? React with 👍 / 👎.
canadaduane
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple of explanatory comments.
| } | ||
|
|
||
| #[test] | ||
| #[serial_test::serial] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because the memory test was failing when run in parallel, due to memory allocated by parallel tests.
| /// and unreliable for same-thread lock detection | ||
| /// - Internal callers in `shallow_snapshot.rs` legitimately call this without | ||
| /// holding the txn lock directly, but are protected at the public API level | ||
| /// by `with_barrier()` in methods like `export()` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note: this comment needs review. It was generated by Claude, and I believe it is correct, but without expertise in this library I am not 100% sure it is correct. The reason for removing the assert!(self.txn.is_locked()) assertion below is that _checkout_without_emitting does not need to be called within a lock if it is during replace_with_shallow.
| self._apply_diff(diff, &mut Default::default(), false) | ||
| } | ||
|
|
||
| pub fn replace_with_shallow(&self, frontiers: &Frontiers) -> LoroResult<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea for this part of the implementation is that internally we can first directly call the ExposedShallowSnapshot API to export a snapshot, and then import it into a new document. The new document can be created directly via the FromSnapshot method.
The subsequent steps would include:
-
Migrate metadata and state:
- (a) Migrate subscriptions, events, and similar components from the old document to the new one.
- (b) Assign the Peer ID to the new document.
- (c) Migrate all relevant state to the new document.
-
Replace pointers:
- (a) After the state migration is complete, directly replace the old document’s internal pointer with the pointer from the new document.
- (b) From that point on, all state references reuse the new document, and the old internal logic of the original document can be entirely discarded.
The benefits of this approach are:
- Stronger overall correctness guarantees: we don’t need to introduce extensive internal logic changes.
- Reduced maintenance burden: if we modify too much internal logic, every future optimization would require re-examining this area to ensure it hasn’t been broken by new changes.
- Built on public APIs: most of the logic in this approach is based on public interfaces.
At the moment, the only real risk lies in the step mentioned above that migrates existing events. To handle this, we would need an additional API, something like replace_doc_inplace. As long as we can guarantee the correctness of this function, we can guarantee the correctness of the entire feature.
Overall, this approach should be more stable and easier to use, and it also gives us a function that we can reuse in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correctness is the highest priority, so I will take your advice. But I worry a little bit about speed--is creating a new document going to be as fast as mem::swap on the arena?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional quirk to address: my understanding of the Subscription closures is that they include the ContainerIdx (NOT the ContainerID) and therefore the indexes in the Arena have to line up perfectly if we want to keep the original Subscriptions the user has available. Unless there's a way around this?
|
@zxch3n Can you check my knowledge and assumptions here (based on convo with Claude)? I've tried to summarize these as concisely as possible. I think if we're on the same page with these two, I can implement the change per your suggestions.
|
|
@canadaduane Although we’re using IDX here, we can convert an IDX back to an ID via the Arena. Then, when importing on the other side into a new Doc, we can convert the ID back into an IDX again. I don’t think we need to worry too much about performance at this stage. We should prioritize correctness first. Once we actually hit performance bottlenecks, we can profile the system and then identify which parts truly need optimization. This keeps the overall logic simpler and helps us avoid premature optimization. The first insight here is quite good — it really does require replacing things one by one. However, there’s one part that may get tricky: how should we handle Subscriptions when they are later unsubscribed? I haven’t thought this through deeply yet. The tricky part is that the returned unsubscribe closure is controlled by the user. We may need to embed some kind of identifier inside the closure so it can be mapped or migrated into the subscription space of the new doc. |
This PR adds
replace_with_shallow, a way to modify a LoroDoc and "trim in place", discarding old operations up to a Frontier.