Merkle tree recovery by lejeunerenard · Pull Request #767 · holepunchto/hypercore

lejeunerenard · 2026-02-04T18:30:18Z

The first step is to allow missing roots when readying a hypercore. Previously the roots were used to determine the length based on their span. If the roots are not available, the header's tree length value is used instead.
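As a rough illustration of how a length can be derived from root spans, here is a minimal sketch. It assumes the flat in-order tree layout (leaves at even indexes) and a hypothetical `roots` array in which a missing root is represented as `null`. It is not the code from this PR, and `lengthFromRoots` is a name made up for the example.

```javascript
// Sketch only: flat in-order tree arithmetic, not hypercore's actual code.

// Depth of a node: leaves (even indexes) have depth 0.
function depth (index) {
  let d = 0
  index += 1
  while (index % 2 === 0) {
    d++
    index = Math.floor(index / 2)
  }
  return d
}

// Right-most leaf (in tree index space) covered by a node.
function rightSpan (index) {
  const width = 2 ** (depth(index) + 1)
  const offset = Math.floor(index / width)
  return (offset + 1) * width - 2
}

// Hypothetical helper: `roots` holds tree indexes, with null for a root
// that could not be loaded. Falls back to the header's tree length.
function lengthFromRoots (roots, headerTreeLength) {
  const missing = roots.length === 0 || roots.some((r) => r === null)
  if (missing) return headerTreeLength
  const last = roots[roots.length - 1]
  return rightSpan(last) / 2 + 1
}
```

With roots `[3, 8]` the last root ends at tree index 8, which is block 4, so the length is 5. When any root is missing, the header value is returned unchanged.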

The general technique to recover a merkle tree node is to pass a fully remote proof from a peer that has the node to the affected peer.

A fully remote proof targeting the node (via its merkle index, which is related to but not the same as its block index) can be generated by calling:
```
const proof = await core.generateRemoteProofForTreeNode(treeNodeIndex)
```
The proof can then be verified and applied on the peer missing the node like so:
```
await core.recoverFromRemoteProof(proof)
```
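To make the difference between a block index and a merkle index concrete, the following sketch assumes the flat in-order tree layout, where block `seq` is stored at tree index `2 * seq` and odd indexes are parent nodes. The helper names are hypothetical and only illustrate the arithmetic.

```javascript
// Sketch only: assumes the flat in-order tree layout.

// Block (leaf) index -> merkle tree index.
function blockToTreeIndex (seq) {
  return 2 * seq
}

// Depth of a tree node: leaves have depth 0.
function depth (index) {
  let d = 0
  index += 1
  while (index % 2 === 0) {
    d++
    index = Math.floor(index / 2)
  }
  return d
}

// Range of blocks covered by a tree node (hypothetical helper).
function blocksCovered (treeIndex) {
  const width = 2 ** (depth(treeIndex) + 1)
  const offset = Math.floor(treeIndex / width)
  const leftSpan = offset * width
  const rightSpan = (offset + 1) * width - 2
  return { first: leftSpan / 2, last: rightSpan / 2 }
}
```

For example, tree node 5 covers blocks 2 through 3, while tree node 6 is the leaf for block 3 only.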

Since proofs include the nodes themselves, the peer without the node can write the nodes directly after verifying them. A new upgrade argument was added to proof() in fully-remote-proof.js to support overriding the default upgrade, which is for the sender's length. This allows the proof generated for recovery to target subtree roots.

Tests were added to demonstrate this flow for both root nodes and subtree root nodes. A new assertion covering the added upgrade argument was also added to the fully-remote-proof.js tests.

If the requesting core receives a data message with an upgrade proof and, while checking for a conflict, hits an error when checking the local proof, it attempts to apply the upgrade nodes from the remote locally and then rechecks. This allows cores to repair automatically when a conflict is detected in cases where the tree nodes are missing.
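The check, repair, recheck flow described above can be sketched with toy data structures. Everything here is hypothetical: `localNodes` is a plain `Map` of tree index to hash, `upgradeNodes` is an array of `{ index, hash }` objects, and verification of the remote nodes is omitted, whereas the real flow verifies them before writing.

```javascript
// Sketch only: toy stand-ins, not hypercore internals.

// Throws if any node needed for the local proof is absent.
function checkLocalProof (localNodes, requiredIndexes) {
  for (const index of requiredIndexes) {
    if (!localNodes.has(index)) throw new Error('Missing tree node ' + index)
  }
}

// Try the local check; on failure apply the remote upgrade nodes and recheck.
// NOTE: a real implementation must verify the remote nodes before writing them.
function checkWithRepair (localNodes, requiredIndexes, upgradeNodes) {
  try {
    checkLocalProof(localNodes, requiredIndexes)
    return { repaired: false }
  } catch (err) {
    for (const node of upgradeNodes) {
      if (!localNodes.has(node.index)) localNodes.set(node.index, node.hash)
    }
    // Throws again if the nodes from the remote were not enough.
    checkLocalProof(localNodes, requiredIndexes)
    return { repaired: true }
  }
}
```

If the recheck still fails, the error propagates to the caller, which corresponds to the case where the remote proof did not contain the missing nodes.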

The `repair-failed` event is helpful for logging the failure case when a proof is out of date, and is the inverse of the `repaired` event. It also helps deflake the merkle tree recovery tests, which can fail when waiting only on the `repairing` event: the tree nodes expected to be applied (because the proof passed by a peer was valid) may not yet be applied when the test checks for them.

lejeunerenard · 2026-02-05T21:10:04Z

In addition to manual fully remote proofs, merkle tree roots can be repaired automatically when a core is opened with missing root nodes and core.recoverTreeNodeFromPeers() is called while replicating. This requests an upgrade from all peers, and the core attempts to repair itself with the proof in the response.

Repair Mode

A core opened with no roots but with a header tree length, and without overwrite enabled, will enable repair mode. While in repair mode, neither appending nor truncating is supported. Once the core is repaired, it needs to be closed and reopened to be used normally.
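The condition for entering repair mode, and the guard on mutating operations, can be sketched as follows. The option names, the `assertNotRepairing` helper and the error message are hypothetical and chosen for illustration. The PR itself uses a typed ASSERTION error from hypercore-errors for the commit guard.

```javascript
// Sketch only: hypothetical shapes, not the PR's implementation.

// `roots` is the list of roots that could be loaded (empty if none).
function shouldEnterRepairMode ({ roots, headerTreeLength, overwrite }) {
  return roots.length === 0 && headerTreeLength > 0 && !overwrite
}

// Guard used before append or truncate in this toy model.
function assertNotRepairing (core, operation) {
  if (core.repairMode) {
    throw new Error('Cannot ' + operation + ' while repair mode is on')
  }
}

const toyCore = {
  repairMode: shouldEnterRepairMode({ roots: [], headerTreeLength: 5, overwrite: false })
}
```

Calling `assertNotRepairing(toyCore, 'append')` throws here, because the toy core was opened with no roots and a non-zero header tree length.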

The repair lifecycle can be tracked via the following added events:

repairing
A proof was received and is being verified before applying.
repair-failed
Repairing by applying a proof failed. This has one of two causes: the proof was invalid, or the core was still not valid after applying the proof's tree nodes.
repaired
The core was successfully repaired via a remote proof.
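A minimal sketch of tracking the repair lifecycle through these events is shown below. It uses a plain Node.js `EventEmitter` as a stand-in for the core so that the snippet runs on its own. The event names come from the list above. The `trackRepair` helper is hypothetical, and no assumptions are made about arguments passed to the listeners.

```javascript
const { EventEmitter } = require('events')

// Hypothetical helper: records the order in which repair events fire.
function trackRepair (core, log = []) {
  core.on('repairing', () => log.push('repairing'))
  core.on('repair-failed', () => log.push('repair-failed'))
  core.on('repaired', () => log.push('repaired'))
  return log
}

// Stand-in for a core; a real core would emit these events itself.
const fakeCore = new EventEmitter()
const log = trackRepair(fakeCore)

fakeCore.emit('repairing')
fakeCore.emit('repaired')
```

Waiting for either `repaired` or `repair-failed`, rather than only `repairing`, gives a point at which the outcome of the attempt is known.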

Test cases were added that showcase repairing via a remote peer and that verify appending and truncating are prevented while in repair mode.

mafintosh · 2026-02-10T11:07:36Z

lib/merkle-tree.js

    if (tx === undefined) throw INVALID_OPERATION('No database batch was passed')
+
+    if (this.session.core._repairMode) {
+      throw Error('Cannot commit while repair mode is on')


be nice to use a typed error from hypercore-errors

Used ASSERTION and changed the assert() in lib/session-state.js to use the same typed error for the same condition. This ensures it is thrown as an uncaught exception by safety-catch.

lejeunerenard added 6 commits February 3, 2026 15:42

Support loading a core w/o merkle root(s)

e1a8112

The length of the core is computed via the span of the merkle roots but is also loaded via the header's tree length.

Create proof of concept in test for restoring merkle nodes via proof

b938093

Simplify PoC for repairing w/ remote merkle proof

beedba9

Add .recoverFromRemoteProof() to load tree nodes from remote proof

2d57d08

Add .generateRemoteProofForTreeNode() to make remote proof for node

3034214

Picks the node's rightSpan as the index to get an upgrade for, so that the proof targets the given index.

Add upgrade option to fully-remote-proof's proof()

e8917b1

This allows the generated proof to target a merkle tree node even if it's a sub-root of the current roots.

lejeunerenard requested a review from mafintosh February 4, 2026 18:30

lejeunerenard added 7 commits February 4, 2026 12:31

Remove unused import for fully-remote-proof.js

f9182d7

Obligatory fix for core that isn't closed

f307a6f

Close all shared storages in merkle recovery tests on teardown

2395274

Close forgotten core on teardown in merkle recovery tests

e08bcfe

Add missing rx for retesting local proof & lint test

13fb74d

Refactor _repairTreeNodes() out into a separate method

3ddc747

mafintosh reviewed Feb 4, 2026


lejeunerenard added 3 commits February 4, 2026 17:44

Add missing await & verify proof in _repairTreeNodes()

322876f

Emit repairing & repair events when attempting repair of tree nodes

395b943

Emit repairing in the _repairTreeNodes() function

f7e3148

In case this function is used elsewhere in the future.

mafintosh reviewed Feb 5, 2026


mafintosh reviewed Feb 5, 2026


mafintosh reviewed Feb 5, 2026


lejeunerenard added 8 commits February 5, 2026 10:19

Throw if error is not INVALID_OPERATION when loading roots for core

ee66eba

Refactor getting roots into a function

2273140

Add state mutex lock when repairing merkle tree nodes

df22217

This prevents tree nodes from being modified during other merkle tree modifications. It also ensures that the checks and modifications are atomic, so that at the time of repair the tree used to verify the proof is the same tree that gets modified.

Add mutex (un)lock to recoverFromRemoteProof()

fb5b2f6

Adjust race condition w/ truncate test for repairing to use both cases

a1a9095

This makes it clearer that the check for the node is after either success or failure. Especially helpful for showing that the test fails if you remove the state mutex locks in `_repairTreeNodes()`.

Add repair mode when loading merkle tree is missing roots

28c356e

This mode makes the test for truncating race condition moot as it will throw when trying to truncate now. Appends are also protected.

Lint

7ee8e56

lejeunerenard requested a review from mafintosh February 5, 2026 21:10

Fix typo & tests checking error messages

5dd0f4a

mafintosh reviewed Feb 6, 2026


lejeunerenard added 7 commits February 6, 2026 10:07

Rename module scoped helper function w/o _

1ba2d7a

Use none of the roots if any are missing in fully remote proof verify

d0589d4

Merge branch 'main' into merkle-tree-recovery

49ba910

Disable outbound requests (besides forced req) while repairing

2eb87e5

Disabled by setting `pushOnly` mode.

Ignore msg & skip incoming data when in repair mode

ea4268f

Also do not send `sync` message as a repairing core is not a valid source for an upgrade. Set pushOnly mode as soon as repair mode is enabled.

Add pushOnly & repair mode guards to recoverFromRemoteProof()

b57c9f5

Guards prevent requests from peers sending messages that could change the merkle tree mid update. Core will also require reopening to disable the repair mode.

Skip sending blocks when in repair mode

c6342aa

mafintosh reviewed Feb 10, 2026


mafintosh reviewed Feb 10, 2026

lejeunerenard added 5 commits February 10, 2026 10:45

Cache null check as package scope function

834bf44

Use ASSERTION typed error when committing in repair mode

bda9c20

Fix lint lib/fully-remote-proof.js

82e4e5d

Fire recoverTreeNodeFromPeers() automatically when peer added

eee7850

This allows the hypercore to automatically repair itself when it enters repair mode.

Fix treeHash(seq) calls in test to use block space index

837a6d0

Was using the merkle index so was outputting the wrong hash.

mafintosh merged commit 1e2a027 into main Feb 12, 2026
5 checks passed

mafintosh deleted the merkle-tree-recovery branch February 12, 2026 22:26

Merkle tree recovery #767
mafintosh merged 37 commits into main from merkle-tree-recovery

2 participants