Add LRP merge method by Tusm11 · Pull Request #677 · arcee-ai/mergekit

Tusm11 · 2026-04-02T12:42:04Z

This PR adds LRP (Layer-wise Relevance Propagation) Merge - a new merging method that uses LRP importance scores to determine which weights to keep
during model merging.

What is LRP Merge?

LRP was originally developed for interpreting neural network predictions by propagating relevance scores backward through the network. LRP Merge adapts
this technique for model merging by using LRP-computed importance scores to guide which weights should be preserved.

Algorithm

Compute task vectors: delta = fine_tuned - base
Get LRP importance scores (fall back to |delta| magnitude if not available)
Select top-k weights by importance using density parameter
Weighted average of sparse deltas
Add back to base model
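
The five steps above can be sketched in plain Python over flat lists of floats. This is a minimal illustration under stated assumptions, not the code in mergekit/merge_methods/lrp.py: the names `top_k_mask` and `lrp_merge` are hypothetical, k is taken as `int(density * n)`, and the weighted average is normalized by the sum of model weights. The PR may make different choices on each of these points.

```python
from typing import Dict, List, Optional


def top_k_mask(scores: List[float], density: float) -> List[bool]:
    """Mark the k highest-scoring positions, with k = int(density * n) (assumed rounding)."""
    k = int(density * len(scores))
    # Sorting indices (rather than thresholding) keeps exactly k elements even with ties.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    mask = [False] * len(scores)
    for i in keep:
        mask[i] = True
    return mask


def lrp_merge(
    base: List[float],
    fine_tuned: Dict[str, List[float]],
    weights: Dict[str, float],
    density: float = 0.7,
    lrp_scores: Optional[Dict[str, List[float]]] = None,
) -> List[float]:
    """Hypothetical sketch: sparsify each task vector by importance, average, add to base."""
    lrp_scores = lrp_scores or {}
    merged_delta = [0.0] * len(base)
    total_weight = sum(weights[name] for name in fine_tuned)

    for name, params in fine_tuned.items():
        # Step 1: task vector.
        delta = [p - b for p, b in zip(params, base)]
        # Step 2: LRP scores if supplied, otherwise |delta| magnitude.
        importance = lrp_scores.get(name)
        if importance is None:
            importance = [abs(d) for d in delta]
        # Step 3: keep only the top-k most important entries.
        mask = top_k_mask(importance, density)
        # Step 4: accumulate the weighted sparse delta.
        for i, (d, kept) in enumerate(zip(delta, mask)):
            if kept:
                merged_delta[i] += weights[name] * d

    # Step 5: normalize (assumption: divide by total weight) and add back to base.
    return [b + d / total_weight for b, d in zip(base, merged_delta)]


base = [0.0, 0.0, 0.0, 0.0]
models = {"a": [4.0, 1.0, 0.0, 0.0], "b": [0.0, 0.0, 2.0, -6.0]}
merged = lrp_merge(base, models, {"a": 1.0, "b": 1.0}, density=0.5)
```

Supplying `lrp_scores={"a": [0.0, 0.0, 1.0, 1.0]}` in the call would make model `a` keep positions 2 and 3 instead of its two largest-magnitude deltas, which is the difference between the LRP path and the magnitude fallback.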

Files Added

mergekit/merge_methods/lrp.py - Main LRP merge implementation
mergekit/lrp_computer.py - LRP score computation module
tests/test_lrp_merge.py - Test coverage
docs/lrp_merge.md - Detailed documentation

Parameters

density (global): Fraction of weights to retain (default: 0.7)
weight (per-model): Contribution weight for each model

Testing

All 6 tests pass including:

Basic merge with various density values
Invalid density validation (negative and > 1.0)
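
As a rough illustration of the density check those tests exercise, the sketch below rejects values outside the closed interval [0, 1]. The function name `validate_density` and the exact error type are assumptions made for this example; the PR's own validation may be expressed differently (for instance through the config schema).

```python
def validate_density(density: float) -> float:
    """Hypothetical check: density is a fraction of weights, so it must lie in [0, 1]."""
    if not 0.0 <= density <= 1.0:
        raise ValueError(f"density must be between 0 and 1, got {density}")
    return density


def is_rejected(density: float) -> bool:
    """Return True if validate_density raises for the given value."""
    try:
        validate_density(density)
    except ValueError:
        return True
    return False
```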

Note

Medium Risk
Introduces a new merge algorithm and extends merge planning/config to pass per-model LRP score paths into merge tasks, which could affect merge correctness and may surface edge cases (e.g., shape handling, unexpected extra kwargs when lrp_path is set with other methods).

Overview
Adds a new lrp merge method that sparsifies each model’s task vector using top‑k importance masks (driven by per-model LRP score files when provided, otherwise |delta| magnitude) and then performs a weighted average before adding back to the base.

Extends merge configuration and planning to accept an optional lrp_path per input model/slice and forwards a model_ref -> lrp_path mapping into the merge method. Updates documentation to describe LRP Merge and adds a CLI utility (mergekit.lrp_computer) for generating lrp_scores.pt, plus unit tests covering density parameter validation and basic merges.

Written by Cursor Bugbot for commit bfd1c89. This will update automatically on new commits.

github-actions · 2026-04-02T12:42:17Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Tusm11 · 2026-04-02T12:45:45Z

I have read the CLA Document and I hereby sign the CLA

Tusm11

I checked everything

Tusm11 · 2026-04-02T13:41:10Z

This is the new merge method I proposed.

Tusm11 · 2026-04-03T04:32:44Z

Hi, I’ve implemented a new merging method and would appreciate it if someone could approve the pending workflow so the checks can run. Happy to make any changes if needed!

cursor · 2026-04-03T10:32:58Z

 .venv
 env/
 venv/
+venv311/


Personal virtual environment path in shared gitignore

Low Severity

The entry venv311/ appears to be a developer's personal Python 3.11 virtual environment directory. The shared .gitignore already covers venv/, .venv, env/, and other standard patterns. Machine-specific paths belong in a personal global gitignore or .git/info/exclude, not the project-level .gitignore.

cursor · 2026-04-03T11:51:03Z

+        if lrp_scores:
+            make_task_kwargs["lrp_scores"] = lrp_scores
+
+        tensor_task = tensor_merge_method.make_task(**make_task_kwargs)


Passing lrp_scores kwarg breaks GTA merge methods

Medium Severity

When lrp_scores is non-empty, it gets unconditionally added to make_task_kwargs and passed to whatever merge method is in use. GeneralizedTaskArithmeticMerge.make_task (covering task_arithmetic, ties, dare_*, della_*, etc.) does not accept **kwargs, so this causes a TypeError if a user sets lrp_path in their model config while using any of those methods. The lrp_scores kwarg needs to be passed only when the active merge method is LRP.
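
One way to avoid that TypeError is to forward `lrp_scores` only to methods whose `make_task` can accept it. The sketch below uses stand-in classes (`FakeGTAMerge`, `FakeLRPMerge`) and a hypothetical helper `build_task`; none of these are mergekit's real classes or signatures. It checks the signature with `inspect`, which is one option; an explicit check that the active method is the LRP method, as the review suggests, would work equally well.

```python
import inspect
from typing import Any, Callable, Dict


def accepts_kwarg(func: Callable[..., Any], name: str) -> bool:
    """True if func has a parameter called `name` or a **kwargs catch-all."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class FakeGTAMerge:
    """Stand-in for a method whose make_task takes no lrp_scores and no **kwargs."""

    def make_task(self, *, tensors: Any, base_model: Any) -> tuple:
        return ("gta", tensors, base_model)


class FakeLRPMerge:
    """Stand-in for a method that understands lrp_scores."""

    def make_task(self, *, tensors: Any, base_model: Any, lrp_scores: Any = None) -> tuple:
        return ("lrp", tensors, base_model, lrp_scores)


def build_task(method: Any, make_task_kwargs: Dict[str, Any], lrp_scores: Dict[str, str]) -> tuple:
    """Forward lrp_scores only when the method's make_task can receive it."""
    kwargs = dict(make_task_kwargs)  # copy so the caller's dict is not mutated
    if lrp_scores and accepts_kwarg(method.make_task, "lrp_scores"):
        kwargs["lrp_scores"] = lrp_scores
    return method.make_task(**kwargs)


common = {"tensors": 1, "base_model": "base"}
scores = {"model_a": "lrp_scores.pt"}
gta_task = build_task(FakeGTAMerge(), common, scores)
lrp_task = build_task(FakeLRPMerge(), common, scores)
```

With this guard, a config that sets lrp_path while using a non-LRP method silently ignores the scores instead of crashing; whether to ignore, warn, or raise a clearer error is a design choice left to the PR.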

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

Tusm11

I think the fix is complete

Add LRP merge method

2acc1cd

cursor Bot reviewed Apr 2, 2026

Comment thread mergekit/merge_methods/lrp.py Outdated

Comment thread mergekit/lrp_computer.py

Comment thread mergekit/lrp_computer.py Outdated

Tusm11 commented Apr 2, 2026

Fix: rectify embed sizes, implement actual LRP rules, aggregate batch gradients

f80e21e

cursor Bot reviewed Apr 2, 2026

Comment thread mergekit/merge_methods/lrp.py

Comment thread mergekit/lrp_computer.py Outdated

Fix: add LRPMerge to STATIC_MERGE_METHODS, remove unused total_grad variable

cdcba3a

Tusm11 added 2 commits April 3, 2026 11:26

fix: pre-commit formatting

0b1f22c

fix: resolve lazy tensor loader compatibility and ensure all tests pass

c1969d2