Add LRP merge method #677
All contributors have signed the CLA ✍️ ✅
I have read the CLA Document and I hereby sign the CLA
This is the new merge method I proposed.
Hi, I’ve implemented a new merging method and would appreciate it if someone could approve the pending workflow so the checks can run. Happy to make any changes if needed!
.venv
env/
venv/
venv311/
Personal virtual environment path in shared gitignore
Low Severity
The entry venv311/ appears to be a developer's personal Python 3.11 virtual environment directory. The shared .gitignore already covers venv/, .venv, env/, and other standard patterns. Machine-specific paths belong in a personal global gitignore or .git/info/exclude, not the project-level .gitignore.
if lrp_scores:
    make_task_kwargs["lrp_scores"] = lrp_scores

tensor_task = tensor_merge_method.make_task(**make_task_kwargs)
Passing lrp_scores kwarg breaks GTA merge methods
Medium Severity
When lrp_scores is non-empty, it gets unconditionally added to make_task_kwargs and passed to whatever merge method is in use. GeneralizedTaskArithmeticMerge.make_task (covering task_arithmetic, ties, dare_*, della_*, etc.) does not accept **kwargs, so this causes a TypeError if a user sets lrp_path in their model config while using any of those methods. The lrp_scores kwarg needs to be passed only when the active merge method is LRP.
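One possible way to guard the call is sketched below. It is an illustration under assumptions, not the fix adopted in this PR: `accepts_kwarg`, `build_make_task_kwargs`, and the two stand-in classes are hypothetical names, and the real `make_task` signatures in mergekit may differ. The idea is to forward `lrp_scores` only when the active method's `make_task` can actually receive it.

```python
import inspect


def accepts_kwarg(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all also accepts the argument.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def build_make_task_kwargs(merge_method, base_kwargs, lrp_scores):
    """Copy base_kwargs and add lrp_scores only if make_task accepts it."""
    kwargs = dict(base_kwargs)
    if lrp_scores and accepts_kwarg(merge_method.make_task, "lrp_scores"):
        kwargs["lrp_scores"] = lrp_scores
    return kwargs


# Hypothetical stand-ins for merge methods (not the real mergekit classes).
class GtaLikeMethod:
    def make_task(self, tensors):
        return ("gta", tensors)


class LrpLikeMethod:
    def make_task(self, tensors, lrp_scores=None):
        return ("lrp", tensors, lrp_scores)


scores = {"model_a": "lrp_scores.pt"}
gta_kwargs = build_make_task_kwargs(GtaLikeMethod(), {"tensors": 1}, scores)
lrp_kwargs = build_make_task_kwargs(LrpLikeMethod(), {"tensors": 1}, scores)
```

Checking the signature keeps the planner free of method-specific branches. An explicit check that the configured method is LRP would work as well and may be easier to read.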
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).


This PR adds LRP (Layer-wise Relevance Propagation) Merge - a new merging method that uses LRP importance scores to determine which weights to keep during model merging.
What is LRP Merge?
LRP was originally developed for interpreting neural network predictions by propagating relevance scores backward through the network. LRP Merge adapts this technique for model merging by using LRP-computed importance scores to guide which weights should be preserved.
Algorithm
- Compute the task vector: delta = fine_tuned - base
- Keep the highest-scoring weights, with the retained fraction set by the density parameter
Files Added
- mergekit/merge_methods/lrp.py - Main LRP merge implementation
- mergekit/lrp_computer.py - LRP score computation module
- tests/test_lrp_merge.py - Test coverage
- docs/lrp_merge.md - Detailed documentation
Parameters
- density (global): Fraction of weights to retain (default: 0.7)
- weight (per-model): Contribution weight for each model
Testing
All 6 tests pass including:
Note
Medium Risk
Introduces a new merge algorithm and extends merge planning/config to pass per-model LRP score paths into merge tasks, which could affect merge correctness and may surface edge cases (e.g., shape handling, unexpected extra kwargs when lrp_path is set with other methods).
Overview
Adds a new lrp merge method that sparsifies each model’s task vector using top‑k importance masks (driven by per-model LRP score files when provided, otherwise |delta| magnitude) and then performs a weighted average before adding back to the base.
Extends merge configuration and planning to accept an optional lrp_path per input model/slice and forwards a model_ref -> lrp_path mapping into the merge method. Updates documentation to describe LRP Merge and adds a CLI utility (mergekit.lrp_computer) for generating lrp_scores.pt, plus unit tests covering density parameter validation and basic merges.
Written by Cursor Bugbot for commit bfd1c89. This will update automatically on new commits.
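The merge rule described in the overview above can be sketched in plain Python on flat lists of weights. This is a minimal sketch, not the code in mergekit/merge_methods/lrp.py: the helper names `top_k_mask` and `lrp_merge` are hypothetical, and both the rounding of the top-k count and the normalization by the sum of model weights are assumptions about how the "weighted average" is computed.

```python
def top_k_mask(scores, density):
    """Return a 0/1 mask keeping the int(density * n) highest-scoring entries."""
    n = len(scores)
    k = int(density * n)  # assumption: round the kept count down
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(n)]


def lrp_merge(base, models, density=0.7):
    """Merge fine-tuned weights into `base`.

    `models` is a list of (tuned_weights, weight, lrp_scores_or_None) tuples.
    """
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    total = [0.0] * len(base)
    weight_sum = 0.0
    for tuned, weight, scores in models:
        # Task vector: delta = fine_tuned - base
        delta = [t - b for t, b in zip(tuned, base)]
        # Fall back to |delta| magnitude when no LRP scores are given.
        importance = scores if scores is not None else [abs(d) for d in delta]
        mask = top_k_mask(importance, density)
        for i, (m, d) in enumerate(zip(mask, delta)):
            total[i] += weight * m * d
        weight_sum += weight
    # Weighted average of the sparsified deltas, added back to the base.
    return [b + t / weight_sum for b, t in zip(base, total)]


base = [1.0, 1.0, 1.0, 1.0]
model_a = ([5.0, 2.0, 1.0, -2.0], 1.0, None)                  # uses |delta|
model_b = ([2.0, 3.0, 1.5, 1.25], 1.0, [0.1, 0.9, 0.8, 0.2])  # uses LRP scores
merged = lrp_merge(base, [model_a, model_b], density=0.5)
```

With density 0.5, model A keeps its two largest-magnitude deltas, while model B keeps the two entries its LRP scores rank highest, even though one of them has a small delta. That difference is what separates score-driven masking from pure magnitude pruning.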