Update to Transformers v5 by cg123 · Pull Request #692 · arcee-ai/mergekit

cg123 · 2026-06-17T19:39:49Z

Note

High Risk
Core merge/load paths and MoE output formats changed for a major Transformers bump; incorrect conversion or optional-weight logic could silently drop or mis-merge expert weights.

Overview
Upgrades mergekit to Transformers 5 (transformers>=5.0,<6.0, peft>=0.18.0) and adds a conversion layer that uses Hugging Face’s checkpoint conversion registry to map legacy checkpoint keys to the v5 model layout (e.g. packed gate_up_proj / down_proj for MoE experts).
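As a rough illustration of what this kind of key mapping involves, the sketch below packs legacy per-expert weights into stacked `gate_up_proj` / `down_proj` entries. It is not the Hugging Face conversion registry. The helper name `pack_experts`, the legacy key pattern (`block_sparse_moe.experts.N.w1/w2/w3`), the target key names, and the gate-then-up row order are all assumptions made for illustration, and nested lists stand in for tensors.

```python
import re
from collections import defaultdict

# Assumed legacy Mixtral-style expert key pattern (illustration only).
LEGACY = re.compile(
    r"^(?P<prefix>.*)\.block_sparse_moe\.experts\.(?P<expert>\d+)"
    r"\.(?P<proj>w[123])\.weight$"
)


def pack_experts(state):
    """Pack per-expert w1/w3/w2 matrices into stacked per-layer entries.

    `state` maps key -> matrix (a list of rows). Keys that do not match the
    legacy pattern are passed through unchanged.
    """
    packed = {}
    experts = defaultdict(dict)  # layer prefix -> expert index -> {proj: matrix}
    for key, value in state.items():
        m = LEGACY.match(key)
        if m is None:
            packed[key] = value
            continue
        experts[m["prefix"]].setdefault(int(m["expert"]), {})[m["proj"]] = value

    for prefix, by_expert in experts.items():
        indices = sorted(by_expert)
        if indices != list(range(len(indices))):
            raise ValueError(f"{prefix}: expert indices are not contiguous")
        for idx in indices:
            missing = {"w1", "w2", "w3"} - by_expert[idx].keys()
            if missing:
                # Fail loudly rather than silently dropping an expert weight.
                raise ValueError(f"{prefix}: expert {idx} lacks {sorted(missing)}")
        # Assumed layout: gate (w1) rows followed by up (w3) rows, one block per expert.
        packed[f"{prefix}.mlp.experts.gate_up_proj"] = [
            by_expert[i]["w1"] + by_expert[i]["w3"] for i in indices
        ]
        packed[f"{prefix}.mlp.experts.down_proj"] = [
            by_expert[i]["w2"] for i in indices
        ]
    return packed
```

Raising on a missing or non-contiguous expert reflects the risk noted above: a conversion that quietly skips a key would drop expert weights without any visible error.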

Tensor loading and architecture inference now fall back to conversion when a weight name is missing on disk; auto-inference builds weight templates from the base model’s v5 state_dict and treats weights as present only if conversion (or a direct key) works across all merge inputs and every layer.
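The presence rule described above can be sketched in a few lines. This is a simplified model, not mergekit's loader: a checkpoint is reduced to the set of key names on disk, `resolve` and `weight_present` are hypothetical names, and the converter maps one v5 name to one legacy key, whereas a packed expert tensor would really depend on several legacy keys.

```python
from typing import Callable, Iterable, Optional, Set

# Hypothetical stand-in: maps a v5 weight name to the legacy key it is built from.
Converter = Callable[[str], Optional[str]]


def resolve(name: str, ckpt: Set[str], convert: Converter) -> Optional[str]:
    """Return the on-disk key backing `name`, or None if it is unavailable."""
    if name in ckpt:  # direct hit: the checkpoint already uses the v5 name
        return name
    legacy = convert(name)  # otherwise fall back to conversion
    if legacy is not None and legacy in ckpt:
        return legacy
    return None


def weight_present(
    template: str,
    checkpoints: Iterable[Set[str]],
    num_layers: int,
    convert: Converter,
) -> bool:
    """True only if the templated weight resolves in every input and every layer."""
    return all(
        resolve(template.format(layer=layer), ckpt, convert) is not None
        for ckpt in checkpoints
        for layer in range(num_layers)
    )
```

Under this rule a weight that is missing from a single layer of a single input counts as absent for the whole merge, which is stricter than checking the base model alone.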

Mixtral and Qwen3 MoE move from Python ModuleArchitecture classes to JSON definitions with v5 tensor names; MoE merge writers assemble per-expert sources and save via conversion instead of manual block_sparse_moe / per-expert MLP remapping. Glm4 MoE layer templates align with the same packed expert layout.
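To give a feel for what a JSON definition with templated tensor names looks like in use, here is a minimal sketch that expands per-layer name templates. The JSON below is a hypothetical fragment, not the file added in this PR: the field names (`num_layers_config_key`, `layer_templates`, `${layer_index}`) and the tensor names are assumptions chosen to resemble a packed-expert layout.

```python
import json
from string import Template

# Hypothetical architecture fragment; field and tensor names are illustrative.
ARCH_JSON = """
{
  "model_type": "mixtral",
  "num_layers_config_key": "num_hidden_layers",
  "layer_templates": {
    "weights": [
      {"name": "model.layers.${layer_index}.mlp.gate.weight"},
      {"name": "model.layers.${layer_index}.mlp.experts.gate_up_proj"},
      {"name": "model.layers.${layer_index}.mlp.experts.down_proj"}
    ]
  }
}
"""


def layer_weight_names(arch, config):
    """Expand every layer template once per layer, in layer order."""
    num_layers = config[arch["num_layers_config_key"]]
    names = []
    for layer_index in range(num_layers):
        for weight in arch["layer_templates"]["weights"]:
            names.append(
                Template(weight["name"]).substitute(layer_index=layer_index)
            )
    return names


arch = json.loads(ARCH_JSON)
names = layer_weight_names(arch, {"num_hidden_layers": 2})
```

With packed experts, each layer contributes a fixed number of names regardless of the expert count, so a declarative definition no longer needs per-expert logic.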

Smaller changes: no_init_weights replaces custom NoInit in evolve; MoE router uses BitsAndBytesConfig for 4/8-bit loads; mergekit-evolve is deprecated with a runtime warning.
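A runtime deprecation warning of the kind mentioned for mergekit-evolve could be emitted roughly as follows. This is a sketch only: the function name, the message text, and the choice of `FutureWarning` are assumptions, not the code in this PR.

```python
import warnings


def warn_evolve_deprecated():
    """Emit a one-line deprecation notice when the entry point starts."""
    # FutureWarning is shown to end users by default, whereas DeprecationWarning
    # is hidden by default unless it is triggered from code running in __main__.
    warnings.warn(
        "mergekit-evolve is deprecated and may be removed in a future release.",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
```

Calling the helper at the top of the command's entry point would surface the notice once per run without changing behavior.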

Reviewed by Cursor Bugbot for commit 0d59494. Bugbot is set up for automated code reviews on this repo.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 17f7785.

cg123 added 4 commits June 17, 2026 11:17

First pass

b31f5a5

Auto arch improvement

900e1ee

Better optional weight handling in auto arch

2bd86a3

Fix

17f7785

cursor Bot reviewed Jun 17, 2026

Comment thread mergekit/moe/mixtral.py

Add test, tweak message

0d59494

cg123 merged commit 207a692 into main Jun 17, 2026
11 checks passed

cg123 deleted the transformers-v5 branch June 17, 2026 19:52

github-actions Bot locked and limited conversation to collaborators Jun 17, 2026

cg123 merged 5 commits into main from transformers-v5
