Pull requests: sgl-project/sglang
Pull requests list
[fix] /pause_generation and /continue_generation wrong for --tokenizer-worker-num > 1
#24445
opened May 5, 2026 by
maocheng23
Contributor
4 tasks
deepseek_v2: route mega-MoE pre-dispatch through DeepGEMM + FP4 acts opt-in
deepseek
#24444
opened May 5, 2026 by
pranjalssh
Contributor
3 tasks done
test(prefill-delayer): pin offline-gen throughput test to triton backend
#24442
opened May 5, 2026 by
YAMY1234
Contributor
2 tasks
[Docs] Add B200, GB200, GB300 NVIDIA hardware platform support for Kimi-K2.6
documentation
Improvements or additions to documentation
fix(req_pool): bump pool.size to match actual tensor row count after #24243
run-ci
#24439
opened May 5, 2026 by
JustinTong0323
Collaborator
1 of 3 tasks
Update Qwen3-Coder docs_new NVIDIA guidance
documentation
Improvements or additions to documentation
#24435
opened May 5, 2026 by
wenscarl
Collaborator
[NemotronH] Fix expert scale weight loading
#24434
opened May 5, 2026 by
chfeng-cs
5 tasks done
[model-support] Add support for Bamba
#24430
opened May 5, 2026 by
ppraneth
Contributor
5 tasks
[Fix] Stop resetting cache_hit_rate to 0 in report_decode_stats (#20451)
#24427
opened May 5, 2026 by
yangyonggit
2 tasks done
[AMD] Add MiMo-V2.5-Pro in nightly tests for MI30x and MI35x
amd
run-ci
#24426
opened May 5, 2026 by
michaelzhang-ai
Collaborator
[Fix] Guard against None logprob fields when retracted request is streamed (#23154)
#24423
opened May 5, 2026 by
yangyonggit
2 tasks done
[codex] Add diffusion performance mode defaults
diffusion
SGLang Diffusion
documentation
Improvements or additions to documentation
jit-kernel
#24419
opened May 5, 2026 by
mickqian
Collaborator
fix(gateway): use http1_only() on health check client
model-gateway
#24418
opened May 5, 2026 by
bhaktatejas922
[Fix] Fix KV transfer metrics using wrong time window
#24415
opened May 5, 2026 by
yangyonggit
7 tasks done
[vlm][pixtral] support precomputed embeddings + processor output
Multi-modal
multi-modal language model
npu
Feat/true on policy qwen moe
deterministic
Issues on deterministic inference/kernels
documentation
Improvements or additions to documentation
Multi-modal
multi-modal language model
npu
quant
LLM Quantization
#24408
opened May 5, 2026 by
maocheng23
Contributor
Draft
5 tasks
[Test] Add unit tests for srt/layers/utils/logprob.py
#24406
opened May 5, 2026 by
dsuarez01
9 of 11 tasks