gcanlin/vllm-study

vLLM Study

Static study notes for vLLM internals.

The site is published by GitHub Actions to GitHub Pages from the site/ directory.

Current pages:

  • DeepSeek V3 FusedMoE, EP, large EP, and EPLB notes: site/index.html
  • Qwen3 GPUModelRunner notes: site/qwen3_model_runner.html
  • Qwen3 model structure notes: site/qwen3_structure.html
  • Qwen3-Omni streaming audio/text/video input walkthrough: site/qwen3_omni_streaming_inputs.html
  • vLLM Context Parallel notes: site/vllm_cp_pcp_dcp.html
  • vLLM-Omni Hunyuan-Image3.0 and BAGEL architecture notes: site/vllm_omni_hunyuan_bagel_arch.html
  • vLLM Prefix Cache notes: site/vllm_prefix_cache.html
  • vLLM-Omni streaming input/output and async_chunk notes: site/vllm_omni_streaming.html
  • vLLM Sampler notes: site/vllm_sampler.html

Skills:

  • Feature walkthrough article workflow: skills/vllm-study-feature-walkthrough/SKILL.md
