From add408a9e2e76939bc69360d2aa2e6940d069cab Mon Sep 17 00:00:00 2001 From: Bihan Rana Date: Mon, 24 Nov 2025 18:08:16 +0545 Subject: [PATCH 1/3] Add dstack install method in docs Updated the dstack section Fix trailing whitespaces Fix missing backslash --- docs/get_started/install.md | 78 ++++++++++++++++++++++++++++++++++++- 1 file changed, 77 insertions(+), 1 deletion(-) diff --git a/docs/get_started/install.md b/docs/get_started/install.md index 259e4b646a70..b82efc3c6486 100644 --- a/docs/get_started/install.md +++ b/docs/get_started/install.md @@ -154,7 +154,83 @@ sky status --endpoint 30000 sglang -## Method 7: Run on AWS SageMaker +## Method 7: Using dstack + +
+More + +[dstack](https://github.com/dstackai/dstack) simplifies GPU provisioning and workload orchestration across clouds, Kubernetes, and on-prem systems. + +Deploying SGLang as a secure, auto-scalable endpoint is straightforward: + +1. Install dstack: see [dstack's documentation](https://dstack.ai/docs/installation/) +2. Create a dstack [service](https://dstack.ai/docs/concepts/services/): + +
+Service configuration: service.yaml + +```yaml +type: service +name: qwen + +image: lmsysorg/sglang:latest +env: + - MODEL_ID=qwen/qwen2.5-0.5b-instruct +commands: + - | + python3 -m sglang.launch_server \ + --model-path $MODEL_ID \ + --port 8000 \ + --trust-remote-code +port: 8000 +model: qwen/qwen2.5-0.5b-instruct + +resources: + gpu: 8GB..24GB:1 +``` +
+ +Apply the configuration: + +```bash +HF_TOKEN= dstack apply -f service.yaml +``` + +3. If you want to enable auto-scaling, cache-aware routing, HTTPS, or bring your own custom domain, +create a [gateway](https://dstack.ai/docs/concepts/gateways/): + +
+Gateway configuration: gateway.yaml + +```yaml +type: gateway +name: sglang-gateway + +backend: aws +region: eu-west-1 + +# Specify your domain +domain: example.com + +router: + # (Optional) Enable cache-aware routing + type: sglang + policy: cache_aware +``` +
+ +Apply the gateway configuration. + +```bash +dstack apply -f gateway.yaml +``` + +Once the gateway is assigned a hostname, go to your domain's DNS settings and add a DNS record for `*.`. + +See the [SGLang example](https://dstack.ai/examples/inference/sglang/) for more details. +
+ +## Method 8: Run on AWS SageMaker
More From b179c1dcccdc86b29f59a8a8710e16cd33e58658 Mon Sep 17 00:00:00 2001 From: Bihan Rana Date: Tue, 25 Nov 2025 18:47:04 +0545 Subject: [PATCH 2/3] Add HF_TOKEN --- docs/get_started/install.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/get_started/install.md b/docs/get_started/install.md index b82efc3c6486..0137110cdbcd 100644 --- a/docs/get_started/install.md +++ b/docs/get_started/install.md @@ -175,6 +175,7 @@ name: qwen image: lmsysorg/sglang:latest env: + - HF_TOKEN - MODEL_ID=qwen/qwen2.5-0.5b-instruct commands: - | From 8f2aa91bef2d013a42cb00e772586418eec51403 Mon Sep 17 00:00:00 2001 From: Bihan Rana Date: Wed, 4 Feb 2026 17:22:11 +0545 Subject: [PATCH 3/3] Add service endpoint URL --- docs/get_started/install.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/get_started/install.md b/docs/get_started/install.md index 0137110cdbcd..fd04110e8610 100644 --- a/docs/get_started/install.md +++ b/docs/get_started/install.md @@ -226,7 +226,9 @@ Apply the gateway configuration. dstack apply -f gateway.yaml ``` -Once the gateway is assigned a hostname, go to your domain's DNS settings and add a DNS record for `*.`. +Once the gateway is assigned a hostname, go to your domain's DNS settings and add `A` record so that `*.` points to `` + +You can then access the service endpoint at `https://./` See the [SGLang example](https://dstack.ai/examples/inference/sglang/) for more details.