
Commit 17a23e5

[fix] Fix the kind environment and set gateway service to be NodePort
Signed-off-by: Kfir Toledo <[email protected]>
1 parent 937bb50 commit 17a23e5

11 files changed: 33 additions & 28 deletions

DEVELOPMENT.md

Lines changed: 11 additions & 11 deletions
@@ -191,7 +191,7 @@ export REGISTRY_SECRET=anna-pull-secret
 
 Set the `VLLM_MODE` environment variable based on which version of vLLM you want to deploy:
 
-* `vllm-sim`: Lightweight simulator for simple environments (defult).
+* `vllm-sim`: Lightweight simulator for simple environments (default).
 * `vllm`: Full vLLM model server, using GPU/CPU for inferencing
 * `vllm-p2p`: Full vLLM with LMCache P2P support for enable KV-Cache aware routing
 
@@ -224,19 +224,19 @@ kubectl port-forward service/inference-gateway 8080:80
 
 And making requests with `curl`:
 
-- vllm-sim
+**vllm-sim:**
 
-```bash
-curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
-  -d '{"model":"food-review","prompt":"hi","max_tokens":10,"temperature":0}' | jq
-```
+```bash
+curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
+  -d '{"model":"food-review","prompt":"hi","max_tokens":10,"temperature":0}' | jq
+```
 
-- vllm or vllm-p2p
+**vllm or vllm-p2p:**
 
-```bash
-curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
-  -d '{"model":"meta-llama/Llama-3.1-8B-Instruct","prompt":"hi","max_tokens":10,"temperature":0}' | jq
-```
+```bash
+curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
+  -d '{"model":"meta-llama/Llama-3.1-8B-Instruct","prompt":"hi","max_tokens":10,"temperature":0}' | jq
+```
 
 #### Environment Configurateion
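For quick reference, the documented flow above amounts to choosing a `VLLM_MODE`, port-forwarding the gateway, and sending a completion request. A minimal smoke-test sketch, using only values that appear in the DEVELOPMENT.md diff above (the backgrounded port-forward and the `sleep` are illustrative assumptions):

```bash
# Sketch only: model name and endpoint come from the diff above.
export VLLM_MODE=vllm-sim          # or: vllm, vllm-p2p

kubectl port-forward service/inference-gateway 8080:80 &
sleep 2                            # give the port-forward a moment to bind

curl -s -w '\n' http://localhost:8080/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"food-review","prompt":"hi","max_tokens":10,"temperature":0}' | jq
```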

deploy/components/inference-gateway/deployments.yaml

Lines changed: 0 additions & 8 deletions
@@ -48,11 +48,3 @@ spec:
           service: inference-extension
         initialDelaySeconds: 5
         periodSeconds: 10
-        env:
-        - name: KVCACHE_INDEXER_REDIS_ADDR
-          value: ${REDIS_HOST}:${REDIS_PORT}
-        - name: HF_TOKEN
-          valueFrom:
-            secretKeyRef:
-              name: ${HF_SECRET_NAME}
-              key: ${HF_SECRET_KEY}

deploy/components/inference-gateway/kustomization.yaml

Lines changed: 0 additions & 2 deletions
@@ -26,8 +26,6 @@ resources:
 - deployments.yaml
 - gateways.yaml
 - httproutes.yaml
-- secret.yaml
-
 
 images:
 - name: quay.io/vllm-d/gateway-api-inference-extension/epp

deploy/components/vllm/kustomization.yaml

Lines changed: 1 addition & 1 deletion
@@ -33,4 +33,4 @@ images:
 configMapGenerator:
 - name: vllm-model-config
   literals:
-  - MODEL_NAME=${MODEL_NAME}
+  - MODEL_NAME=${MODEL_NAME}

deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ spec:
       runAsNonRoot: true
       runAsUser: "${PROXY_UID}"
     service:
-      type: LoadBalancer
+      type: ${GATEWAY_SERVICE_TYPE}
     extraLabels:
       gateway: custom
   podTemplate:
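With the service type now parameterized, the NodePort behavior named in the commit title presumably comes from exporting the variable before the manifests are rendered. A hedged sketch (the variable name is taken from the diff; NodePort is the value implied by the commit message, and 30080 is the host port mapping noted in `scripts/kind-dev-env.sh` below):

```bash
# Sketch: NodePort for kind-based dev environments, so the gateway is
# presumably reachable via the kind node's mapped host port (30080).
export GATEWAY_SERVICE_TYPE=NodePort
# A cloud environment could keep the previous behavior instead:
# export GATEWAY_SERVICE_TYPE=LoadBalancer
```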

deploy/environments/dev/kubernetes-kgateway/kustomization.yaml

Lines changed: 2 additions & 1 deletion
@@ -4,6 +4,7 @@ kind: Kustomization
 namespace: ${NAMESPACE}
 
 resources:
+- secret.yaml
 - ../../../components/inference-gateway/
 - gateway-parameters.yaml
 
@@ -14,4 +15,4 @@ images:
 
 patches:
 - path: patch-deployments.yaml
-- path: patch-gateways.yaml
+- path: patch-gateways.yaml

deploy/environments/dev/kubernetes-kgateway/patch-deployments.yaml

Lines changed: 8 additions & 0 deletions
@@ -22,3 +22,11 @@ spec:
         - "9002"
         - -grpcHealthPort
         - "9003"
+        env:
+        - name: KVCACHE_INDEXER_REDIS_ADDR
+          value: ${REDIS_HOST}:${REDIS_PORT}
+        - name: HF_TOKEN
+          valueFrom:
+            secretKeyRef:
+              name: hf-token
+              key: ${HF_SECRET_KEY}
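The patch references a Secret literally named `hf-token` while its key stays parameterized. A hypothetical way to create it, assuming `${HF_SECRET_KEY}` resolves to `token` and the Hugging Face token is held in an `HF_TOKEN` shell variable (neither value is confirmed by this diff):

```bash
# Hypothetical: create the "hf-token" Secret that the patched Deployment reads.
# The key name "token" is an assumed value for ${HF_SECRET_KEY}.
kubectl create secret generic hf-token \
  --namespace "${NAMESPACE}" \
  --from-literal=token="${HF_TOKEN}"
```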

deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ resources:
 - ../../../../components/vllm/
 
 images:
-- name: quay.io/vllm-d/vllm-d-dev:0.0.2
+- name: quay.io/vllm-d/vllm-d-dev
   newName: ${VLLM_IMAGE}
   newTag: ${VLLM_TAG}
 
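Dropping the `:0.0.2` suffix from the `name` matcher presumably lets kustomize's `images:` transformer match the base's image reference by name, regardless of tag, before applying `newName`/`newTag`. A hedged way to check the rendered output (the build path and the exported values are assumptions for illustration):

```bash
# Assumed values; confirm the image transformer rewrote the vLLM reference.
export VLLM_IMAGE="quay.io/vllm-d/vllm-d-dev"
export VLLM_TAG="0.0.2"
kustomize build deploy/environments/dev/kubernetes-vllm/vllm | envsubst | grep "image:"
```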

scripts/kind-dev-env.sh

Lines changed: 6 additions & 1 deletion
@@ -25,6 +25,11 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 # Set the host port to map to the Gateway's inbound port (30080)
 : "${GATEWAY_HOST_PORT:=30080}"
 
+# Set the inference pool name for the deployment
+export POOL_NAME="${POOL_NAME:-vllm-llama3-8b-instruct}"
+
+# Set the model name to deploy
+export MODEL_NAME="${MODEL_NAME:-meta-llama/Llama-3.1-8B-Instruct}"
 # ------------------------------------------------------------------------------
 # Setup & Requirement Checks
 # ------------------------------------------------------------------------------
@@ -113,7 +118,7 @@ kustomize build --enable-helm deploy/components/crds-kgateway |
 
 # Deploy the environment to the "default" namespace
 kustomize build --enable-helm deploy/environments/dev/kind-kgateway \
-  | sed "s/REPLACE_NAMESPACE/${PROJECT_NAMESPACE}/gI" \
+  | envsubst | sed "s/REPLACE_NAMESPACE/${PROJECT_NAMESPACE}/gI" \
   | kubectl --context ${KUBE_CONTEXT} apply -f -
 
 # Wait for all control-plane pods to be ready
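The new `envsubst` stage is what fills in the exported `POOL_NAME`, `MODEL_NAME`, and other `${VAR}` placeholders in the rendered manifests before `kubectl apply`. A small self-contained illustration of that substitution behavior (the echoed manifest line is invented for the example; the default value matches the script above):

```bash
# envsubst replaces ${VAR} references with the values of exported variables.
export MODEL_NAME="${MODEL_NAME:-meta-llama/Llama-3.1-8B-Instruct}"
echo '  - MODEL_NAME=${MODEL_NAME}' | envsubst
# prints:   - MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
```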
