
Commit 5b32ebc

docs: Adding the Gateway inference support documentation for Nginx Gateway Fabric (#1789)
* docs: Adding the Gateway inference support documentation for Nginx Gateway Fabric
* docs: Addressing comments
* docs: Addressed new set of comments
* docs: Fixed the helm command
* docs: Fixed cleaned up command
* docs: Fixed cleaned up command
* docs: Adding released version
* docs: Fixing the YAML files for test failure
* docs: Fixing the HTTP YAML file for test failure
* docs: Addressing the review comments
1 parent 93fca7c commit 5b32ebc

File tree

7 files changed: +237 -0 lines changed

- config/manifests/gateway/nginxgatewayfabric/gateway.yaml
- config/manifests/gateway/nginxgatewayfabric/httproute.yaml
- site-src/_includes/epp-latest.md
- site-src/_includes/epp.md
- site-src/guides/getting-started-latest.md
- site-src/guides/index.md
- site-src/implementations/gateways.md
config/manifests/gateway/nginxgatewayfabric/gateway.yaml

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    port: 80
    protocol: HTTP
```
config/manifests/gateway/nginxgatewayfabric/httproute.yaml

Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
  namespace: default
spec:
  parentRefs:
  - name: inference-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
```

site-src/_includes/epp-latest.md

Lines changed: 11 additions & 0 deletions

@@ -30,3 +30,14 @@
        --version $IGW_CHART_VERSION \
        oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
    ```

=== "NGINX Gateway Fabric"

    ```bash
    export GATEWAY_PROVIDER=none
    helm install vllm-llama3-8b-instruct \
      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
      --set provider.name=$GATEWAY_PROVIDER \
      --version $IGW_CHART_VERSION \
      oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
    ```

site-src/_includes/epp.md

Lines changed: 11 additions & 0 deletions

@@ -30,3 +30,14 @@
        --version $IGW_CHART_VERSION \
        oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
    ```

=== "NGINX Gateway Fabric"

    ```bash
    export GATEWAY_PROVIDER=none
    helm install vllm-llama3-8b-instruct \
      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
      --set provider.name=$GATEWAY_PROVIDER \
      --version $IGW_CHART_VERSION \
      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
    ```

site-src/guides/getting-started-latest.md

Lines changed: 87 additions & 0 deletions

@@ -199,6 +199,69 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
    kubectl get httproute llm-route -o yaml
    ```

=== "NGINX Gateway Fabric"

    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

    1. Requirements

        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
        - [Helm](https://helm.sh/docs/intro/install/) installed.
        - A Kubernetes cluster with LoadBalancer or NodePort access.

    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

        ```bash
        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
        ```

        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
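        As a quick sanity check (a minimal sketch, not part of the upstream steps), confirm that the controller pods in the `nginx-gateway` namespace created above are running:

        ```bash
        # All pods should reach Running before you continue
        kubectl get pods -n nginx-gateway
        ```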
218+
219+
3. Deploy the Gateway
220+
221+
```bash
222+
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
223+
```
224+
225+
4. Verify the Gateway status
226+
227+
Ensure that the Gateway is running and has been assigned an address:
228+
229+
```bash
230+
kubectl get gateway inference-gateway
231+
```
232+
233+
Check that the Gateway has been successfully provisioned and that its status shows Programmed=True
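        For a scriptable check (a sketch using standard kubectl JSONPath, not part of the original steps), read the `Programmed` condition directly:

        ```bash
        # Prints "True" once the Gateway has been provisioned
        kubectl get gateway inference-gateway \
          -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
        ```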
    5. Deploy the HTTPRoute

        Create the HTTPRoute resource to route traffic to your InferencePool:

        ```bash
        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
        ```

    6. Verify the route status

        Check that the HTTPRoute was successfully configured and its references were resolved:

        ```bash
        kubectl get httproute llm-route -o yaml
        ```

        The route status should include `Accepted=True` and `ResolvedRefs=True`.
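        Equivalently (a one-liner sketch, not part of the original steps), pull both conditions from the route's parent status; expect `True True`:

        ```bash
        kubectl get httproute llm-route \
          -o jsonpath='{.status.parents[0].conditions[?(@.type=="Accepted")].status} {.status.parents[0].conditions[?(@.type=="ResolvedRefs")].status}'
        ```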
    7. Verify the InferencePool status

        Make sure the InferencePool is active before sending traffic:

        ```bash
        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
        ```

        Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
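        To put that to the test (a sketch only; the model name and prompt below are illustrative, so substitute whatever your vLLM deployment actually serves):

        ```bash
        # Address assigned to the Gateway; port 80 matches the HTTP listener in the Gateway manifest
        IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')

        curl -i http://${IP}:80/v1/completions \
          -H 'Content-Type: application/json' \
          -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Say hello", "max_tokens": 10}'
        ```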
    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.

@@ -290,3 +353,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
    ```bash
    kubectl delete ns kgateway-system
    ```

=== "NGINX Gateway Fabric"

    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.

    1. Remove the Inference Gateway and HTTPRoute:

        ```bash
        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
        ```

    2. Uninstall NGINX Gateway Fabric:

        ```bash
        helm uninstall ngf -n nginx-gateway
        ```

    3. Clean up the namespace:

        ```bash
        kubectl delete ns nginx-gateway
        ```

site-src/guides/index.md

Lines changed: 91 additions & 0 deletions

@@ -88,6 +88,22 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
    helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
    ```

=== "NGINX Gateway Fabric"

    1. Requirements

        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
        - [Helm](https://helm.sh/docs/intro/install/) installed.
        - A Kubernetes cluster with LoadBalancer or NodePort access.

    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

        ```bash
        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
        ```

        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
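        To confirm the Inference Extension CRDs are in place (a quick sanity check, not part of the upstream steps):

        ```bash
        # Expect entries such as inferencepools.inference.networking.k8s.io
        kubectl get crd | grep inference
        ```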
### Deploy the InferencePool and Endpoint Picker Extension

Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install command automatically installs the endpoint picker and InferencePool, along with provider-specific resources.

@@ -195,6 +211,57 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
    kubectl get httproute llm-route -o yaml
    ```

=== "NGINX Gateway Fabric"

    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

    1. Deploy the Gateway:

        ```bash
        kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
        ```

    2. Verify the Gateway status

        Ensure that the Gateway is running and has been assigned an address:

        ```bash
        kubectl get gateway inference-gateway
        ```

        Check that the Gateway has been successfully provisioned and that its status shows `Programmed=True`.
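        To capture the Gateway's address for later testing (a small sketch using the standard Gateway API status fields, not part of the original steps):

        ```bash
        # Port 80 matches the HTTP listener defined in the Gateway manifest
        GW_IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
        echo "Gateway reachable at ${GW_IP}:80"
        ```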
    3. Deploy the HTTPRoute

        Create the HTTPRoute resource to route traffic to your InferencePool:

        ```bash
        kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
        ```

    4. Verify the route status

        Check that the HTTPRoute was successfully configured and its references were resolved:

        ```bash
        kubectl get httproute llm-route -o yaml
        ```

        The route status should include `Accepted=True` and `ResolvedRefs=True`.

    5. Verify the InferencePool status

        Make sure the InferencePool is active before sending traffic:

        ```bash
        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
        ```

        Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
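        With the address captured in the earlier sketch, you can send a test completion through the Gateway (the model name and prompt are illustrative; use a model your pool actually serves):

        ```bash
        # Requires GW_IP from the Gateway status check above
        curl -i http://${GW_IP}:80/v1/completions \
          -H 'Content-Type: application/json' \
          -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Say hello", "max_tokens": 10}'
        ```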
    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.

@@ -287,3 +354,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
    ```bash
    kubectl delete ns kgateway-system
    ```

=== "NGINX Gateway Fabric"

    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.

    1. Remove the Inference Gateway and HTTPRoute:

        ```bash
        kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
        kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
        ```

    2. Uninstall NGINX Gateway Fabric:

        ```bash
        helm uninstall ngf -n nginx-gateway
        ```

    3. Clean up the namespace:

        ```bash
        kubectl delete ns nginx-gateway
        ```

site-src/implementations/gateways.md

Lines changed: 9 additions & 0 deletions

@@ -9,13 +9,15 @@ This project has several implementations that are planned or in progress:
- [Istio](#istio)
- [Kgateway](#kgateway)
- [Kubvernor](#kubvernor)
- [NGINX Gateway Fabric](#nginx-gateway-fabric)

[1]: #alibaba-cloud-container-service-for-kubernetes
[2]: #envoy-ai-gateway
[3]: #google-kubernetes-engine
[4]: #istio
[5]: #kgateway
[6]: #kubvernor
[7]: #nginx-gateway-fabric

Agentgateway can run independently or can be managed by [Kgateway](https://kgateway.dev/).

@@ -98,3 +100,10 @@ Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway
[krg]: https://github.com/kubvernor/kubvernor
[krgu]: https://github.com/kubvernor/kubvernor/blob/main/README.md

## NGINX Gateway Fabric

[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source project that provides an implementation of the Gateway API using [NGINX][nginx] as the data plane. The goal of the project is to implement the core Gateway API to configure an HTTP or TCP/UDP load balancer, reverse proxy, or API gateway for applications running on Kubernetes. You can find comprehensive NGINX Gateway Fabric user documentation on the [NGINX Documentation][nginx-docs] website.

[nginx-gateway-fabric]: https://github.com/nginx/nginx-gateway-fabric
[nginx]: https://nginx.org/
[nginx-docs]: https://docs.nginx.com/nginx-gateway-fabric/
