docs: Adding the Gateway inference support documentation for Nginx Ga… (#1789)
* docs: Adding the Gateway inference support documentation for Nginx Gateway Fabric
* docs: Addressing comments
* docs: Addressed new set of comments
* docs: Fixed the helm command
* docs: Fixed cleaned up command
* docs: Fixed cleaned up command
* docs: Adding released version
* docs: Fixing the YAML files for test failure
* docs: Fixing the HTTP YAML file for test failure
* docs: Addressing the review comments
NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.
1. Requirements
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
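As an illustrative sketch (the release version below is an assumption; pin the version your cluster supports), the CRDs can be installed with `kubectl`:

```shell
# Install the Gateway API CRDs (experimental channel shown here;
# the v1.3.0 version is illustrative -- use the release you need).
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/experimental-install.yaml

# Confirm the CRDs are present
kubectl get crds | grep gateway.networking.k8s.io
```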
Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
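One way to inspect the status (the pool name matches the InferencePool deployed in this guide):

```shell
# Look for Accepted=True and ResolvedRefs=True in the status conditions
kubectl get inferencepool vllm-llama3-8b-instruct -o yaml
```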
For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).
### Deploy InferenceObjective (Optional)
Deploy the sample InferenceObjective which allows you to specify priority of requests.
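A minimal manifest sketch; the `apiVersion` and field names follow the alpha Inference Extension API and the object name is hypothetical, so check both against the CRDs installed in your cluster:

```yaml
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceObjective
metadata:
  name: sample-objective          # hypothetical name
spec:
  priority: 10                    # relative request priority for this objective
  poolRef:
    name: vllm-llama3-8b-instruct # the InferencePool deployed earlier
```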
```bash
kubectl delete ns kgateway-system
```
=== "NGINX Gateway Fabric"
Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.
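An illustrative cleanup sequence; the release and namespace names are assumptions based on a typical install and may differ in your cluster:

```shell
# Remove the InferencePool Helm release (name is an assumption)
helm uninstall vllm-llama3-8b-instruct

# Remove NGINX Gateway Fabric and its namespace (names are assumptions)
helm uninstall ngf -n nginx-gateway
kubectl delete ns nginx-gateway
```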
This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
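As a hedged sketch of enabling this at install time (the Helm value key below is an assumption, not confirmed by this guide; check the chart's values for the exact flag):

```shell
# Hypothetical example: enable Inference Extension support when installing
# NGINX Gateway Fabric with Helm. Verify the value key against the chart.
helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric \
  --create-namespace -n nginx-gateway \
  --set nginxGateway.gwAPIInferenceExtension.enable=true
```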
105
+
106
+
91
107
### Deploy the InferencePool and Endpoint Picker Extension
92
108
93
109
Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install command automatically installs the endpoint picker and the InferencePool, along with provider-specific resources.
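A sketch of that install using the Inference Extension's `inferencepool` chart; the `provider.name` value and exact value keys are assumptions here, so confirm them against the chart's documentation:

```shell
# Install the InferencePool and endpoint picker via Helm.
# The model-server selector matches pods labeled app=vllm-llama3-8b-instruct.
helm install vllm-llama3-8b-instruct \
  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
  --set provider.name=none \
  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```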
[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source project that provides an implementation of the Gateway API using [NGINX][nginx] as the data plane. The goal of this project is to implement the core Gateway API to configure an HTTP or TCP/UDP load balancer, reverse-proxy, or API gateway for applications running on Kubernetes. You can find the comprehensive NGINX Gateway Fabric user documentation on the [NGINX Documentation][nginx-docs] website.