
Commit 5b32ebc

docs: Adding the Gateway inference support documentation for Nginx Gateway Fabric (#1789)
* docs: Adding the Gateway inference support documentation for Nginx Gateway Fabric
* docs: Addressing comments
* docs: Addressed new set of comments
* docs: Fixed the helm command
* docs: Fixed cleaned up command
* docs: Fixed cleaned up command
* docs: Adding released version
* docs: Fixing the YAML files for test failure
* docs: Fixing the HTTP YAML file for test failure
* docs: Addressing the review comments
1 parent 93fca7c commit 5b32ebc

File tree

7 files changed: +237 -0 lines changed

- config/manifests/gateway/nginxgatewayfabric/gateway.yaml
- config/manifests/gateway/nginxgatewayfabric/httproute.yaml
- site-src/_includes/epp-latest.md
- site-src/_includes/epp.md
- site-src/guides/getting-started-latest.md
- site-src/guides/index.md
- site-src/implementations/gateways.md
config/manifests/gateway/nginxgatewayfabric/gateway.yaml

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    port: 80
    protocol: HTTP
```
config/manifests/gateway/nginxgatewayfabric/httproute.yaml

Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
  namespace: default
spec:
  parentRefs:
  - name: inference-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
```

site-src/_includes/epp-latest.md

Lines changed: 11 additions & 0 deletions

@@ -30,3 +30,14 @@
        --version $IGW_CHART_VERSION \
        oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
    ```

=== "NGINX Gateway Fabric"

    ```bash
    export GATEWAY_PROVIDER=none
    helm install vllm-llama3-8b-instruct \
      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
      --set provider.name=$GATEWAY_PROVIDER \
      --version $IGW_CHART_VERSION \
      oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
    ```

site-src/_includes/epp.md

Lines changed: 11 additions & 0 deletions

@@ -30,3 +30,14 @@
        --version $IGW_CHART_VERSION \
        oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
    ```

=== "NGINX Gateway Fabric"

    ```bash
    export GATEWAY_PROVIDER=none
    helm install vllm-llama3-8b-instruct \
      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
      --set provider.name=$GATEWAY_PROVIDER \
      --version $IGW_CHART_VERSION \
      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
    ```

site-src/guides/getting-started-latest.md

Lines changed: 87 additions & 0 deletions

@@ -199,6 +199,69 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
    kubectl get httproute llm-route -o yaml
    ```

=== "NGINX Gateway Fabric"

    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

    1. Requirements

        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
        - [Helm](https://helm.sh/docs/intro/install/) installed.
        - A Kubernetes cluster with LoadBalancer or NodePort access.

    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

        ```bash
        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
        ```

        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
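        As a quick sanity check (a minimal sketch, not part of the upstream steps), confirm that the controller pods in the `nginx-gateway` namespace created above are running:

        ```bash
        # All pods should reach Running before you continue
        kubectl get pods -n nginx-gateway
        ```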
218+
219+
3. Deploy the Gateway
220+
221+
```bash
222+
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
223+
```
224+
225+
4. Verify the Gateway status
226+
227+
Ensure that the Gateway is running and has been assigned an address:
228+
229+
```bash
230+
kubectl get gateway inference-gateway
231+
```
232+
233+
Check that the Gateway has been successfully provisioned and that its status shows Programmed=True
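        For a scriptable check (a sketch using standard kubectl JSONPath, not part of the original steps), read the `Programmed` condition directly:

        ```bash
        # Prints "True" once the Gateway has been provisioned
        kubectl get gateway inference-gateway \
          -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
        ```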
    5. Deploy the HTTPRoute

        Create the HTTPRoute resource to route traffic to your InferencePool:

        ```bash
        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
        ```

    6. Verify the route status

        Check that the HTTPRoute was successfully configured and its references were resolved:

        ```bash
        kubectl get httproute llm-route -o yaml
        ```

        The route status should include `Accepted=True` and `ResolvedRefs=True`.
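        Equivalently (a one-liner sketch, not part of the original steps), pull both conditions from the route's parent status; expect `True True`:

        ```bash
        kubectl get httproute llm-route \
          -o jsonpath='{.status.parents[0].conditions[?(@.type=="Accepted")].status} {.status.parents[0].conditions[?(@.type=="ResolvedRefs")].status}'
        ```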
    7. Verify the InferencePool status

        Make sure the InferencePool is active before sending traffic:

        ```bash
        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
        ```

        Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
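        To put that to the test (a sketch only; the model name and prompt below are illustrative, so substitute whatever your vLLM deployment actually serves):

        ```bash
        # Address assigned to the Gateway; port 80 matches the HTTP listener in the Gateway manifest
        IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')

        curl -i http://${IP}:80/v1/completions \
          -H 'Content-Type: application/json' \
          -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Say hello", "max_tokens": 10}'
        ```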
    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.

@@ -290,3 +353,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
    ```bash
    kubectl delete ns kgateway-system
    ```

=== "NGINX Gateway Fabric"

    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.

    1. Remove the Inference Gateway and HTTPRoute:

        ```bash
        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
        ```

    2. Uninstall NGINX Gateway Fabric:

        ```bash
        helm uninstall ngf -n nginx-gateway
        ```

    3. Clean up the namespace:

        ```bash
        kubectl delete ns nginx-gateway
        ```

site-src/guides/index.md

Lines changed: 91 additions & 0 deletions

@@ -88,6 +88,22 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
    helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
    ```

=== "NGINX Gateway Fabric"

    1. Requirements

        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
        - [Helm](https://helm.sh/docs/intro/install/) installed.
        - A Kubernetes cluster with LoadBalancer or NodePort access.

    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

        ```bash
        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
        ```

        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
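        To confirm the Inference Extension CRDs are in place (a quick sanity check, not part of the upstream steps):

        ```bash
        # Expect entries such as inferencepools.inference.networking.k8s.io
        kubectl get crd | grep inference
        ```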
### Deploy the InferencePool and Endpoint Picker Extension

Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install command automatically installs the endpoint picker and InferencePool, along with provider-specific resources.

@@ -195,6 +211,57 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
    kubectl get httproute llm-route -o yaml
    ```

=== "NGINX Gateway Fabric"

    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

    1. Deploy the Gateway:

        ```bash
        kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
        ```

    2. Verify the Gateway status

        Ensure that the Gateway is running and has been assigned an address:

        ```bash
        kubectl get gateway inference-gateway
        ```

        Check that the Gateway has been successfully provisioned and that its status shows `Programmed=True`.
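        To capture the Gateway's address for later testing (a small sketch using the standard Gateway API status fields, not part of the original steps):

        ```bash
        # Port 80 matches the HTTP listener defined in the Gateway manifest
        GW_IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
        echo "Gateway reachable at ${GW_IP}:80"
        ```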
    3. Deploy the HTTPRoute

        Create the HTTPRoute resource to route traffic to your InferencePool:

        ```bash
        kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
        ```

    4. Verify the route status

        Check that the HTTPRoute was successfully configured and its references were resolved:

        ```bash
        kubectl get httproute llm-route -o yaml
        ```

        The route status should include `Accepted=True` and `ResolvedRefs=True`.

    5. Verify the InferencePool status

        Make sure the InferencePool is active before sending traffic:

        ```bash
        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
        ```

        Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
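        With the address captured in the earlier sketch, you can send a test completion through the Gateway (the model name and prompt are illustrative; use a model your pool actually serves):

        ```bash
        # Requires GW_IP from the Gateway status check above
        curl -i http://${GW_IP}:80/v1/completions \
          -H 'Content-Type: application/json' \
          -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Say hello", "max_tokens": 10}'
        ```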
    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.

@@ -287,3 +354,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
    ```bash
    kubectl delete ns kgateway-system
    ```

=== "NGINX Gateway Fabric"

    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.

    1. Remove the Inference Gateway and HTTPRoute:

        ```bash
        kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
        kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.2/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
        ```

    2. Uninstall NGINX Gateway Fabric:

        ```bash
        helm uninstall ngf -n nginx-gateway
        ```

    3. Clean up the namespace:

        ```bash
        kubectl delete ns nginx-gateway
        ```

site-src/implementations/gateways.md

Lines changed: 9 additions & 0 deletions

@@ -9,13 +9,15 @@ This project has several implementations that are planned or in progress:
- [Istio](#istio)
- [Kgateway](#kgateway)
- [Kubvernor](#kubvernor)
- [NGINX Gateway Fabric](#nginx-gateway-fabric)

[1]: #alibaba-cloud-container-service-for-kubernetes
[2]: #envoy-ai-gateway
[3]: #google-kubernetes-engine
[4]: #istio
[5]: #kgateway
[6]: #kubvernor
[7]: #nginx-gateway-fabric

Agentgateway can run independently or can be managed by [Kgateway](https://kgateway.dev/).

@@ -98,3 +100,10 @@ Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway
[krg]: https://github.com/kubvernor/kubvernor
[krgu]: https://github.com/kubvernor/kubvernor/blob/main/README.md

## NGINX Gateway Fabric

[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source project that provides an implementation of the Gateway API using [NGINX][nginx] as the data plane. The goal of the project is to implement the core Gateway API to configure an HTTP or TCP/UDP load balancer, reverse proxy, or API gateway for applications running on Kubernetes. You can find comprehensive NGINX Gateway Fabric user documentation on the [NGINX Documentation][nginx-docs] website.

[nginx-gateway-fabric]: https://github.com/nginx/nginx-gateway-fabric
[nginx]: https://nginx.org/
[nginx-docs]: https://docs.nginx.com/nginx-gateway-fabric/
