diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index 8b6cf443..f6a7c466 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -37,7 +37,7 @@
serving resources. Run the following:

-```console
+```bash
make environment.dev.kind
```

@@ -48,6 +48,7 @@
namespace. There are several ways to access the gateway:

**Port forward**:
+
```sh
$ kubectl --context kind-gie-dev port-forward service/inference-gateway 8080:80
```

@@ -55,6 +56,7 @@ $ kubectl --context kind-gie-dev port-forward service/inference-gateway 8080:80
**NodePort `inference-gateway-istio`**
> **Warning**: This method doesn't work on `podman` correctly, as `podman` support
> with `kind` is not fully implemented yet.
+
```sh
# Determine the k8s node address
$ kubectl --context kind-gie-dev get node -o yaml | grep address

@@ -80,9 +82,10 @@
By default the created inference gateway can be accessed on port 30080. This can
be overridden to any free port in the range of 30000 to 32767 by running the above
command as follows:

-```console
+```bash
GATEWAY_HOST_PORT=<selected-port> make environment.dev.kind
```
+
**Where:** <selected-port> is the port on your local machine you want to use to
access the inference gateway.

@@ -96,7 +99,7 @@ access the inference gateway.
To test your changes to the GIE in this environment, make your changes locally
and then run the following:

-```console
+```bash
make environment.dev.kind.update
```

@@ -122,7 +125,7 @@ the `default` namespace if the cluster is private/personal).
The following will deploy all the infrastructure-level requirements (e.g. CRDs,
Operators, etc) to support the namespace-level development environments:

-```console
+```bash
make environment.dev.kubernetes.infrastructure
```

@@ -140,7 +143,7 @@
To deploy a development environment to the cluster you'll need to explicitly
provide a namespace. This can be `default` if this is your personal cluster, but
on a shared cluster you should pick something unique. For example:

-```console
+```bash
export NAMESPACE=annas-dev-environment
```

@@ -149,10 +152,18 @@ export NAMESPACE=annas-dev-environment
Create the namespace:

-```console
+```bash
kubectl create namespace ${NAMESPACE}
```

+Set the default namespace for `kubectl` commands:
+
+```bash
+kubectl config set-context --current --namespace="${NAMESPACE}"
+```
+
+> NOTE: If you are using OpenShift (`oc` CLI), use the following instead: `oc project "${NAMESPACE}"`
+
You'll need to provide a `Secret` with the login credentials for your private
repository (e.g. quay.io). It should look something like this:

@@ -168,51 +179,115 @@ type: kubernetes.io/dockerconfigjson
Apply that to your namespace:

-```console
-kubectl -n ${NAMESPACE} apply -f secret.yaml
+```bash
+kubectl apply -f secret.yaml
```

Export the name of the `Secret` to the environment:

-```console
+```bash
export REGISTRY_SECRET=anna-pull-secret
```

-Now you need to provide several other environment variables. You'll need to
-indicate the location and tag of the `vllm-sim` image:
+Set the `VLLM_MODE` environment variable based on which version of vLLM you want to deploy:

-```console
-export VLLM_SIM_IMAGE="<YOUR_REGISTRY>/<YOUR_IMAGE>"
-export VLLM_SIM_TAG="<YOUR_TAG>"
+* `vllm-sim`: Lightweight simulator for simple environments (default).
+* `vllm`: Full vLLM model server, using GPU/CPU for inference
+* `vllm-p2p`: Full vLLM with LMCache P2P support to enable KV-Cache-aware routing
+
+```bash
+export VLLM_MODE=vllm-sim # or vllm / vllm-p2p
```

-The same thing will need to be done for the EPP:
+Set the Hugging Face token variable:

-```console
-export EPP_IMAGE="<YOUR_REGISTRY>/<YOUR_IMAGE>"
-export EPP_TAG="<YOUR_TAG>"
+```bash
+export HF_TOKEN="<HF_TOKEN>"
```

+**Warning**: For the `vllm` and `vllm-p2p` modes, the default deployment uses a Llama 3 8B model. Make sure you have permission to access these files in their respective repositories.
+
+**Note:** The model can be replaced. See [Environment Configuration](#environment-configuration) for model settings.
+
Once all this is set up, you can deploy the environment:

-```console
+```bash
make environment.dev.kubernetes
```

This will deploy the entire stack to whatever namespace you chose. You can test
by exposing the inference `Gateway` via port-forward:

-```console
-kubectl -n ${NAMESPACE} port-forward service/inference-gateway-istio 8080:80
+```bash
+kubectl port-forward service/inference-gateway 8080:80
```

And making requests with `curl`:

-```console
+**vllm-sim:**
+
+```bash
curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
-d '{"model":"food-review","prompt":"hi","max_tokens":10,"temperature":0}' | jq
```

+**vllm or vllm-p2p:**
+
+```bash
+curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: application/json' \
+ -d '{"model":"meta-llama/Llama-3.1-8B-Instruct","prompt":"hi","max_tokens":10,"temperature":0}' | jq
+```
+
+#### Environment Configuration
+
+**1. Setting the EPP image and tag:**
+
+You can optionally set a custom EPP image (otherwise, the default will be used):
+
+```bash
+export EPP_IMAGE="<YOUR_REGISTRY>/<YOUR_IMAGE>"
+export EPP_TAG="<YOUR_TAG>"
+```
+
+**2. Setting the vLLM image and tag:**
+
+Each vLLM mode has default image values, but you can override them:
+
+**For `vllm-sim` mode:**
+
+```bash
+export VLLM_SIM_IMAGE="<YOUR_REGISTRY>/<YOUR_IMAGE>"
+export VLLM_SIM_TAG="<YOUR_TAG>"
+```
+
+**For `vllm` and `vllm-p2p` modes:**
+
+```bash
+export VLLM_IMAGE="<YOUR_REGISTRY>/<YOUR_IMAGE>"
+export VLLM_TAG="<YOUR_TAG>"
+```
+
+**3. Setting the model name and label:**
+
+You can replace the model name that will be used in the system:
+
+```bash
+export MODEL_NAME="${MODEL_NAME:-mistralai/Mistral-7B-Instruct-v0.2}"
+export MODEL_LABEL="${MODEL_LABEL:-mistral7b}"
+```
+
+It is also recommended to update the inference pool name so that it aligns with the model:
+
+```bash
+export POOL_NAME="${POOL_NAME:-vllm-Mistral-7B-Instruct}"
+```
+
+**4. Additional environment settings:**
+
+More environment variable settings can be found in `scripts/kubernetes-dev-env.sh`.
+
#### Development Cycle

> **WARNING**: This is a very manual process at the moment. We expect to make

@@ -221,19 +296,19 @@ curl -s -w '\n' http://localhost:8080/v1/completions -H 'Content-Type: applicati

Make your changes locally and commit them.
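At a glance, one pass through the cycle looks like the sketch below; each step is explained in the steps that follow. The `<MY_REGISTRY>/<MY_IMAGE>` placeholders stand for your private registry, and the sketch assumes `make image-build` tags the locally built EPP image with the value of `DEV_VERSION`:

```bash
# Condensed development loop (sketch): tag by git SHA, build, push, redeploy.
export EPP_TAG=$(git rev-parse HEAD)
DEV_VERSION=$EPP_TAG make image-build

# Re-tag for your private registry and push (placeholder registry/image names).
$CONTAINER_RUNTIME tag quay.io/vllm-d/gateway-api-inference-extension/epp:$EPP_TAG \
  <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG
$CONTAINER_RUNTIME push <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG

# Point the environment at the new image and redeploy.
export EPP_IMAGE=<MY_REGISTRY>/<MY_IMAGE>
make environment.dev.kubernetes
```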
Then select an image tag based on the `git` SHA: -```console +```bash export EPP_TAG=$(git rev-parse HEAD) ``` Build the image: -```console +```bash DEV_VERSION=$EPP_TAG make image-build ``` Tag the image for your private registry and push it: -```console +```bash $CONTAINER_RUNTIME tag quay.io/vllm-d/gateway-api-inference-extension/epp:$TAG \ <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG $CONTAINER_RUNTIME push <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG @@ -245,7 +320,7 @@ $CONTAINER_RUNTIME push <MY_REGISTRY>/<MY_IMAGE>:$EPP_TAG Then you can re-deploy the environment with the new changes (don't forget all the required env vars): -```console +```bash make environment.dev.kubernetes ``` diff --git a/Makefile b/Makefile index 641d6cf6..471e95a9 100644 --- a/Makefile +++ b/Makefile @@ -784,11 +784,8 @@ environment.dev.kubernetes: check-kubectl check-kustomize check-envsubst # ------------------------------------------------------------------------------ .PHONY: clean.environment.dev.kubernetes clean.environment.dev.kubernetes: check-kubectl check-kustomize check-envsubst -ifndef NAMESPACE - $(error "Error: NAMESPACE is required but not set") -endif - @echo "INFO: cleaning up dev environment in $(NAMESPACE)" - kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst | kubectl -n "${NAMESPACE}" delete -f - + @CLEAN=true ./scripts/kubernetes-dev-env.sh 2>&1 + @echo "INFO: Finished cleanup of development environment for $(VLLM_MODE) mode in namespace $(NAMESPACE)" # ----------------------------------------------------------------------------- # TODO: these are old aliases that we still need for the moment, but will be diff --git a/deploy/components/inference-gateway/deployments.yaml b/deploy/components/inference-gateway/deployments.yaml index 0fc19d4d..afff8fd2 100644 --- a/deploy/components/inference-gateway/deployments.yaml +++ b/deploy/components/inference-gateway/deployments.yaml @@ -22,7 +22,7 @@ spec: imagePullPolicy: IfNotPresent args: - -poolName - - "vllm-llama3-8b-instruct" + - "${POOL_NAME}" - -v - "4" - --zap-encoder diff --git a/deploy/components/inference-gateway/httproutes.yaml b/deploy/components/inference-gateway/httproutes.yaml index 1115d13d..97eb2cf3 100644 --- a/deploy/components/inference-gateway/httproutes.yaml +++ b/deploy/components/inference-gateway/httproutes.yaml @@ -13,7 +13,7 @@ spec: backendRefs: - group: inference.networking.x-k8s.io kind: InferencePool - name: vllm-llama3-8b-instruct + name: ${POOL_NAME} port: 8000 timeouts: request: 30s diff --git a/deploy/components/inference-gateway/inference-models.yaml b/deploy/components/inference-gateway/inference-models.yaml index 12a51394..869be700 100644 --- a/deploy/components/inference-gateway/inference-models.yaml +++ b/deploy/components/inference-gateway/inference-models.yaml @@ -6,7 +6,17 @@ spec: modelName: food-review criticality: Critical poolRef: - name: vllm-llama3-8b-instruct + name: ${POOL_NAME} targetModels: - name: food-review weight: 100 +--- +apiVersion: inference.networking.x-k8s.io/v1alpha2 +kind: InferenceModel +metadata: + name: base-model +spec: + modelName: ${MODEL_NAME} + criticality: Critical + poolRef: + name: ${POOL_NAME} diff --git a/deploy/components/inference-gateway/inference-pools.yaml b/deploy/components/inference-gateway/inference-pools.yaml index ece6e500..3a981a14 100644 --- a/deploy/components/inference-gateway/inference-pools.yaml +++ b/deploy/components/inference-gateway/inference-pools.yaml @@ -1,10 +1,10 @@ apiVersion: inference.networking.x-k8s.io/v1alpha2 kind: InferencePool metadata: 
- name: vllm-llama3-8b-instruct + name: ${POOL_NAME} spec: targetPortNumber: 8000 selector: - app: vllm-llama3-8b-instruct + app: ${POOL_NAME} extensionRef: name: endpoint-picker diff --git a/deploy/components/vllm-p2p/kustomization.yaml b/deploy/components/vllm-p2p/kustomization.yaml new file mode 100644 index 00000000..1b4c0b28 --- /dev/null +++ b/deploy/components/vllm-p2p/kustomization.yaml @@ -0,0 +1,32 @@ +# ------------------------------------------------------------------------------ +# vLLM P2P Deployment +# +# This deploys the full vLLM model server, capable of serving real models such +# as Llama 3.1-8B-Instruct via the OpenAI-compatible API. It is intended for +# environments with GPU resources and where full inference capabilities are +# required. +# in additon it add LMcache a LLM serving engine extension using Redis to vLLM image +# +# The deployment can be customized using environment variables to set: +# - The container image and tag (VLLM_IMAGE, VLLM_TAG) +# - The model to load (MODEL_NAME) +# +# This setup is suitable for testing on Kubernetes (including +# GPU-enabled nodes or clusters with scheduling for `nvidia.com/gpu`). +# ----------------------------------------------------------------------------- +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +resources: + - vllm-deployment.yaml + - redis-deployment.yaml + - redis-service.yaml + - secret.yaml + +images: + - name: vllm/vllm-openai + newName: ${VLLM_IMAGE} + newTag: ${VLLM_TAG} + - name: redis + newName: ${REDIS_IMAGE} + newTag: ${REDIS_TAG} diff --git a/deploy/components/vllm-p2p/redis-deployment.yaml b/deploy/components/vllm-p2p/redis-deployment.yaml new file mode 100644 index 00000000..31b329e4 --- /dev/null +++ b/deploy/components/vllm-p2p/redis-deployment.yaml @@ -0,0 +1,50 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: ${REDIS_DEPLOYMENT_NAME} + labels: + app.kubernetes.io/name: redis + app.kubernetes.io/component: redis-lookup-server +spec: + replicas: 1 + selector: + matchLabels: + app.kubernetes.io/name: redis + app.kubernetes.io/component: redis-lookup-server + template: + metadata: + labels: + app.kubernetes.io/name: redis + app.kubernetes.io/component: redis-lookup-server + spec: + containers: + - name: lookup-server + image: ${REDIS_IMAGE}:${REDIS_TAG} + imagePullPolicy: IfNotPresent + command: + - redis-server + ports: + - name: redis-port + containerPort: ${REDIS_TARGET_PORT} + protocol: TCP + resources: + limits: + cpu: "4" + memory: 10G + requests: + cpu: "4" + memory: 8G + terminationMessagePath: /dev/termination-log + terminationMessagePolicy: File + restartPolicy: Always + terminationGracePeriodSeconds: 30 + dnsPolicy: ClusterFirst + securityContext: {} + schedulerName: default-scheduler + strategy: + type: RollingUpdate + rollingUpdate: + maxUnavailable: 25% + maxSurge: 25% + revisionHistoryLimit: 10 + progressDeadlineSeconds: 600 diff --git a/deploy/components/vllm-p2p/redis-service.yaml b/deploy/components/vllm-p2p/redis-service.yaml new file mode 100644 index 00000000..a5d5fd00 --- /dev/null +++ b/deploy/components/vllm-p2p/redis-service.yaml @@ -0,0 +1,17 @@ +apiVersion: v1 +kind: Service +metadata: + name: ${REDIS_SVC_NAME} + labels: + app.kubernetes.io/name: redis + app.kubernetes.io/component: redis-lookup-server +spec: + ports: + - name: lookupserver-port + protocol: TCP + port: ${REDIS_PORT} + targetPort: ${REDIS_TARGET_PORT} + type: ${REDIS_SERVICE_TYPE} + selector: + app.kubernetes.io/name: redis + app.kubernetes.io/component: redis-lookup-server 
diff --git a/deploy/components/vllm-p2p/secret.yaml b/deploy/components/vllm-p2p/secret.yaml new file mode 100644 index 00000000..23fe9473 --- /dev/null +++ b/deploy/components/vllm-p2p/secret.yaml @@ -0,0 +1,10 @@ +apiVersion: v1 +kind: Secret +metadata: + name: ${HF_SECRET_NAME} + labels: + app.kubernetes.io/name: vllm + app.kubernetes.io/component: secret +type: Opaque +data: + ${HF_SECRET_KEY}: ${HF_TOKEN} diff --git a/deploy/components/vllm-p2p/vllm-deployment.yaml b/deploy/components/vllm-p2p/vllm-deployment.yaml new file mode 100644 index 00000000..19fd59c2 --- /dev/null +++ b/deploy/components/vllm-p2p/vllm-deployment.yaml @@ -0,0 +1,118 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: ${VLLM_DEPLOYMENT_NAME} + labels: + app.kubernetes.io/name: vllm + app.kubernetes.io/model: ${MODEL_LABEL} + app.kubernetes.io/component: vllm +spec: + replicas: ${VLLM_REPLICA_COUNT} + selector: + matchLabels: + app.kubernetes.io/name: vllm + app.kubernetes.io/component: vllm + app.kubernetes.io/model: ${MODEL_LABEL} + app: ${POOL_NAME} + template: + metadata: + labels: + app.kubernetes.io/name: vllm + app.kubernetes.io/component: vllm + app.kubernetes.io/model: ${MODEL_LABEL} + app: ${POOL_NAME} + spec: + containers: + - name: vllm + image: ${VLLM_IMAGE}:${VLLM_TAG} + imagePullPolicy: IfNotPresent + command: + - /bin/sh + - "-c" + args: + - | + export LMCACHE_DISTRIBUTED_URL=$${${POD_IP}}:80 && \ + vllm serve ${MODEL_NAME} \ + --host 0.0.0.0 \ + --port 8000 \ + --enable-chunked-prefill false \ + --max-model-len ${MAX_MODEL_LEN} \ + --kv-transfer-config '{"kv_connector":"LMCacheConnector","kv_role":"kv_both"}' + ports: + - name: http + containerPort: 8000 + protocol: TCP + - name: lmcache-dist # Assuming port 80 is used for LMCACHE_DISTRIBUTED_URL + containerPort: 80 + protocol: TCP + livenessProbe: + failureThreshold: 3 + httpGet: + path: /health + port: 8000 + scheme: HTTP + initialDelaySeconds: 15 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + startupProbe: + failureThreshold: 60 + httpGet: + path: /health + port: 8000 + scheme: HTTP + initialDelaySeconds: 15 + periodSeconds: 10 + successThreshold: 1 + timeoutSeconds: 1 + env: + - name: HF_HOME + value: /data + - name: POD_IP + valueFrom: + fieldRef: + apiVersion: v1 + fieldPath: status.podIP + - name: HF_TOKEN + valueFrom: + secretKeyRef: + name: ${HF_SECRET_NAME} + key: ${HF_SECRET_KEY} + - name: LMCACHE_LOOKUP_URL + value: ${REDIS_HOST}:${REDIS_PORT} + - name: LMCACHE_ENABLE_DEBUG + value: "True" + - name: LMCACHE_ENABLE_P2P + value: "True" + - name: LMCACHE_LOCAL_CPU + value: "True" + - name: LMCACHE_MAX_LOCAL_CPU_SIZE + value: "20" + - name: LMCACHE_USE_EXPERIMENTAL + value: "True" + - name: VLLM_RPC_TIMEOUT + value: "1000000" + resources: + limits: + nvidia.com/gpu: "1" + requests: + cpu: "${VLLM_CPU_RESOURCES}" + memory: 40Gi + nvidia.com/gpu: "1" + terminationMessagePath: /dev/termination-log + terminationMessagePolicy: File + securityContext: + runAsNonRoot: false + restartPolicy: Always + terminationGracePeriodSeconds: 30 + dnsPolicy: ClusterFirst + securityContext: {} + schedulerName: default-scheduler + strategy: + type: RollingUpdate + rollingUpdate: + maxUnavailable: 0 + maxSurge: "100%" + revisionHistoryLimit: 10 + progressDeadlineSeconds: 1200 + diff --git a/deploy/components/vllm-sim/deployments.yaml b/deploy/components/vllm-sim/deployments.yaml index 4673a99c..34b742c2 100644 --- a/deploy/components/vllm-sim/deployments.yaml +++ b/deploy/components/vllm-sim/deployments.yaml @@ -3,16 +3,16 @@ kind: 
Deployment metadata: name: vllm-sim labels: - app: vllm-llama3-8b-instruct + app: ${POOL_NAME} spec: replicas: 1 selector: matchLabels: - app: vllm-llama3-8b-instruct + app: ${POOL_NAME} template: metadata: labels: - app: vllm-llama3-8b-instruct + app: ${POOL_NAME} ai-aware-router-pod: "true" spec: containers: diff --git a/deploy/components/vllm/configmap.yaml b/deploy/components/vllm/configmap.yaml new file mode 100644 index 00000000..03019ce1 --- /dev/null +++ b/deploy/components/vllm/configmap.yaml @@ -0,0 +1,14 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: lora-adapters +data: + configmap.yaml: | + vLLMLoRAConfig: + name: lora-adapters + port: 8000 + defaultBaseModel: ${MODEL_NAME} + ensureExist: + models: + - id: food-review-1 + source: Kawon/llama3.1-food-finetune_v14_r8 diff --git a/deploy/components/vllm/deployments.yaml b/deploy/components/vllm/deployments.yaml new file mode 100644 index 00000000..71eaa72c --- /dev/null +++ b/deploy/components/vllm/deployments.yaml @@ -0,0 +1,133 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: ${VLLM_DEPLOYMENT_NAME} +spec: + replicas: ${VLLM_REPLICA_COUNT} + selector: + matchLabels: + app: ${POOL_NAME} + template: + metadata: + labels: + app: ${POOL_NAME} + spec: + securityContext: + runAsUser: ${PROXY_UID} + runAsNonRoot: true + seccompProfile: + type: RuntimeDefault + containers: + - name: vllm + image: "vllm/vllm-openai:latest" + imagePullPolicy: IfNotPresent + command: ["python3", "-m", "vllm.entrypoints.openai.api_server"] + args: + - "--model" + - "${MODEL_NAME}" + - "--tensor-parallel-size" + - "1" + - "--port" + - "8000" + - "--max-num-seq" + - "1024" + - "--compilation-config" + - "3" + - "--enable-lora" + - "--max-loras" + - "2" + - "--max-lora-rank" + - "8" + - "--max-cpu-loras" + - "12" + env: + - name: VLLM_USE_V1 + value: "1" + - name: PORT + value: "8000" + - name: HUGGING_FACE_HUB_TOKEN + valueFrom: + secretKeyRef: + name: ${HF_SECRET_NAME} + key: ${HF_SECRET_KEY} + - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING + value: "true" + - name: XDG_CACHE_HOME + value: /cache + - name: HF_HOME + value: /cache/huggingface + - name: FLASHINFER_CACHE_DIR + value: /cache/flashinfer + ports: + - containerPort: 8000 + name: http + protocol: TCP + lifecycle: + preStop: + sleep: + seconds: 30 + livenessProbe: + httpGet: + path: /health + port: http + scheme: HTTP + periodSeconds: 1 + successThreshold: 1 + failureThreshold: 5 + timeoutSeconds: 1 + readinessProbe: + httpGet: + path: /health + port: http + scheme: HTTP + periodSeconds: 1 + successThreshold: 1 + failureThreshold: 1 + timeoutSeconds: 1 + startupProbe: + httpGet: + path: /health + port: http + scheme: HTTP + failureThreshold: 600 + initialDelaySeconds: 2 + periodSeconds: 1 + resources: + limits: + nvidia.com/gpu: 1 + requests: + nvidia.com/gpu: 1 + volumeMounts: + - mountPath: /cache + name: hf-cache + - mountPath: /dev/shm + name: shm + - mountPath: /adapters + name: adapters + initContainers: + - name: lora-adapter-syncer + tty: true + stdin: true + image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/lora-syncer:main + restartPolicy: Always + imagePullPolicy: Always + env: + - name: DYNAMIC_LORA_ROLLOUT_CONFIG + value: "/config/configmap.yaml" + volumeMounts: + - name: config-volume + mountPath: /config + restartPolicy: Always + enableServiceLinks: false + terminationGracePeriodSeconds: 130 + volumes: + - name: hf-cache + emptyDir: {} + - name: shm + emptyDir: + medium: Memory + - name: adapters + emptyDir: {} + - name: config-volume + 
configMap: + name: lora-adapters diff --git a/deploy/components/vllm/kustomization.yaml b/deploy/components/vllm/kustomization.yaml new file mode 100644 index 00000000..6e0da28b --- /dev/null +++ b/deploy/components/vllm/kustomization.yaml @@ -0,0 +1,36 @@ +# ------------------------------------------------------------------------------ +# vLLM Deployment +# +# This deploys the full vLLM model server, capable of serving real models such +# as Llama 3.1-8B-Instruct via the OpenAI-compatible API. It is intended for +# environments with GPU resources and where full inference capabilities are +# required. +# +# The deployment can be customized using environment variables to set: +# - The container image and tag (VLLM_IMAGE, VLLM_TAG) +# - The model to load (MODEL_NAME) +# +# This setup is suitable for testing on Kubernetes (including +# GPU-enabled nodes or clusters with scheduling for `nvidia.com/gpu`). +# ----------------------------------------------------------------------------- +kind: Kustomization + +resources: +- deployments.yaml +- secret.yaml +- configmap.yaml + + +images: +- name: vllm/vllm-openai + newName: ${VLLM_IMAGE} + newTag: ${VLLM_TAG} + +- name: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/lora-syncer + newName: ${LORA_ADAPTER_SYNCER_IMAGE} + newTag: ${LORA_ADAPTER_SYNCER_TAG} + +configMapGenerator: +- name: vllm-model-config + literals: + - MODEL_NAME=${MODEL_NAME} diff --git a/deploy/components/vllm/secret.yaml b/deploy/components/vllm/secret.yaml new file mode 100644 index 00000000..23fe9473 --- /dev/null +++ b/deploy/components/vllm/secret.yaml @@ -0,0 +1,10 @@ +apiVersion: v1 +kind: Secret +metadata: + name: ${HF_SECRET_NAME} + labels: + app.kubernetes.io/name: vllm + app.kubernetes.io/component: secret +type: Opaque +data: + ${HF_SECRET_KEY}: ${HF_TOKEN} diff --git a/deploy/environments/dev/kind-istio/patch-deployments.yaml b/deploy/environments/dev/kind-istio/patch-deployments.yaml index 874b287c..7ab6e3ad 100644 --- a/deploy/environments/dev/kind-istio/patch-deployments.yaml +++ b/deploy/environments/dev/kind-istio/patch-deployments.yaml @@ -9,7 +9,7 @@ spec: - name: epp args: - -poolName - - "vllm-llama3-8b-instruct" + - ${POOL_NAME} - -poolNamespace - "default" - -v diff --git a/deploy/environments/dev/kind-kgateway/patch-deployments.yaml b/deploy/environments/dev/kind-kgateway/patch-deployments.yaml index 874b287c..7ab6e3ad 100644 --- a/deploy/environments/dev/kind-kgateway/patch-deployments.yaml +++ b/deploy/environments/dev/kind-kgateway/patch-deployments.yaml @@ -9,7 +9,7 @@ spec: - name: epp args: - -poolName - - "vllm-llama3-8b-instruct" + - ${POOL_NAME} - -poolNamespace - "default" - -v diff --git a/deploy/environments/dev/kubernetes-istio/patch-deployments.yaml b/deploy/environments/dev/kubernetes-istio/patch-deployments.yaml index 20a17d53..a5a721b8 100644 --- a/deploy/environments/dev/kubernetes-istio/patch-deployments.yaml +++ b/deploy/environments/dev/kubernetes-istio/patch-deployments.yaml @@ -11,7 +11,7 @@ spec: - name: epp args: - -poolName - - "vllm-llama3-8b-instruct" + - ${POOL_NAME} - -poolNamespace - ${NAMESPACE} - -v diff --git a/deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml b/deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml index 3461a596..da2d91d2 100644 --- a/deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml +++ b/deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml @@ -3,7 +3,7 @@ kind: GatewayParameters metadata: name: 
custom-gw-params spec: - kube: + kube: envoyContainer: securityContext: allowPrivilegeEscalation: false @@ -11,12 +11,12 @@ spec: runAsNonRoot: true runAsUser: "${PROXY_UID}" service: - type: NodePort + type: ${GATEWAY_SERVICE_TYPE} extraLabels: gateway: custom podTemplate: extraLabels: gateway: custom - securityContext: + securityContext: seccompProfile: type: RuntimeDefault diff --git a/deploy/environments/dev/kubernetes-kgateway/kustomization.yaml b/deploy/environments/dev/kubernetes-kgateway/kustomization.yaml index 0b7e1ed8..293119e2 100644 --- a/deploy/environments/dev/kubernetes-kgateway/kustomization.yaml +++ b/deploy/environments/dev/kubernetes-kgateway/kustomization.yaml @@ -4,14 +4,11 @@ kind: Kustomization namespace: ${NAMESPACE} resources: -- ../../../components/vllm-sim/ +- secret.yaml - ../../../components/inference-gateway/ - gateway-parameters.yaml images: -- name: quay.io/vllm-d/vllm-sim - newName: ${VLLM_SIM_IMAGE} - newTag: ${VLLM_SIM_TAG} - name: quay.io/vllm-d/gateway-api-inference-extension/epp newName: ${EPP_IMAGE} newTag: ${EPP_TAG} diff --git a/deploy/environments/dev/kubernetes-kgateway/patch-deployments.yaml b/deploy/environments/dev/kubernetes-kgateway/patch-deployments.yaml index 20a17d53..00c87fbb 100644 --- a/deploy/environments/dev/kubernetes-kgateway/patch-deployments.yaml +++ b/deploy/environments/dev/kubernetes-kgateway/patch-deployments.yaml @@ -11,7 +11,7 @@ spec: - name: epp args: - -poolName - - "vllm-llama3-8b-instruct" + - ${POOL_NAME} - -poolNamespace - ${NAMESPACE} - -v @@ -22,13 +22,11 @@ spec: - "9002" - -grpcHealthPort - "9003" ---- -apiVersion: apps/v1 -kind: Deployment -metadata: - name: vllm-sim -spec: - template: - spec: - imagePullSecrets: - - name: ${REGISTRY_SECRET} + env: + - name: KVCACHE_INDEXER_REDIS_ADDR + value: ${REDIS_HOST}:${REDIS_PORT} + - name: HF_TOKEN + valueFrom: + secretKeyRef: + name: hf-token + key: ${HF_SECRET_KEY} \ No newline at end of file diff --git a/deploy/environments/dev/kubernetes-kgateway/secret.yaml b/deploy/environments/dev/kubernetes-kgateway/secret.yaml new file mode 100644 index 00000000..23fe9473 --- /dev/null +++ b/deploy/environments/dev/kubernetes-kgateway/secret.yaml @@ -0,0 +1,10 @@ +apiVersion: v1 +kind: Secret +metadata: + name: ${HF_SECRET_NAME} + labels: + app.kubernetes.io/name: vllm + app.kubernetes.io/component: secret +type: Opaque +data: + ${HF_SECRET_KEY}: ${HF_TOKEN} diff --git a/deploy/environments/dev/kubernetes-vllm/vllm-p2p/kustomization.yaml b/deploy/environments/dev/kubernetes-vllm/vllm-p2p/kustomization.yaml new file mode 100644 index 00000000..2d378312 --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm-p2p/kustomization.yaml @@ -0,0 +1,13 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +resources: +- ../../../../components/vllm-p2p/ + +images: +- name: quay.io/vllm-d/vllm-d-dev:0.0.2 + newName: ${VLLM_IMAGE} + newTag: ${VLLM_TAG} + +patches: + - path: patch-deployments.yaml diff --git a/deploy/environments/dev/kubernetes-vllm/vllm-p2p/patch-deployments.yaml b/deploy/environments/dev/kubernetes-vllm/vllm-p2p/patch-deployments.yaml new file mode 100644 index 00000000..b1afb13e --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm-p2p/patch-deployments.yaml @@ -0,0 +1,9 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: ${VLLM_DEPLOYMENT_NAME} +spec: + template: + spec: + imagePullSecrets: + - name: ${REGISTRY_SECRET} diff --git a/deploy/environments/dev/kubernetes-vllm/vllm-sim/kustomization.yaml 
b/deploy/environments/dev/kubernetes-vllm/vllm-sim/kustomization.yaml new file mode 100644 index 00000000..a45ae271 --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm-sim/kustomization.yaml @@ -0,0 +1,13 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +resources: +- ../../../../components/vllm-sim/ + +images: +- name: quay.io/vllm-d/vllm-sim + newTag: ${VLLM_SIM_TAG} + +patches: + - path: patch-deployments.yaml + diff --git a/deploy/environments/dev/kubernetes-vllm/vllm-sim/patch-deployments.yaml b/deploy/environments/dev/kubernetes-vllm/vllm-sim/patch-deployments.yaml new file mode 100644 index 00000000..dbb99b17 --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm-sim/patch-deployments.yaml @@ -0,0 +1,10 @@ + +apiVersion: apps/v1 +kind: Deployment +metadata: + name: vllm-sim +spec: + template: + spec: + imagePullSecrets: + - name: ${REGISTRY_SECRET} diff --git a/deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml b/deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml new file mode 100644 index 00000000..e512ee89 --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml @@ -0,0 +1,17 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +resources: +- ../../../../components/vllm/ + +images: +- name: quay.io/vllm-d/vllm-d-dev + newName: ${VLLM_IMAGE} + newTag: ${VLLM_TAG} + +- name: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/lora-syncer + newName: ${LORA_ADAPTER_SYNCER_IMAGE} + newTag: ${LORA_ADAPTER_SYNCER_TAG} + +patches: + - path: patch-deployments.yaml diff --git a/deploy/environments/dev/kubernetes-vllm/vllm/patch-deployments.yaml b/deploy/environments/dev/kubernetes-vllm/vllm/patch-deployments.yaml new file mode 100644 index 00000000..b1afb13e --- /dev/null +++ b/deploy/environments/dev/kubernetes-vllm/vllm/patch-deployments.yaml @@ -0,0 +1,9 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: ${VLLM_DEPLOYMENT_NAME} +spec: + template: + spec: + imagePullSecrets: + - name: ${REGISTRY_SECRET} diff --git a/scripts/kind-dev-env.sh b/scripts/kind-dev-env.sh index e40847e0..85cd988e 100755 --- a/scripts/kind-dev-env.sh +++ b/scripts/kind-dev-env.sh @@ -25,6 +25,11 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" # Set the host port to map to the Gateway's inbound port (30080) : "${GATEWAY_HOST_PORT:=30080}" +# Set the inference pool name for the deployment +export POOL_NAME="${POOL_NAME:-vllm-llama3-8b-instruct}" + +# Set the model name to deploy +export MODEL_NAME="${MODEL_NAME:-meta-llama/Llama-3.1-8B-Instruct}" # ------------------------------------------------------------------------------ # Setup & Requirement Checks # ------------------------------------------------------------------------------ @@ -113,7 +118,7 @@ kustomize build --enable-helm deploy/components/crds-kgateway | # Deploy the environment to the "default" namespace kustomize build --enable-helm deploy/environments/dev/kind-kgateway \ - | sed "s/REPLACE_NAMESPACE/${PROJECT_NAMESPACE}/gI" \ + | envsubst | sed "s/REPLACE_NAMESPACE/${PROJECT_NAMESPACE}/gI" \ | kubectl --context ${KUBE_CONTEXT} apply -f - # Wait for all control-plane pods to be ready diff --git a/scripts/kubernetes-dev-env.sh b/scripts/kubernetes-dev-env.sh index 28b84409..62027c69 100755 --- a/scripts/kubernetes-dev-env.sh +++ b/scripts/kubernetes-dev-env.sh @@ -12,18 +12,81 @@ set -eux # ------------------------------------------------------------------------------ SCRIPT_DIR="$(cd 
"$(dirname "${BASH_SOURCE[0]}")" && pwd)" - -# Set a default VLLM_SIM_IMAGE if not provided -: "${VLLM_SIM_IMAGE:=quay.io/vllm-d/vllm-sim}" - -# Set a default VLLM_SIM_TAG if not provided -: "${VLLM_SIM_TAG:=0.0.2}" - -# Set a default EPP_IMAGE if not provided -: "${EPP_IMAGE:=us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp}" - -# Set a default EPP_TAG if not provided -: "${EPP_TAG:=main}" +export CLEAN="${CLEAN:-false}" + +# Validate required inputs +if [[ -z "${NAMESPACE:-}" ]]; then + echo "ERROR: NAMESPACE environment variable is not set." + exit 1 +fi + + +# GIE Configuration +export POOL_NAME="${POOL_NAME:-vllm-llama3-8b-instruct}" +export MODEL_NAME="${MODEL_NAME:-meta-llama/Llama-3.1-8B-Instruct}" +export GATEWAY_SERVICE_TYPE="${GATEWAY_SERVICE_TYPE:-NodePort}" + +## EPP ENV VARs — currently added to all EPPs, regardless of the VLLM mode or whether they are actually needed +export REDIS_DEPLOYMENT_NAME="${REDIS_DEPLOYMENT_NAME:-lookup-server-service}" +export REDIS_SVC_NAME="${REDIS_SVC_NAME:-${REDIS_DEPLOYMENT_NAME}}" +export REDIS_HOST="${REDIS_HOST:-${REDIS_SVC_NAME}.${NAMESPACE}.svc.cluster.local}" +export REDIS_PORT="${REDIS_PORT:-8100}" +export HF_TOKEN=$(echo -n "${HF_TOKEN}" | base64 | tr -d '\n') +export HF_SECRET_NAME="${HF_SECRET_NAME:-hf-token}" +export HF_SECRET_KEY="${HF_SECRET_KEY:-token}" +# vLLM Specific Configuration node +export VLLM_MODE="${VLLM_MODE:-vllm-sim}" + +case "${VLLM_MODE}" in + vllm-sim) + export VLLM_SIM_IMAGE="${VLLM_SIM_IMAGE:-quay.io/vllm-d/vllm-sim}" + export VLLM_SIM_TAG="${VLLM_SIM_TAG:-0.0.2}" + export EPP_IMAGE="${EPP_IMAGE:-us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp}" + export EPP_TAG="${EPP_TAG:-main}" + export HF_TOKEN=$(echo -n "dummy-token" | base64 | tr -d '\n') + ;; + vllm | vllm-p2p) + # Shared across both full model modes - // TODO - make more env variables similar + # TODO: Consider unifying more environment variables for consistency and reuse + + export VOLUME_MOUNT_PATH="${VOLUME_MOUNT_PATH:-/data}" + export VLLM_REPLICA_COUNT="${VLLM_REPLICA_COUNT:-3}" + export MODEL_LABEL="${MODEL_LABEL:-llama3-8b}" + export VLLM_DEPLOYMENT_NAME="${VLLM_DEPLOYMENT_NAME:-vllm-${MODEL_LABEL}}" + + if [[ "$VLLM_MODE" == "vllm" ]]; then + export VLLM_IMAGE="${VLLM_IMAGE:-quay.io/vllm-d/vllm-d-dev}" + export VLLM_TAG="${VLLM_TAG:-0.0.2}" + export EPP_IMAGE="${EPP_IMAGE:-quay.io/vllm-d/gateway-api-inference-extension-dev}" + export EPP_TAG="${EPP_TAG:-0.0.4}" + export MAX_MODEL_LEN="${MAX_MODEL_LEN:-8192}" + export PVC_NAME="${PVC_NAME:-vllm-storage-claim}" + export LORA_ADAPTER_SYNCER_IMAGE="${LORA_ADAPTER_SYNCER_IMAGE:-us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/lora-syncer}" + export LORA_ADAPTER_SYNCER_TAG="${LORA_ADAPTER_SYNCER_TAG:-v20250425-ddc3d69}" + + elif [[ "$VLLM_MODE" == "vllm-p2p" ]]; then + export VLLM_IMAGE="${VLLM_IMAGE:-lmcache/vllm-openai}" + export VLLM_TAG="${VLLM_TAG:-2025-03-10}" + export EPP_IMAGE="${EPP_IMAGE:- quay.io/vmaroon/gateway-api-inference-extension/epp}" + export EPP_TAG="${EPP_TAG:-kv-aware}" + export MAX_MODEL_LEN="${MAX_MODEL_LEN:-32768}" + export PVC_NAME="${PVC_NAME:-vllm-p2p-storage-claim}" + export PVC_ACCESS_MODE="${PVC_ACCESS_MODE:-ReadWriteOnce}" + export PVC_SIZE="${PVC_SIZE:-10Gi}" + export PVC_STORAGE_CLASS="${PVC_STORAGE_CLASS:-standard}" + export REDIS_IMAGE="${REDIS_IMAGE:-redis}" + export REDIS_TAG="${REDIS_TAG:-7.2.3}" + export VLLM_CPU_RESOURCES="${VLLM_CPU_RESOURCES:-10}" + export 
POD_IP="POD_IP" + export REDIS_TARGET_PORT="${REDIS_TARGET_PORT:-6379}" + export REDIS_SERVICE_TYPE="${REDIS_SERVICE_TYPE:-ClusterIP}" + fi + ;; + *) + echo "ERROR: Unsupported VLLM_MODE: ${VLLM_MODE}. Must be one of: vllm-sim, vllm, vllm-p2p" + exit 1 + ;; +esac # ------------------------------------------------------------------------------ # Deployment @@ -32,18 +95,39 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" kubectl create namespace ${NAMESPACE} 2>/dev/null || true # Hack to deal with KGateways broken OpenShift support -export PROXY_UID=$(kubectl get namespace ${NAMESPACE} -o json | jq -e -r '.metadata.annotations["openshift.io/sa.scc.uid-range"]' | perl -F'/' -lane 'print $F[0]+1'); +export PROXY_UID=$(kubectl get namespace ${NAMESPACE} -o json | jq -e -r '.metadata.annotations["openshift.io/sa.scc.uid-range"]' | perl -F'/' -lane 'print $F[0]+1'); set -o pipefail -echo "INFO: Deploying Development Environment in namespace ${NAMESPACE}" - -kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst | kubectl -n ${NAMESPACE} apply -f - - -echo "INFO: Waiting for resources in namespace ${NAMESPACE} to become ready" +if [[ "$CLEAN" == "true" ]]; then + echo "INFO: ${CLEAN^^}ING environment in namespace ${NAMESPACE} for mode ${VLLM_MODE}" + kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst | kubectl -n "${NAMESPACE}" delete --ignore-not-found=true -f - + kustomize build deploy/environments/dev/kubernetes-vllm/${VLLM_MODE} | envsubst | kubectl -n "${NAMESPACE}" delete --ignore-not-found=true -f - +else + echo "INFO: Deploying vLLM Environment in namespace ${NAMESPACE}" + oc adm policy add-scc-to-user anyuid -z default -n ${NAMESPACE} + kustomize build deploy/environments/dev/kubernetes-vllm/${VLLM_MODE} | envsubst | kubectl -n "${NAMESPACE}" apply -f - + + echo "INFO: Deploying Gateway Environment in namespace ${NAMESPACE}" + kustomize build deploy/environments/dev/kubernetes-kgateway | envsubst | kubectl -n "${NAMESPACE}" apply -f - + + echo "INFO: Waiting for resources in namespace ${NAMESPACE} to become ready" + kubectl -n "${NAMESPACE}" wait deployment/endpoint-picker --for=condition=Available --timeout=60s + kubectl -n "${NAMESPACE}" wait gateway/inference-gateway --for=condition=Programmed --timeout=60s + kubectl -n "${NAMESPACE}" wait deployment/inference-gateway --for=condition=Available --timeout=60s + # Mode-specific wait + case "${VLLM_MODE}" in + vllm-sim) + kubectl -n "${NAMESPACE}" wait deployment/vllm-sim --for=condition=Available --timeout=60s + ;; + vllm) + kubectl -n "${NAMESPACE}" wait deployment/${VLLM_DEPLOYMENT_NAME} --for=condition=Available --timeout=500s + ;; + vllm-p2p) + kubectl -n "${NAMESPACE}" wait deployment/${VLLM_DEPLOYMENT_NAME} --for=condition=Available --timeout=180s + kubectl -n "${NAMESPACE}" wait deployment/${REDIS_SVC_NAME} --for=condition=Available --timeout=60s + ;; + esac +fi -kubectl -n ${NAMESPACE} wait deployment/endpoint-picker --for=condition=Available --timeout=60s -kubectl -n ${NAMESPACE} wait deployment/vllm-sim --for=condition=Available --timeout=60s -kubectl -n ${NAMESPACE} wait gateway/inference-gateway --for=condition=Programmed --timeout=60s -kubectl -n ${NAMESPACE} wait deployment/inference-gateway --for=condition=Available --timeout=60s