diff --git a/mkdocs.yml b/mkdocs.yml
index a024c16d..8cd3f3fb 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -57,6 +57,7 @@ nav:
- User Guides:
- Getting started: guides/index.md
- Adapter Rollout: guides/adapter-rollout.md
+ - Metrics: guides/metrics.md
- Implementer's Guide: guides/implementers.md
- Reference:
- API Reference: reference/spec.md
diff --git a/pkg/epp/metrics/README.md b/site-src/guides/metrics.md
similarity index 51%
rename from pkg/epp/metrics/README.md
rename to site-src/guides/metrics.md
index 1f68a0bd..f793734d 100644
--- a/pkg/epp/metrics/README.md
+++ b/site-src/guides/metrics.md
@@ -1,10 +1,6 @@
-# Documentation
+# Metrics
-This documentation is the current state of exposed metrics.
-
-## Table of Contents
-* [Exposed Metrics](#exposed-metrics)
-* [Scrape Metrics](#scrape-metrics)
+This guide describes the current state of exposed metrics and how to scrape them.
## Requirements
@@ -38,17 +34,17 @@ spec:
## Exposed metrics
-| Metric name | Metric Type | Description | Labels | Status |
-| ------------|--------------| ----------- | ------ | ------ |
-| inference_model_request_total | Counter | The counter of requests broken out for each model. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_request_error_total | Counter | The counter of requests errors broken out for each model. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_request_duration_seconds | Distribution | Distribution of response latency. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_request_sizes | Distribution | Distribution of request size in bytes. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_response_sizes | Distribution | Distribution of response size in bytes. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_input_tokens | Distribution | Distribution of input token count. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_model_output_tokens | Distribution | Distribution of output token count. | `model_name`=<model-name> <br> `target_model_name`=<target-model-name> | ALPHA |
-| inference_pool_average_kv_cache_utilization | Gauge | The average kv cache utilization for an inference server pool. | `name`=<inference-pool-name> | ALPHA |
-| inference_pool_average_queue_size | Gauge | The average number of requests pending in the model server queue. | `name`=<inference-pool-name> | ALPHA |
+| **Metric name** | **Metric Type** |