Missing Granular Metrics for upstream_addr and upstream_response_time #14374

Open
JDarzan opened this issue Mar 19, 2025 · 0 comments


JDarzan commented Mar 19, 2025

I am using Kong API Gateway Enterprise (via Konnect Hybrid mode) and Self-Managed servers.

My goal is to monitor individual targets within an upstream to diagnose performance bottlenecks. However, I have found no native way in Kong to capture detailed metrics for each specific target within an upstream. In both Datadog and Prometheus, I can see aggregated upstream metrics, but I cannot obtain per-target granularity.

Currently, I can access metrics such as kong.http.requests.count, kong.upstream.latency.ms.bucket, kong.upstream.latency.ms.count, and kong.upstream.latency.ms.sum, as well as general latency metrics like kongdd.upstream_latency.avg and kongdd.upstream_latency.max.
However, all of these metrics describe the upstream as a whole, not each individual target. The only per-target metric available in Prometheus is kong_upstream_target_health, which reports only whether a target is healthy or unhealthy. It gives no visibility into the number of requests a target received or its individual response time.

In Kong’s access logs, both upstream_addr and upstream_response_time appear correctly, confirming that Kong knows which target was used and how long it took to respond. However, there is no native way to convert this information into metrics consumable by Datadog or Prometheus. I have attempted several approaches: modifying the Datadog plugin to include upstream_addr as a tag, creating a Kong Post-Function plugin to add an X-Upstream-Addr header to the response, and storing the information in kong.ctx.shared within a Pre-Function plugin. In every case, the values were not available in the logging phase.
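As a rough illustration of the per-target breakdown described above, here is a minimal sketch that derives request counts and response times per target from access-log lines, outside of Kong. It is not a Kong feature or plugin. The key="value" log layout and the function name aggregate_per_target are assumptions made for the example; the only thing taken from nginx itself is that upstream_addr and upstream_response_time may hold several comma-separated values when a request was retried against more than one target.

```python
import re
from collections import defaultdict

# ASSUMPTION: each access-log line carries key="value" pairs, for example
#   upstream_addr="10.0.0.1:8080" upstream_response_time="0.012"
# Adjust the pattern to whatever log_format is actually configured.
FIELD = re.compile(r'(\w+)="([^"]*)"')


def aggregate_per_target(lines):
    """Return {target_addr: {"count": int, "sum_s": float, "max_s": float}}."""
    stats = defaultdict(lambda: {"count": 0, "sum_s": 0.0, "max_s": 0.0})
    for line in lines:
        fields = dict(FIELD.findall(line))
        addrs = fields.get("upstream_addr", "").split(",")
        times = fields.get("upstream_response_time", "").split(",")
        # When a request is retried, nginx lists one address and one time
        # per attempt, so the two lists are paired positionally.
        for addr, raw_time in zip(addrs, times):
            addr, raw_time = addr.strip(), raw_time.strip()
            if not addr or addr == "-":
                continue  # request never reached an upstream target
            try:
                seconds = float(raw_time)
            except ValueError:
                continue  # "-" or malformed value
            entry = stats[addr]
            entry["count"] += 1
            entry["sum_s"] += seconds
            entry["max_s"] = max(entry["max_s"], seconds)
    return dict(stats)
```

The resulting per-target count, sum, and max could then be pushed to Datadog or exposed to Prometheus with the target address as a tag or label, which is the granularity the built-in metrics do not provide.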

Since Kong already has internal access to upstream_addr and upstream_response_time, why is it so difficult to expose them as metrics? Without this granularity, it is hard to monitor individual targets within an upstream and to pinpoint the specific instances that may be causing performance issues. Is there a technical limitation preventing this feature from being implemented, or is there a recommended way to work around the problem efficiently within Kong?
