storage: wal fsync latency histogram different between prometheus and db console #115825

kvoli · 2023-12-07T21:09:08Z

Describe the problem

The storage.wal.fsync.latency metric is different when viewed in grafana (querying prometheus) vs DB Console. The DB Console chart appears to be incorrect as the values are overly smoothed.

To Reproduce

export cluster=austen-fsync-latency-hist-repro
roachprod create $cluster -n 3 --local-ssd=false
roachprod stage $cluster release v23.2.0-beta.1
roachprod start $cluster
roachprod run $cluster -- './cockroach workload run kv --min-block-bytes=4048 --max-block-bytes=4048'
# - open up db-console page and navigate to custom metrics
# - check the custom metric `storage.wal.fsync.latency-p99`
# - check grafana:
histogram_quantile(0.99, sum by (instance,le) (rate(storage_wal_fsync_latency_bucket[15s])))

Expected behavior

DB Console and prometheus are comparable.

Additional data / screenshots

Environment:

CockroachDB version v23.2.0-beta.1
Server OS: Ubuntu
Client app KV workload

Additional context

Fsync metric on DBConsole is not usable. This is unfortunate as it is a relied upon signal for disk saturation.

Jira issue: CRDB-34231

The text was updated successfully, but these errors were encountered:

abarganier · 2023-12-13T18:30:45Z

I was poking around the Histogram code today to do some optimizing for allocations, and I noticed some weirdness that led me here.

I believe #104088 broke the ManualWindowHistogram, which is what's causing this.

When grabbing a window to calculate a quantile against, we are using the cumulative histogram. Shouldn't we just be using the current one?

cockroach/pkg/util/metric/metric.go

Line 669 in 50a47aa

if err := mwh.mu.cum.Write(cur); err != nil {

As of the linked PR, the cur window that's part of the ManualWindowHistogram is written to, but never used.

aadityasondhi · 2023-12-13T19:55:31Z

Yep, I noticed those things too. Putting up a fix soon.

These histograms were incorrectly not setting `mwh.prev`. The `ToPrometheusMetric` was also merging the `cum` with the `prev`. Instead, it should have been using `cur`. Since both `Update()` and `Rotate()` set `cur` to be the correct current window already. Informs cockroachdb#115825. Release note: None

aadityasondhi · 2023-12-20T20:51:11Z

This bug was introduced in 4a2d06a and then was exemplified by some of the changes in bae5045.

There are two main problems (one from each):

The original commit was not windowing the newly introduced histogram using the same logic as the default one does. We have a prev and cur windowed histogram that cover the duration of the histogram. At every tick, we rotate them and we use a merged view to calculate derivative statistics. This is the same as using rate over 1m in Grafana.
The next change incorrectly was merging the cumulative and prev histogram. This is incorrect for obvious reasons and surfaced the problems that we see above.

In addition, both changes were missing tests for the windowing logic for manual histograms.

I made a fix that is within the confines of the current interface. I still strongly believe we should strongly consider taking on #116584 to make this interface easier to use so such bugs are difficult to introduce.

Below is a test I ran using the same workload @kvoli mentioned above.

DB Console:

Grafana:

I will put a PR with these fixes soon.

These histograms were missing the windowing logic we use in histograms to store them in our internal time series DB. The logic we follow there is that we keep 3 histograms. The first one is a cumulative one that we export to prometheus through the `status/vars` page. The second and third are `prev` and `cur` that together cover the entire duration of the histogram, default being 60s. When the ticker rotates these, we swap out the prev with the current, and start a brand new current histogram. When calculating derivitive statistics, we use the combined view of the windows which is used to display these metrics in DB console. This is similar to the way Grafana handles these using its rate interval. Fixes cockroachdb#115825. Release note: None

These histograms were missing the windowing logic we use in histograms to store them in our internal time series DB. The logic we follow there is that we keep 3 histograms. The first one is a cumulative one that we export to prometheus through the `status/vars` page. The second and third are `prev` and `cur` that together cover the entire duration of the histogram, default being 60s. When the ticker rotates these, we swap out the prev with the current, and start a brand new current histogram. When calculating derivative statistics, we use the combined view of the windows which is used to display these metrics in DB console. This is similar to the way Grafana handles these using its rate interval. Fixes cockroachdb#115825. Release note: None

116895: metric: fix windowing for manual histograms r=abarganier,itsbilal a=aadityasondhi These histograms were missing the windowing logic we use in histograms to store them in our internal time series DB. The logic we follow there is that we keep 3 histograms. The first one is a cumulative one that we export to prometheus through the `status/vars` page. The second and third are `prev` and `cur` that together cover the entire duration of the histogram, default being 60s. When the ticker rotates these, we swap out the prev with the current, and start a brand new current histogram. When calculating derivative statistics, we use the combined view of the windows which is used to display these metrics in DB console. This is similar to the way Grafana handles these using its rate interval. Fixes #115825. Release note: None 117097: span: Add strict mode r=miretskiy a=miretskiy Two commits to help debug #116661 1. Log "slow span" messages both from the coordinator and from the aggregators. 2. span: Add strict mode Add strict mode for span frontier implementations that requires the forwarded span to be a sub-span of the spans tracked by the frontier. The functionality is opt-in, and is enabled via `COCKROACH_SPAN_FRONTIER_STRICT_MODE_ENABLED` env var. Enroll cdc roachtests to use strict mode. Informs #116661 Release Note: None Co-authored-by: Aaditya Sondhi <[email protected]> Co-authored-by: Yevgeniy Miretskiy <[email protected]>

These histograms were missing the windowing logic we use in histograms to store them in our internal time series DB. The logic we follow there is that we keep 3 histograms. The first one is a cumulative one that we export to prometheus through the `status/vars` page. The second and third are `prev` and `cur` that together cover the entire duration of the histogram, default being 60s. When the ticker rotates these, we swap out the prev with the current, and start a brand new current histogram. When calculating derivative statistics, we use the combined view of the windows which is used to display these metrics in DB console. This is similar to the way Grafana handles these using its rate interval. Fixes cockroachdb#115825. Release note: None

These histograms were missing the windowing logic we use in histograms to store them in our internal time series DB. The logic we follow there is that we keep 3 histograms. The first one is a cumulative one that we export to prometheus through the `status/vars` page. The second and third are `prev` and `cur` that together cover the entire duration of the histogram, default being 60s. When the ticker rotates these, we swap out the prev with the current, and start a brand new current histogram. When calculating derivative statistics, we use the combined view of the windows which is used to display these metrics in DB console. This is similar to the way Grafana handles these using its rate interval. Fixes #115825. Release note: None

blathers-crl bot added the T-storage Storage Team label Dec 7, 2023

aadityasondhi self-assigned this Dec 11, 2023

abarganier added T-observability-inf and removed T-storage Storage Team labels Dec 13, 2023

blathers-crl bot added the A-observability-inf label Dec 13, 2023

abarganier self-assigned this Dec 13, 2023

aadityasondhi mentioned this issue Dec 13, 2023

metric: fix windowing logic for non-rotate manual histograms #116382

Closed

This was referenced Dec 15, 2023

pkg/util/metric: merge all histogram implementations into a single implementation #116584

Open

pkg/util/schedulerlatency: runtimeHistogram implements WindowedHistogram interface when it's not windowed #116690

Open

aadityasondhi mentioned this issue Dec 20, 2023

metric: fix windowing for manual histograms #116895

Merged

craig bot closed this as completed in 4070f26 Jan 3, 2024

exalate-issue-sync bot removed the T-cluster-observability label Jan 3, 2024

exalate-issue-sync bot unassigned aadityasondhi Jan 3, 2024

This was referenced Jan 4, 2024

release-23.2: metric: fix windowing for manual histograms #117333

Merged

release-23.1: metric: fix windowing for manual histograms #117335

Merged

jbowens added this to [Deprecated] Storage Jun 4, 2024

jbowens moved this to Done in [Deprecated] Storage Jun 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: wal fsync latency histogram different between prometheus and db console #115825

storage: wal fsync latency histogram different between prometheus and db console #115825

kvoli commented Dec 7, 2023 •

edited

Loading

abarganier commented Dec 13, 2023

aadityasondhi commented Dec 13, 2023

aadityasondhi commented Dec 20, 2023

storage: wal fsync latency histogram different between prometheus and db console #115825

storage: wal fsync latency histogram different between prometheus and db console #115825

Comments

kvoli commented Dec 7, 2023 • edited Loading

abarganier commented Dec 13, 2023

aadityasondhi commented Dec 13, 2023

aadityasondhi commented Dec 20, 2023

kvoli commented Dec 7, 2023 •

edited

Loading