feat(om2): add native histograms to OpenMetrics2.0 #2634

krajorama · 2025-04-24T15:04:47Z

Background

Based on https://github.com/prometheus/proposals/blob/main/proposals/2024-01-29_native_histograms_text_format.md
And OpenMetrics 2.0 WG discussions.

Changes

Allow structured complex types marked by "{" and "}" in the specification.
Allow multiple exemplars per complex type value.
Require that exemplars for complex type values have the timestamp.
Be permissive about observing NaN, +Inf, and -Inf. Discourage observing NaN.
Split histogram into ones with classic and native buckets.
For classic buckets, define behavior when observing NaN.
Define the native buckets and also how NaN is handled.
Define the text format of native histograms and also their exemplars.

Open questions / decisions

See OpenMetrics2.0 WG meeting notes tab

Signed-off-by: György Krajcsovits <[email protected]>

krajorama · 2025-04-30T12:39:25Z

content/docs/specs/om/open_metrics_spec_2_0.md

+
+Numbers MUST be either floating points or integers. Note that ingestors of the format MAY only support float64. The non-real values NaN, +Inf and -Inf MUST be supported. NaN MUST NOT be considered a missing value, but it MAY be used to signal a division by zero.
+
+Complex data types MUST contain all information necessary to recreate a Metric Type, with the exception of Created time and Exemplars.


This assumes we'll have the created timestamp separate from the JSON-like data. See prometheus/OpenMetrics#285.

Signed-off-by: György Krajcsovits <[email protected]>

krajorama · 2025-05-06T14:58:40Z

Note to self: add details on how summary and classic histogram one-liners (so not NHCB spans/deltas) fit into it.

Signed-off-by: György Krajcsovits <[email protected]>

Add semantic conventions about where complex types may occur. Allow empty spans and deltas. Be more precise. Signed-off-by: György Krajcsovits <[email protected]>

Signed-off-by: György Krajcsovits <[email protected]>

ArthurSens

Had some minutes to read the PR. Couldn't read everything though, I'll do a more complete review another day!

Reviewed from mobile


ArthurSens · 2025-05-24T12:47:16Z

content/docs/specs/om/open_metrics_spec_2_0.md

+
+##### Exponential buckets
+
+Histogram MetricPoints with exponential buckets MUST have a Schema value. The Schema is an 8 bit signed integer between -4 and 127. Schema values between -4 and 127 are also called Standard Schemas.


Carrie's document mentions that schema goes from -4 up to 8. I'm a bit confused here, is it 8 or 127?

I wanted to keep this open-ended; 8 is the current implementation maximum. This way we reserve room for higher resolutions.
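For context on what the schema number controls, here is a minimal Python sketch. It assumes the usual Prometheus native histogram definition for standard schemas, where the bucket growth factor is 2^(2^-schema) and the positive bucket with index i has the upper bound base^i. The function names and the `MAX_IMPLEMENTED_SCHEMA` constant are hypothetical and only reflect the "current implementation maximum" of 8 mentioned in the reply above; they are not taken from the spec text or any client library.

```python
# Sketch only: names are illustrative, not part of the spec or a client library.
MIN_SCHEMA = -4
MAX_IMPLEMENTED_SCHEMA = 8  # assumption: current implementation maximum per this thread


def schema_base(schema: int) -> float:
    """Return the growth factor between consecutive bucket boundaries."""
    if not MIN_SCHEMA <= schema <= MAX_IMPLEMENTED_SCHEMA:
        raise ValueError(f"schema {schema} is outside the implemented range")
    return 2.0 ** (2.0 ** -schema)


def bucket_upper_bound(schema: int, index: int) -> float:
    """Return the (inclusive) upper bound of the positive bucket at `index`."""
    return schema_base(schema) ** index


if __name__ == "__main__":
    for schema in (-4, 0, 8):
        print(schema, schema_base(schema))
```

A higher schema means a smaller growth factor and therefore a finer resolution, which is why keeping values above 8 reserved leaves room for higher resolutions later.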


Signed-off-by: György Krajcsovits <[email protected]>

docs/specs/om/open_metrics_spec_2_0.md

I did not want to clutter the examples with adding "# EOF" to all, so I made a new marker to explicitly add it on request. There was only one faulty example where we had an extra space, see the Exemplars section. Signed-off-by: György Krajcsovits <[email protected]>

Allow multiple exemplars for complex types, i.e. native histograms. But require that the timestamp is present. Signed-off-by: György Krajcsovits <[email protected]>

beorn7

Thanks for doing all of this.

Mostly commented about the histogram aspect (but I couldn't resist and referred to OM specific things now and then).

beorn7 · 2025-06-12T22:32:22Z

docs/specs/om/open_metrics_spec_2_0.md

+- Integer counter native histograms for the Metric Type Histogram.
+- Integer gauge native histograms for the Metric Type GaugeHistogram.


Are float histograms planned for later? Or are they excluded for good? (Note that we need float histograms if we want to support federation via OM text format.)

Haven't thought about it. There are two options. The first is to allow floats everywhere and use absolute numbers for buckets, similar to how we do regular float counters, which means we'd have a single exposition format for both.

The second is to introduce a new kind of exposition, detected from the fields that occur (deltas vs buckets).

In general this is a somewhat separate question, as OpenMetrics 1.0 didn't allow floats. But we need to prepare for sure.

update: moved discussion to the PR doc


beorn7 · 2025-06-12T22:45:04Z

docs/specs/om/open_metrics_spec_2_0.md

+
+If the NaN value is not allowed, then the Count value MUST be equal to the value of the +Inf bucket.
+
+If the NaN value is allowed, it SHOULD be counted in the +Inf bucket, and MUST not be counted in any other bucket. In case the NaN is counted in the +Inf bucket, then the Count MUST be equal to the value of the +Inf bucket, otherwise the Count MUST be greater. The ratonale is that NaN does not belong to any bucket mathematically, however instrumentation libraries traditionally put it into the +Inf bucket, which is why the wording "SHOULD" is used.


🤔 Since there are several code locations with the baked-in assumption that the +Inf bucket is identical to the count (e.g. only storing one value, feeding into both of these), I don't think we can relax the requirement here without causing trouble.

Since it doesn't really matter in practice how we treat NaNs (they will only happen because of bugs), I would propose to keep the current way, even if it is mathematically somewhat inconsistent.

Another point for that is that we don't get rid of the old way by allowing the new way. We now have to deal with two different ways.

Version 1.0 does not say anything, but I'll change SHOULD to MUST and rework. https://prometheus.io/docs/specs/om/open_metrics_spec/#histogram
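To make the outcome of this thread concrete, here is a minimal Python sketch of a classic histogram that always counts NaN in the +Inf bucket, so that the Count stays equal to the +Inf bucket. The `observe` and `cumulative` helpers and the bucket layout are hypothetical; this is an illustration of the invariant, not code from any instrumentation library.

```python
import math
from itertools import accumulate


def observe(bounds, counts, value):
    """Record one observation.

    `bounds` are sorted upper bounds ending with +Inf; `counts` holds the
    per-bucket (non-cumulative) counts. NaN compares false against every
    bound, so it is routed to the +Inf bucket explicitly.
    """
    if math.isnan(value):
        counts[-1] += 1
        return
    for i, upper in enumerate(bounds):
        if value <= upper:
            counts[i] += 1
            return


def cumulative(counts):
    """Convert per-bucket counts to the cumulative values that are exposed."""
    return list(accumulate(counts))


bounds = [1.0, 5.0, math.inf]
counts = [0, 0, 0]
for v in (0.5, 3.0, math.nan, math.inf):
    observe(bounds, counts, v)

exposed = cumulative(counts)
total_count = sum(counts)
```

With this routing a single stored value can feed both the Count and the +Inf bucket, which is the baked-in assumption mentioned in the review comment.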


beorn7 · 2025-06-12T23:26:04Z

docs/specs/om/open_metrics_spec_2_0.md


-The bucket and Gsum of a GaugeHistogram are conceptually gauges, however bucket values MUST NOT be negative or NaN. If negative threshold buckets are present, then sum MAY be negative. Gsum MUST NOT be NaN. Bucket values MUST be integers.
+The bucket and Gsum of a GaugeHistogram are conceptually gauges, however bucket values MUST NOT be negative or NaN. If negative threshold buckets are present, then Gsum MAY be negative. Gsum MUST NOT be NaN. Bucket values MUST be integers.

 A GaugeHistogram's Metric's LabelSet MUST NOT have a "le" label name.


Unless it's only native buckets.

beorn7 · 2025-06-12T23:26:30Z

docs/specs/om/open_metrics_spec_2_0.md


 A GaugeHistogram's Metric's LabelSet MUST NOT have a "le" label name.

-Bucket values can have exemplars.
+The buckets for a GaugeHistogram follow all the same rules as for a Histogram, with Gcount playing the same role as Count.


And Gsum as Sum?

beorn7 · 2025-06-12T23:29:07Z

docs/specs/om/open_metrics_spec_2_0.md

+
+Histograms with exponential buckets use the integer native histogram data type.
+
+The integer native histogram data type is a JSON like structure with fields. There MUST NOT be any whitespace around fields.


It's not really JSON. Many languages use vaguely this kind of structure. I wouldn't call out JSON because it creates the expectation that other JSON syntax rules also apply.

There MUST NOT be any whitespace around fields.

Has it already been decided that OMv2 will stay whitespace intolerant?

It's one of my big big red flags with OMv1 that it doesn't tolerate whitespace like the classic Prometheus text format. If OMv2 will see the light and allow whitespace again, then of course we should allow whitespace here.

We have not discussed this in OM2.0 explicitly. Comes from https://github.com/prometheus/proposals/blob/main/proposals/2024-01-29_native_histograms_text_format.md.

However, we did say that since native histograms use a structure that's essentially internal, we have kind of already given up on being very human friendly. I still keep the count and sum fields up front because that's easy to understand and relates to PromQL.

A consequence is that I think we'll have to give some tooling (like promtool) for people to pretty print the output and also instrumentations may allow pretty printing themselves.

beorn7 · 2025-06-12T23:31:43Z

docs/specs/om/open_metrics_spec_2_0.md

+
+Exponential bucket values MUST be ordered by their index, and their values MUST be placed in the `negative_deltas` (and/or `positive_deltas`) field using delta encoding, that is the first bucket value is written as is and the following values only as a delta relative to the previous value. For example bucket values 1, 5, 4, 4 will become 1, 4, -1, 0.
+
+To map the `negative_deltas` (and/or `positive_deltas`) back to their indices, the `negative_spans` (and/or `positive_spans`) field MUST be constructed in the following way: each span consists of a pair of numbers, an integer called offset and an non-negative integer called length. Only the first span in each list can have a negative offset. It defines the index of the first bucket in its corresponding `negative_deltas` (and/or `positive_deltas`). The length defines the number of consecutive buckets the bucket list starts with. The offsets of the following spans define the number of excluded (and thus unpopulated buckets). The lengths define the number of consecutive buckets in the list following the excluded buckets.


Capitalize after the colon: each → Each
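The delta encoding and span layout described in the hunk above can be sketched in a few lines of Python. This is only an illustration under the reading that the first span offset is the index of the first bucket and every later offset is the number of skipped buckets; the function names are hypothetical and do not come from the spec or any library.

```python
from itertools import accumulate


def delta_encode(values):
    """Write the first value as is and each later value as a difference
    to its predecessor."""
    return [cur - prev for prev, cur in zip([0] + values[:-1], values)]


def delta_decode(deltas):
    """Recover absolute bucket values from their deltas."""
    return list(accumulate(deltas))


def span_indices(spans):
    """Expand (offset, length) spans into absolute bucket indices.

    The first offset is the index of the first bucket; every following
    offset is the number of unpopulated buckets skipped since the end of
    the previous span.
    """
    indices = []
    next_index = 0
    for offset, length in spans:
        next_index += offset
        indices.extend(range(next_index, next_index + length))
        next_index += length
    return indices


# The example from the spec text: bucket values 1, 5, 4, 4.
deltas = delta_encode([1, 5, 4, 4])
# Two spans: buckets 0 and 1, then three skipped buckets, then buckets 5 and 6.
buckets = dict(zip(span_indices([(0, 2), (3, 2)]), delta_decode(deltas)))
```

Here `deltas` matches the `1, 4, -1, 0` given in the text, and `buckets` maps each populated index back to its value.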

beorn7 · 2025-06-12T23:34:31Z

docs/specs/om/open_metrics_spec_2_0.md

 ###### Exemplars

 Exemplars without Labels MUST represent an empty LabelSet as {}.

 An example of Exemplars showcasing several valid cases:
+The native histogram version of the histogram has multiple Exemplars.


But I see only one?

There are two:

foo {count:10,sum:1.0,schema:0,zero_threshold:1e-4,zero_count:0,positive_spans:[0:2],positive_deltas:[5,0]} # {trace_id="shaZ8oxi"} 0.67 1520879607.789 # {trace_id="ookahn0M"} 1.2 1520879608.589
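A quick way to see both exemplars is to split the line on the ` # ` separator. The Python sketch below is deliberately naive: it assumes no label value contains the sequence ` # `, and it is not a conforming OpenMetrics parser. The variable names are illustrative only.

```python
line = (
    'foo {count:10,sum:1.0,schema:0,zero_threshold:1e-4,zero_count:0,'
    'positive_spans:[0:2],positive_deltas:[5,0]}'
    ' # {trace_id="shaZ8oxi"} 0.67 1520879607.789'
    ' # {trace_id="ookahn0M"} 1.2 1520879608.589'
)

# Naive split: the first part is the sample, the rest are exemplars.
sample, *raw_exemplars = line.split(" # ")

exemplars = []
for raw in raw_exemplars:
    # Each exemplar is "<labels> <value> <timestamp>"; the timestamp is
    # required for exemplars on complex type values in this PR.
    labels, value, timestamp = raw.rsplit(" ", 2)
    exemplars.append((labels, float(value), float(timestamp)))

for labels, value, timestamp in exemplars:
    print(labels, value, timestamp)
```

The split yields one sample and two exemplars, each carrying its own timestamp.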


krajorama · 2025-06-13T15:03:46Z

docs/specs/om/open_metrics_spec_2_0.md

+The values of exemplars in a Histogram MetricPoint with native buckets MUST fall into one of the native buckets.
+


I think I'll remove this, since:

if the exemplar's value is not NaN, then by definition some native bucket will cover it

if the exemplar's value is NaN, this is not true

after a reset the actual stored bucket may be missing, but the exemplar can still be there.

Suggested change

The values of exemplars in a Histogram MetricPoint with native buckets MUST fall into one of the native buckets.

krajorama added 4 commits April 24, 2025 17:04

feat(om2): add native histograms to OpenMetrics2.0

ddbace4

Signed-off-by: György Krajcsovits <[email protected]>

define the counter histogram model

2f4113f

Signed-off-by: György Krajcsovits <[email protected]>

small fixes to histogram model

a47ce0c

Signed-off-by: György Krajcsovits <[email protected]>

update gauge histogram model

10dc43d

Signed-off-by: György Krajcsovits <[email protected]>

krajorama commented Apr 30, 2025


wip: wip

d12dfeb

Signed-off-by: György Krajcsovits <[email protected]>

krajorama mentioned this pull request May 8, 2025

histograms: Implement final OM text format prometheus/prometheus#11265

Open

krajorama added 4 commits May 22, 2025 10:05

adjust wording

471a7dd

Signed-off-by: György Krajcsovits <[email protected]>

add abnf and presentation

ec991a4

Signed-off-by: György Krajcsovits <[email protected]>

updates

dc308e2

Add semantic conventions about where complex types may occur. Allow empty spans and deltas. Be more precise. Signed-off-by: György Krajcsovits <[email protected]>

Add gauge histogram syntax

57c6ab8

Signed-off-by: György Krajcsovits <[email protected]>

ArthurSens reviewed May 24, 2025


beorn7 self-requested a review May 27, 2025 11:26

krajorama added 2 commits May 27, 2025 15:53

clarify NaN and Inf and be permissive

32a65cf

Signed-off-by: György Krajcsovits <[email protected]>

fix from Arthur's comments

6968e00

Signed-off-by: György Krajcsovits <[email protected]>

krajorama marked this pull request as ready for review May 27, 2025 14:01

Merge branch 'main' into krajo/om2.0-native-histograms

d95eed7


krajorama commented Jun 3, 2025


krajorama mentioned this pull request Jun 4, 2025

OM text exposition for NH prometheus/client_python#1087

Open

krajorama added 2 commits June 11, 2025 09:03

Update with exemplars

7965787

Allow multiple exemplars for complex types, i.e. native histograms. But require that the timestamp is present. Signed-off-by: György Krajcsovits <[email protected]>

krajorama force-pushed the krajo/om2.0-native-histograms branch from c16426b to 7965787 Compare June 11, 2025 07:27

Merge branch 'main' into krajo/om2.0-native-histograms

0a944c4

krajorama mentioned this pull request Jun 12, 2025

OM 2.0: Native Histogram Support in Text format. prometheus/OpenMetrics#279

Open

beorn7 requested changes Jun 12, 2025


krajorama added 3 commits June 13, 2025 14:37

fix schema number allocation

52bd9e9

Signed-off-by: György Krajcsovits <[email protected]>

Only prohibit "le" label if there are classic buckets

2b0fdcc

Signed-off-by: György Krajcsovits <[email protected]>

Make the requirement around NaN stronger in classic case.

1dea11f

Signed-off-by: György Krajcsovits <[email protected]>

krajorama added 6 commits June 13, 2025 15:03

Fix classic bucket choice for exemplar

986380d

Signed-off-by: György Krajcsovits <[email protected]>

Use native buckets instead of exponential buckets

3b9e078

Signed-off-by: György Krajcsovits <[email protected]>

Define choice of classic vs native in terms of measurement

bd82e1e

Signed-off-by: György Krajcsovits <[email protected]>

Require that all classic buckets are exposed in the text format

cbfe910

even unpopulated ones. Signed-off-by: György Krajcsovits <[email protected]>

Specify that empty native buckets should not be present or exposed

16b4c50

Empty may be exposed for optimizing spans. Signed-off-by: György Krajcsovits <[email protected]>

Mention reset when bucket count is too high

ca1d860

Signed-off-by: György Krajcsovits <[email protected]>

krajorama commented Jun 13, 2025
