content/docs/concepts/remote_write_spec_2_0.md
Lines changed: 60 additions & 91 deletions
Original file line number
Diff line number
Diff line change
@@ -25,15 +25,7 @@ The remote write protocol is designed to make stateless implementations of the s
25
25
26
26
The remote write protocol contains opportunities for batching, e.g. sending multiple samples for different series in a single request. It is not expected that multiple samples for the same series will be commonly sent in the same request, although there is support for this in the protocol.
27
27
28
-
<!---
29
-
TODO(bwplotka): Challenge last sentence (e.g. OTel coll usage)?
30
-
-->
31
-
The remote write protocol is not intended for use by applications to push metrics to Prometheus remote-write-compatible Receiver. It is intended that a Prometheus remote-write-compatible sender scrapes instrumented applications or exporters and sends remote write messages to a server.
32
-
33
-
<!---
34
-
TODO(bwplotka): Add support for 2.0 to those suites
35
-
-->
36
-
A test suite can be found at https://github.com/prometheus/compliance/tree/main/remote_write_sender.
28
+
A test suite can be found at https://github.com/prometheus/compliance/tree/main/remote_write_sender. Support for 2.0 in this suite [is in progress](https://github.com/prometheus/compliance/issues/101).
37
29
38
30
### Glossary
39
31
@@ -50,30 +42,28 @@ For the purposes of this document the following definitions MUST be followed:
50
42
51
43
### Protocol
52
44
53
-
The Remote Write Protocol MUST consist of RPCs with the request body encoded using a Google Protobuf 3 message. The protobuf encoding MUST use either of the following schemas:
45
+
The Remote Write Protocol MUST consist of RPCs with the request body encoded using a Google Protobuf 3 message and then compressed.
54
46
55
-
<!---
56
-
TODO(bwplotka): Do we deprecate it?
57
-
-->
58
-
* [`prometheus.WriteRequest`](./remote_write_spec.md#protocol) introduced in the Remote Write 1.0 specification. As of 2.0 the `prometheus.WriteRequest` message is deprecated.
59
-
* `io.prometheus.write.v2.Request` introduced in this specification and defined [below](#ioprometheuswritev2request-proto-schema). Senders SHOULD use `io.prometheus.write.v2.Request` when possible.
47
+
The protobuf encoding MUST use either of the following schemas:
48
+
49
+
* [`prometheus.WriteRequest`](./remote_write_spec.md#protocol) introduced in the Remote Write 1.0 specification. As of 2.0 the `prometheus.WriteRequest` message is deprecated. It SHOULD be used only for compatibility reasons. Receiver MAY NOT support `prometheus.WriteRequest`.
50
+
* `io.prometheus.write.v2.Request` introduced in this specification and defined [below](#ioprometheuswritev2request-proto-schema). Senders and Receivers SHOULD use `io.prometheus.write.v2.Request` when possible. Receiver MUST support `io.prometheus.write.v2.Request`.
51
+
52
+
The encoded message MUST be compressed with [Google’s Snappy](https://github.com/google/snappy). The block format MUST be used -- the framed format MUST NOT be used.
60
53
61
54
Sender MUST send the encoded and compressed proto message in the body of an HTTP POST request to the Receiver at a provided URL path. The Receiver MAY specify any HTTP URL path to receive metrics.
62
55
63
-
Sender MUST send the following "reserved" headers with the HTTP request:
56
+
Sender MUST send the following reserved headers with the HTTP request:
64
57
65
58
* `Content-Encoding: <compression>`
66
59
67
60
Content encoding request header MUST follow [the RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-content-encoding). Sender MUST use the `snappy` value. More compression algorithms might come in 2.x or beyond.
68
61
69
62
* `Content-Type: application/x-protobuf` or `Content-Type: application/x-protobuf;proto=<fully qualified name>`
70
-
71
-
<!---
72
-
TODO(bwplotka): The framing here is a bit inconsistent with deprecation policy. We can either mention sender MUST use either 1.0 or 2.0 message. Or we can say sender MUST use 2.0 message and it MAY/SHOULD use 1.0 message against 1.x receiver. What do we prefer?
73
-
-->
63
+
74
64
Content type request header MUST follow [the RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-content-type). Sender MUST use `application/x-protobuf` as the only media type. Sender MAY add `;proto=` parameter to the header's value to indicate the fully qualified name of the protobuf message (schema) that was used, from the two mentioned above. As a result, Sender MUST send any of the three supported header values:
75
65
76
-
For the message introduced in PRW 1.0, identified by `prometheus.WriteRequest`:
66
+
For the deprecated message introduced in PRW 1.0, identified by `prometheus.WriteRequest`:
For the message introduced in PRW 2.0, identified by `io.prometheus.write.v2.Request`:
@@ -82,16 +72,11 @@ Sender MUST send the following "reserved" headers with the HTTP request:
82
72
Sender SHOULD use `Content-Type: application/x-protobuf`, for backward compatibility, when talking to 1.x Receiver. Sender SHOULD use `Content-Type: application/x-protobuf;proto=io.prometheus.write.v2.Request` when talking to Receiver supporting 2.x. More proto messages might come in 2.x or beyond.
83
73
84
74
* `User-Agent: <name & version of the sender>`
85
-
* `X-Prometheus-Remote-Write-Version: <remote write specification version the sender follows>`
86
-
87
-
Sender SHOULD use `X-Prometheus-Remote-Write-Version: 0.1.0` for backward compatibility, when using 1.0 proto message.
75
+
* `X-Prometheus-Remote-Write-Version: <remote write spec major and minor version>`
88
76
89
-
Sender MAY allow users to send custom HTTP headers; they MUST NOT allow users to configure them in such a way as to send reserved headers.
77
+
Sender SHOULD use `X-Prometheus-Remote-Write-Version: 0.1.0` for backward compatibility when talking to a 1.x Receiver. Otherwise, Sender SHOULD use the newest remote write version it is compatible with, e.g. `X-Prometheus-Remote-Write-Version: 2.0`.
90
78
91
-
<!---
92
-
TODO(bwplotka): Kind of repeated statement, feedback welcome how to structure it.
93
-
-->
94
-
The remote write request in the body of the HTTP POST MUST be compressed with [Google’s Snappy](https://github.com/google/snappy). The block format MUST be used -- the framed format MUST NOT be used. The remote write request MUST be encoded using Google Protobuf 3, and MUST use either of the schemas defined above.
79
+
Sender MAY allow users to add custom HTTP headers; they MUST NOT allow users to configure them in such a way as to send reserved headers.
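
As a rough illustration of the header rules above, the following Python sketch assembles the reserved headers and rejects custom headers that would override them. The function name `build_headers`, the `receiver_supports_v2` flag, and the `X-Example-Tenant` header are hypothetical and not part of the specification. How a Sender learns which version the Receiver supports is outside the scope of this sketch.

```python
# Reserved header names, compared case-insensitively because HTTP header
# names are case-insensitive.
RESERVED_HEADERS = {
    "content-encoding",
    "content-type",
    "user-agent",
    "x-prometheus-remote-write-version",
}

V2_PROTO = "io.prometheus.write.v2.Request"


def build_headers(user_agent, receiver_supports_v2, custom_headers=None):
    """Return the HTTP headers for one remote write request (illustrative only)."""
    custom_headers = dict(custom_headers or {})

    # Users may add custom headers, but never reserved ones.
    clashes = sorted(n for n in custom_headers if n.lower() in RESERVED_HEADERS)
    if clashes:
        raise ValueError(f"custom headers must not set reserved headers: {clashes}")

    if receiver_supports_v2:
        content_type = f"application/x-protobuf;proto={V2_PROTO}"
        version = "2.0"
    else:
        # Backward-compatible values for a 1.x Receiver.
        content_type = "application/x-protobuf"
        version = "0.1.0"

    headers = {
        "Content-Encoding": "snappy",
        "Content-Type": content_type,
        "User-Agent": user_agent,
        "X-Prometheus-Remote-Write-Version": version,
    }
    headers.update(custom_headers)
    return headers


if __name__ == "__main__":
    print(build_headers("example-sender/0.0.1", True, {"X-Example-Tenant": "a"}))
```

Rejecting reserved names up front, rather than silently overwriting them, is one possible design choice; it keeps a misconfiguration visible to the user.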
95
80
96
81
#### `io.prometheus.write.v2.Request` Proto Schema
97
82
@@ -106,10 +91,14 @@ The simplified version of the new `io.prometheus.write.v2.Request` is presented
106
91
// Request represents a request to write the given timeseries to a remote destination.
107
92
message Request {
108
93
// symbols contains a de-duplicated array of string elements used for various
109
-
// items in a Request message, like labels and metadata items. To decode
110
-
// each of those items, referenced, by "ref(s)" suffix, you need to lookup the
111
-
// actual string by index from symbols array. The order of strings is up to
112
-
// the client, server should not assume any particular encoding.
94
+
// items in a Request message, like labels and metadata items. For the sender's convenience
95
+
// around empty values for optional fields like unit_ref, the symbols array MUST start with an
96
+
// empty string.
97
+
//
98
+
// To decode each of the symbolized strings, referenced by the "ref(s)" suffix, you
99
+
// need to look up the actual string by index from the symbols array. The order of
100
+
// strings is up to the sender. The receiver should not assume any particular encoding.
113
102
repeated string symbols = 1;
114
103
// timeseries represents an array of distinct series with 0 or more samples.
115
104
repeated TimeSeries timeseries = 2;
@@ -119,15 +108,15 @@ message Request {
119
108
message TimeSeries {
120
109
// labels_refs is a list of label name-value pair references, encoded
121
110
// as indices to the Request.symbols array. This list's length is always
122
-
// a multiple of two, and the underlying labels should be sorted.
111
+
// a multiple of two, and the underlying labels should be sorted lexicographically.
123
112
//
124
113
// Note that there might be multiple TimeSeries objects in the same
125
114
// Requests with the same labels e.g. for different exemplars, metadata
126
115
// or created timestamp.
127
116
repeated uint32 labels_refs = 1;
128
117
129
118
// Timeseries messages can either specify samples or (native) histogram samples
130
-
// (histogram field), but not both. For typical clients (real-time metric
119
+
// (histogram field), but not both. For a typical sender (real-time metric
131
120
// streaming), in healthy cases, there will be only one sample or histogram.
132
121
//
133
122
// Samples and histograms are sorted by timestamp (older first).
@@ -142,7 +131,7 @@ message TimeSeries {
142
131
143
132
// created_timestamp represents an optional created timestamp associated with
144
133
// this series' samples in ms format, typically for counter or histogram type
145
-
// metrics. Note that some servers might require this and in return fail to
134
+
// metrics. Note that some receivers might require this and in return fail to
146
135
// ingest such samples within the Request.
147
136
//
148
137
// For Go, see github.com/prometheus/prometheus/model/timestamp/timestamp.go
@@ -159,7 +148,7 @@ message TimeSeries {
159
148
message Exemplar {
160
149
// labels_refs is a list of label name-value pair references, encoded
161
150
// as indices to the Request.symbols array. This list's len is always
162
-
// a multiple of 2, and the underlying labels should be sorted.
151
+
// a multiple of 2, and the underlying labels should be sorted lexicographically.
163
152
repeated uint32 labels_refs = 1;
164
153
double value = 2;
165
154
int64 timestamp = 3;
@@ -178,7 +167,7 @@ message Sample {
178
167
// Metadata represents the metadata associated with the given series' samples.
179
168
message Metadata {
180
169
enum MetricType {
181
-
METRIC_TYPE_UNSPECIFIED = 0;
170
+
METRIC_TYPE_UNSPECIFIED = 0;
182
171
METRIC_TYPE_COUNTER = 1;
183
172
METRIC_TYPE_GAUGE = 2;
184
173
METRIC_TYPE_HISTOGRAM = 3;
@@ -189,19 +178,20 @@ message Metadata {
189
178
}
190
179
MetricType type = 1;
191
180
// help_ref is a reference to the Request.symbols array representing help
192
-
// text for the metric.
181
+
// text for the metric. Help is optional, reference should point to empty string in
182
+
// such a case.
193
183
uint32 help_ref = 3;
194
184
// unit_ref is a reference to the Request.symbols array representing unit
195
-
// for the metric.
185
+
// for the metric. Unit is optional, reference should point to empty string in
186
+
// such a case.
196
187
uint32 unit_ref = 4;
197
188
}
198
189
199
190
// A native histogram, also known as a sparse histogram.
191
+
// See https://github.com/prometheus/prometheus/blob/remote-write-2.0/prompb/io/prometheus/write/v2/types.proto#L142
192
+
// for a full message that follows the native histogram spec for both sparse
193
+
// and exponential, as well as custom bucketing.
200
194
message Histogram { ... }
201
-
202
-
// A BucketSpan defines a number of consecutive buckets with their
203
-
// offset.
204
-
message BucketSpan { ... }
205
195
```
206
196
207
197
All timestamps MUST be int64 counted as milliseconds since the Unix epoch. Sample's values MUST be float64.
@@ -210,11 +200,8 @@ For every `TimeSeries` message:
210
200
211
201
* Label references MUST be provided.
212
202
* At least one element in Samples or in Histograms MUST be provided. For series which (rarely) would mix float and histogram samples, a separate `TimeSeries` message MUST be used.
213
-
<!---
214
-
TODO(bwplotka): We have some inconsistency here. In gdoc we talked this should be SHOULD. But proto has it as MUST (?). What do we want at the end? What's wrong with MUST here?
215
-
-->
216
-
* Metadata MUST be provided.
217
-
* Exemplars SHOULD be provided, if they exist.
203
+
* Metadata fields SHOULD be provided.
204
+
* Exemplars SHOULD be provided, if they exist for a series.
218
205
* Created timestamp SHOULD be provided for metrics that follow counter semantics (e.g. counters and histograms).
219
206
220
207
The following subsections define some schema elements in detail.
@@ -223,27 +210,20 @@ The following subsections define some schema elements in details.
223
210
224
211
The `io.prometheus.write.v2.Request` proto schema is designed to [intern all strings](https://en.wikipedia.org/wiki/String_interning) for the proven additional compression and memory efficiency gains on top of the standard compressions.
225
212
226
-
Symbols table containing deduplicated strings used in series and exemplar labels, metadata strings MUST be provided. References MUST point to the existing index in the Symbols string array.
213
+
A symbols table containing the deduplicated strings used in series labels, exemplar labels, and metadata MUST be provided. The first element of the symbols table MUST be an empty string. References MUST point to an existing index in the symbols array.
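
To make the interning scheme concrete, here is a minimal Python sketch of how a Sender might build the symbols table and the matching `labels_refs` lists, and how a Receiver might resolve them again. The helper names `intern_labels` and `resolve_labels` are hypothetical, label sets are modelled as plain dictionaries, and "lexicographic" order is assumed to mean Python's default string ordering (by Unicode code point).

```python
def intern_labels(series_labels):
    """Build (symbols, refs) for a list of {name: value} label sets."""
    symbols = [""]      # index 0 is always the empty string
    index = {"": 0}     # string -> position in symbols

    def ref(s):
        # Reuse the existing index if the string was already interned.
        if s not in index:
            index[s] = len(symbols)
            symbols.append(s)
        return index[s]

    all_refs = []
    for labels in series_labels:
        refs = []
        for name in sorted(labels):   # label names in sorted order
            refs.append(ref(name))
            refs.append(ref(labels[name]))
        all_refs.append(refs)
    return symbols, all_refs


def resolve_labels(symbols, labels_refs):
    """Turn a labels_refs list back into a {name: value} dictionary."""
    if len(labels_refs) % 2 != 0:
        raise ValueError("labels_refs length must be a multiple of two")
    if any(r < 0 or r >= len(symbols) for r in labels_refs):
        raise ValueError("reference points outside the symbols table")
    return {
        symbols[labels_refs[i]]: symbols[labels_refs[i + 1]]
        for i in range(0, len(labels_refs), 2)
    }


if __name__ == "__main__":
    symbols, refs = intern_labels([
        {"__name__": "http_requests_total", "job": "api"},
        {"__name__": "http_requests_total", "job": "db"},
    ])
    print(symbols)
    print(refs)
```

Because index 0 holds the empty string, an unset optional reference such as `unit_ref` (proto3 default `0`) resolves to an empty string without any special casing.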
227
214
228
215
#### Series Labels
229
216
230
217
The complete set of labels MUST be sent with each Sample or Histogram sample. Additionally, the label set associated with samples:
231
218
232
-
- SHOULD contain a `__name__` label.
233
-
- MUST NOT contain repeated label names.
234
-
- MUST have label names sorted in lexicographical order.
235
-
- MUST NOT contain any empty label names or values.
219
+
* SHOULD contain a `__name__` label.
220
+
* MUST NOT contain repeated label names.
221
+
* MUST have label names sorted in lexicographical order.
222
+
* MUST NOT contain any empty label names or values.
236
223
237
-
Sender MUST only send valid metric names, label names, and label values:
224
+
Metric names, label names, and label values MAY be any sequence of UTF-8 characters. Receiver MAY reject series whose metric names or label names contain characters that do not follow the [previous patterns](https://prometheus.io/docs/concepts/remote_write_spec/#:~:text=Metric%20names%20MUST,UTF%2D8%20characters%20), given that [UTF-8 support is still in progress](https://github.com/prometheus/proposals/blob/main/proposals/2023-08-21-utf8.md).
238
225
239
-
<!---
240
-
TODO(bwplotka): Add mention for UTF-8 support going forward, see https://docs.google.com/document/d/1PljkX3YLLT-4f7MqrLt7XCVPG3IsjRREzYrUzBxCPV0/edit?disco=AAAA4gSqQ7g
241
-
-->
242
-
- Metric names MUST adhere to the regex `[a-zA-Z_:]([a-zA-Z0-9_:])*`.
243
-
- Label names MUST adhere to the regex `[a-zA-Z_]([a-zA-Z0-9_])*`.
244
-
- Label values MAY be any sequence of UTF-8 characters .
245
-
246
-
Receiver MAY impose limits on the number and length of labels, but this will be receiver-specific and is out of scope for this document.
226
+
Receiver also MAY impose limits on the number and length of labels, but this is receiver-specific and is out of scope for this document.
247
227
248
228
Label names beginning with "__" are RESERVED for system usage and SHOULD NOT be used, see [Prometheus Data Model](https://prometheus.io/docs/concepts/data_model/).
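
The label-set rules in this section can be sketched as a small Python check. This is only an illustration: the function name `check_label_set` is hypothetical, labels are modelled as a list of `(name, value)` tuples in wire order, MUST-level violations are reported as errors and the SHOULD-level `__name__` recommendation as a warning, and sorting is assumed to follow Python's default string ordering.

```python
def check_label_set(labels):
    """Return (errors, warnings) for a list of (name, value) label pairs."""
    errors = []
    warnings = []
    names = [name for name, _ in labels]

    # MUST NOT contain any empty label names or values.
    if any(name == "" or value == "" for name, value in labels):
        errors.append("empty label name or value")

    # MUST NOT contain repeated label names.
    if len(set(names)) != len(names):
        errors.append("repeated label name")

    # MUST have label names sorted in lexicographical order.
    if names != sorted(names):
        errors.append("label names not sorted")

    # SHOULD contain a __name__ label.
    if "__name__" not in names:
        warnings.append("missing __name__ label")

    return errors, warnings


if __name__ == "__main__":
    print(check_label_set([("__name__", "up"), ("job", "api")]))
    print(check_label_set([("job", "api"), ("__name__", "up")]))
```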
249
229
@@ -271,10 +251,11 @@ Metadata SHOULD follow the official guidelines for [TYPE](https://prometheus.io/
271
251
272
252
#### Exemplars
273
253
274
-
<!---
275
-
TODO(bwplotka): Anything to say here?
276
-
-->
277
-
TBD
254
+
Each exemplar, if attached to a `TimeSeries`:
255
+
256
+
* MUST contain at least one label (a name-value pair), i.e. two references to the symbols table.
257
+
* MUST contain a value.
258
+
* MAY contain a timestamp.
278
259
279
260
#### Created Timestamp
280
261
@@ -291,7 +272,7 @@ The following subsections specify Sender and Receiver semantics around write err
291
272
292
273
#### Partial Write
293
274
294
-
Sender SHOULD use Prometheus Remote Write to request write of multiple samples, across multiple series. As a result, Receiver MAY ingest valid samples within a write request that contains invalid or otherwise unwritten samples, which represents a partial write case.
275
+
Sender SHOULD use Prometheus Remote Write to send samples for multiple series in a single request. As a result, Receiver MAY ingest valid samples within a write request that contains invalid or otherwise unwritten samples, which represents a partial write case.
295
276
296
277
In a partial write case, Receiver MUST NOT return HTTP 200 status code. Receiver MUST provide a human-readable error message in the response body. Sender MUST NOT try and interpret the error message, and SHOULD log it as is.
297
278
@@ -301,17 +282,9 @@ Receiver MAY NOT support certain content types or encodings defined in [the Prot
301
282
302
283
Sender SHOULD expect [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9110.html#name-400-bad-request) for the above reasons from the 1.x Receiver, for backward compatibility.
303
284
304
-
<!---
305
-
TODO(bwplotka): Note sure if worth mentioning given we decided to not include auto negotiation logic in 2.0. I think I would delete it.
306
-
-->
307
-
Sender MAY retry write requests on 415 HTTP status code, with different content type and compression settings.
308
-
309
285
#### Invalid Samples
310
286
311
-
<!---
312
-
TODO(bwplotka): This wording assumes metadata is optional, which I think it should NOT be.
313
-
-->
314
-
Receiver MAY NOT support certain metric types or samples (e.g. Receiver might reject sample without metadata or without created timestamp, while another Receiver might accept such sample.). It’s up to the Receiver what sample is invalid. Receiver MUST return a [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9110.html#name-400-bad-request) status code for write requests that contain any invalid samples, unless the [partial retryable write](#retries-on-partial-writes) occurs.
287
+
Receiver MAY NOT support certain metric types or samples (e.g. one Receiver might reject a sample without a metadata type specified or without a created timestamp, while another Receiver might accept such a sample). It is up to the Receiver to decide which samples are invalid. Receiver MUST return a [400 HTTP Bad Request](https://www.rfc-editor.org/rfc/rfc9110.html#name-400-bad-request) status code for write requests that contain any invalid samples, unless the [partial retryable write](#retries-on-partial-writes) occurs.
315
288
316
289
Sender MUST NOT retry on 4xx HTTP (other than 429 and 415) status codes, which MUST be used by Receiver to indicate that the write will never be able to succeed and should not be retried.
317
290
@@ -329,27 +302,23 @@ No partial retry-ability is specified (ability for receiver to ask for retry on
329
302
330
303
Similarly, Receiver MAY return a HTTP 5xx or 429 status code on partial write or [partial invalid sample cases](#partial-write), when it expects Sender to retry the whole request.
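
The retry rules can be summarised in a short Python sketch of how a Sender might classify a response status code. The names `Action` and `classify_response` are hypothetical. Treating any 2xx code as success is an assumption of this sketch, and 415 is reported as unspecified because the text excludes it from the no-retry rule without defining what the Sender should do instead.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"                # request accepted, nothing more to do
    RETRY = "retry"              # resend the whole request later
    DROP = "drop"                # never retry, log the error body as is
    UNSPECIFIED = "unspecified"  # behaviour not covered by this sketch


def classify_response(status):
    """Map an HTTP status code to a Sender action (illustrative only)."""
    if 200 <= status < 300:
        return Action.DONE       # assumption: any 2xx means success
    if status == 429 or 500 <= status < 600:
        return Action.RETRY      # Receiver expects the whole request again
    if status == 415:
        return Action.UNSPECIFIED
    if 400 <= status < 500:
        return Action.DROP       # the write will never succeed
    return Action.UNSPECIFIED    # 1xx and 3xx are not covered here


if __name__ == "__main__":
    for code in (200, 400, 415, 429, 503):
        print(code, classify_response(code).value)
```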
331
304
332
-
### Backward and forward compatibility
305
+
### Backward and Forward Compatibility
333
306
334
-
TBD
307
+
The protocol follows [semantic versioning 2.0](https://semver.org/): any 2.x compatible Receiver MUST be able to read requests from any 2.x compatible Sender, and so on. Breaking (backward-incompatible) changes will result in a 3.x version of the spec.
335
308
336
-
<!---
337
-
TODO(bwplotka): TBD, below is copy of 1.x
338
-
The protocol follows [semantic versioning 2.0](https://semver.org/): any 1.x compatible Receiver MUST be able to read any 1.x compatible sender and so on. Breaking/backwards incompatible changes will result in a 2.x version of the spec.
309
+
The proto formats themselves are forward and backward compatible, in some respects:
339
310
340
-
The proto format itself is forward / backward compatible, in some respects:
311
+
* Removing fields from the proto will mean a major version bump.
312
+
* Adding (optional) fields will be a minor version bump.
341
313
342
-
- Removing fields from the proto will mean a major version bump.
343
-
- Adding (optional) fields will be a minor version bump.
314
+
In other words, future minor versions of 2.x MAY add new optional fields to `io.prometheus.write.v2.Request`, new compressions, content types (wire formats) and negotiation mechanisms, as long as they are backward compatible (e.g. optional to both Receivers and Senders).
344
315
345
-
Negotiation:
316
+
### 2.x vs 1.x Compatibility
346
317
347
-
- Sender MUST send the version number in a headers.
348
-
- Receiver MAY return the highest version number they support in a response header ("X-Prometheus-Remote-Write-Version").
349
-
- Sender who wish to send in a format >1.x MUST start by sending an empty 1.x, and see if the response says the receiver supports something else. The Sender MAY use any supported version . If there is no version header in the response, Sender MUST assume 1.x compatibility only.
350
-
-->
318
+
The 2.x protocol breaks compatibility with 1.x by introducing a new `io.prometheus.write.v2.Request` content type (wire format) and deprecating the `prometheus.WriteRequest` message.
319
+
320
+
2.x senders MAY support 1.x... TBD explain.
351
321
352
-
Receiver MAY ingest valid samples within a write request that otherwise contains invalid samples. Receiver MUST return a HTTP 400 status code ("Bad Request") for write requests that contain any invalid samples. Receiver SHOULD provide a human readable error message in the response body. Sender MUST NOT try and interpret the error message, and SHOULD log it as is.