Commit 1dbe402

initial overrides api draft
Signed-off-by: Bogdan Stancu <[email protected]>
1 parent aad1fee commit 1dbe402

11 files changed, +1740 -1 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
* [CHANGE] StoreGateway/Alertmanager: Add default 5s connection timeout on client. #6603
* [CHANGE] Ingester: Remove EnableNativeHistograms config flag and instead gate keep through new per-tenant limit at ingestion. #6718
* [CHANGE] Validate a tenantID when to use a single tenant resolver. #6727
* [FEATURE] Add new Overrides API module. Rename old overrides module to overrides-configs.
* [FEATURE] Distributor: Add an experimental `-distributor.otlp.allow-delta-temporality` flag to ingest delta temporality otlp metrics. #6934
* [FEATURE] Query Frontend: Add dynamic interval size for query splitting. This is enabled by configuring experimental flags `querier.max-shards-per-query` and/or `querier.max-fetched-data-duration-per-query`. The split interval size is dynamically increased to maintain a number of shards and total duration fetched below the configured values. #6458
* [FEATURE] Querier/Ruler: Add `query_partial_data` and `rules_partial_data` limits to allow queries/rules to be evaluated with data from a single zone, if other zones are not available. #6526

docs/api/_index.md

Lines changed: 61 additions & 0 deletions
@@ -66,6 +66,9 @@ For the sake of clarity, in this document we have grouped API endpoints by servi
| [Delete Alertmanager configuration](#delete-alertmanager-configuration) | Alertmanager || `DELETE /api/v1/alerts` |
| [Tenant delete request](#tenant-delete-request) | Purger || `POST /purger/delete_tenant` |
| [Tenant delete status](#tenant-delete-status) | Purger || `GET /purger/delete_tenant_status` |
| [Get user overrides](#get-user-overrides) | Overrides || `GET /api/v1/user-overrides` |
| [Set user overrides](#set-user-overrides) | Overrides || `PUT /api/v1/user-overrides` |
| [Delete user overrides](#delete-user-overrides) | Overrides || `DELETE /api/v1/user-overrides` |
| [Store-gateway ring status](#store-gateway-ring-status) | Store-gateway || `GET /store-gateway/ring` |
| [Compactor ring status](#compactor-ring-status) | Compactor || `GET /compactor/ring` |
| [Get rule files](#get-rule-files) | Configs API (deprecated) || `GET /api/prom/configs/rules` |
@@ -872,6 +875,64 @@ Returns status of tenant deletion. Output format to be defined. Experimental.
_Requires [authentication](#authentication)._

## Overrides

The Overrides service provides an API for managing user overrides.

### Get user overrides

```
GET /api/v1/user-overrides
```

Get the current overrides for the authenticated tenant. Returns the overrides in JSON format.

_Requires [authentication](#authentication)._

### Set user overrides

```
PUT /api/v1/user-overrides
```

Set or update overrides for the authenticated tenant. The request body should contain a JSON object with the override values.

_Requires [authentication](#authentication)._

### Delete user overrides

```
DELETE /api/v1/user-overrides
```

Delete all overrides for the authenticated tenant. This will revert the tenant to using default values.

_Requires [authentication](#authentication)._
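
As a rough illustration of how a client might address the three endpoints above, the following Python sketch builds the requests without sending them. The base URL `http://cortex.example:8080` and the helper name `build_overrides_request` are hypothetical, and the sketch assumes the tenant is identified with the `X-Scope-OrgID` header, as described in the authentication section of the Cortex API docs.

```python
import json
import urllib.request

# Path taken from the endpoint table in this document.
OVERRIDES_PATH = "/api/v1/user-overrides"


def build_overrides_request(method, base_url, tenant_id, overrides=None):
    """Build (without sending) a request for the user-overrides endpoint.

    base_url and tenant_id are placeholders. X-Scope-OrgID is assumed to be
    the header that identifies the authenticated tenant.
    """
    if method not in ("GET", "PUT", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    # In this sketch only PUT carries a JSON body.
    if (method == "PUT") != (overrides is not None):
        raise ValueError("a JSON body is required for PUT and only for PUT")

    headers = {"X-Scope-OrgID": tenant_id}
    data = None
    if overrides is not None:
        data = json.dumps(overrides).encode("utf-8")
        headers["Content-Type"] = "application/json"

    return urllib.request.Request(
        base_url.rstrip("/") + OVERRIDES_PATH,
        data=data,
        headers=headers,
        method=method,
    )


if __name__ == "__main__":
    req = build_overrides_request(
        "PUT", "http://cortex.example:8080", "tenant-a", {"ingestion_rate": 50000}
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the method, path, and headers easy to inspect before anything reaches a live cluster.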

#### Example request body for PUT

```json
{
  "ingestion_rate": 50000,
  "max_global_series_per_user": 1000000,
  "ruler_max_rules_per_rule_group": 100
}
```

#### Supported limits

The following limits can be modified via the API:
- `max_global_series_per_user`
- `max_global_series_per_metric`
- `ingestion_rate`
- `ingestion_burst_size`
- `ruler_max_rules_per_rule_group`
- `ruler_max_rule_groups_per_tenant`
932+
#### Hard limits
933+
934+
Overrides are validated against hard limits defined in the runtime configuration file. If a requested override exceeds the hard limit for the tenant, the request will be rejected with a 400 status code.
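
The check described here can be sketched as a comparison of each requested value against a per-tenant cap. This is only an illustration under stated assumptions: the shape of `hard_limits`, the treatment of limits with no configured cap as unrestricted, and the `200` returned on success are all guesses, since the layout of the runtime configuration file and the success status are not shown in this document.

```python
def hard_limit_violations(requested, hard_limits):
    """Return {limit_name: (requested_value, hard_limit)} for values above the cap.

    hard_limits is a hypothetical per-tenant mapping of limit name to cap.
    Limits without a configured cap are treated as unrestricted here.
    """
    return {
        name: (value, hard_limits[name])
        for name, value in requested.items()
        if name in hard_limits and value > hard_limits[name]
    }


def status_for(requested, hard_limits):
    """Map the check to an HTTP status: 400 on any violation, else 200 (assumed)."""
    return 400 if hard_limit_violations(requested, hard_limits) else 200


if __name__ == "__main__":
    caps = {"ingestion_rate": 100000}  # hypothetical hard limit for one tenant
    print(status_for({"ingestion_rate": 200000}, caps))
    print(hard_limit_violations({"ingestion_rate": 200000}, caps))
```

Returning the offending values along with the caps lets an error response name the limit that was exceeded, not just report a bare 400.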

## Store-gateway

### Store-gateway ring status

docs/configuration/config-file-reference.md

Lines changed: 251 additions & 0 deletions
@@ -5465,6 +5465,257 @@ thanos_engine:
  [optimizers: <string> | default = "default"]
```

### `overrides`

The `overrides` configures the Cortex overrides API for managing user overrides.

```yaml
# Enable the overrides module.
# CLI flag: -overrides.enabled
[enabled: <boolean> | default = false]

# Path to the runtime configuration file.
# CLI flag: -overrides.runtime-config-file
[runtime_config_file: <string> | default = "runtime.yaml"]

# Backend storage to use. Supported backends are: s3, gcs, azure, swift.
# CLI flag: -overrides.backend
[backend: <string> | default = "s3"]

s3:
  # The S3 bucket endpoint. It could be an AWS S3 endpoint listed at
  # https://docs.aws.amazon.com/general/latest/gr/s3.html or the address of an
  # S3-compatible service in hostname:port format.
  # CLI flag: -overrides.s3.endpoint
  [endpoint: <string> | default = ""]

  # S3 region. If unset, the client will issue a S3 GetBucketLocation API call
  # to autodetect it.
  # CLI flag: -overrides.s3.region
  [region: <string> | default = ""]

  # S3 bucket name
  # CLI flag: -overrides.s3.bucket-name
  [bucket_name: <string> | default = ""]

  # S3 secret access key
  # CLI flag: -overrides.s3.secret-access-key
  [secret_access_key: <string> | default = ""]

  # S3 access key ID
  # CLI flag: -overrides.s3.access-key-id
  [access_key_id: <string> | default = ""]

  # If enabled, use http:// for the S3 endpoint instead of https://. This could
  # be useful in local dev/test environments while using an S3-compatible
  # backend storage, like Minio.
  # CLI flag: -overrides.s3.insecure
  [insecure: <boolean> | default = false]

  # The signature version to use for authenticating against S3. Supported values
  # are: v4, v2.
  # CLI flag: -overrides.s3.signature-version
  [signature_version: <string> | default = "v4"]

  # The s3 bucket lookup style. Supported values are: auto, virtual-hosted,
  # path.
  # CLI flag: -overrides.s3.bucket-lookup-type
  [bucket_lookup_type: <string> | default = "auto"]

  # If true, attach MD5 checksum when uploading objects and S3 uses MD5 checksum
  # algorithm to verify the provided digest. If false, use CRC32C algorithm
  # instead.
  # CLI flag: -overrides.s3.send-content-md5
  [send_content_md5: <boolean> | default = true]

  http:
    # The time an idle connection will remain idle before closing.
    # CLI flag: -overrides.s3.http.idle-conn-timeout
    [idle_conn_timeout: <duration> | default = 1m30s]

    # The amount of time the client will wait for a server's response headers.
    # CLI flag: -overrides.s3.http.response-header-timeout
    [response_header_timeout: <duration> | default = 2m]

    # Maximum time to wait for a TLS handshake. 0 means no limit.
    # CLI flag: -overrides.s3.tls-handshake-timeout
    [tls_handshake_timeout: <duration> | default = 10s]

    # The time to wait for a server's first response headers after fully writing
    # the request headers if the request has an Expect header. 0 to send the
    # request body immediately.
    # CLI flag: -overrides.s3.expect-continue-timeout
    [expect_continue_timeout: <duration> | default = 1s]

    # Maximum number of idle connections across all hosts. 0 means no limit.
    # CLI flag: -overrides.s3.max-idle-conns
    [max_idle_conns: <int> | default = 100]

    # Maximum number of idle connections per host. 0 means no limit.
    # CLI flag: -overrides.s3.max-idle-conns-per-host
    [max_idle_conns_per_host: <int> | default = 100]

gcs:
  # GCS bucket name
  # CLI flag: -overrides.gcs.bucket-name
  [bucket_name: <string> | default = ""]

  # JSON either from a file or inline.
  # CLI flag: -overrides.gcs.service-account
  [service_account: <string> | default = ""]

azure:
  # Azure storage account name
  # CLI flag: -overrides.azure.account-name
  [account_name: <string> | default = ""]

  # Azure storage account key
  # CLI flag: -overrides.azure.account-key
  [account_key: <string> | default = ""]

  # Azure storage container name
  # CLI flag: -overrides.azure.container-name
  [container_name: <string> | default = ""]

  # Azure storage endpoint suffix without schema. The account name will be
  # prefixed to this value to create the FQDN. If set to empty string, default
  # endpoint suffix will be used.
  # CLI flag: -overrides.azure.endpoint-suffix
  [endpoint_suffix: <string> | default = ""]

  # Azure storage max retry attempts
  # CLI flag: -overrides.azure.max-retries
  [max_retries: <int> | default = 20]

  # Azure storage user domain
  # CLI flag: -overrides.azure.user-domain
  [user_domain: <string> | default = ""]

  # Azure storage tenant ID
  # CLI flag: -overrides.azure.tenant-id
  [tenant_id: <string> | default = ""]

  # Azure storage client ID
  # CLI flag: -overrides.azure.client-id
  [client_id: <string> | default = ""]

  # Azure storage client secret
  # CLI flag: -overrides.azure.client-secret
  [client_secret: <string> | default = ""]

  # Azure storage subscription ID
  # CLI flag: -overrides.azure.subscription-id
  [subscription_id: <string> | default = ""]

  # Azure storage environment
  # CLI flag: -overrides.azure.environment
  [environment: <string> | default = "AzurePublicCloud"]

  # The time an idle connection will remain idle before closing.
  # CLI flag: -overrides.azure.idle-conn-timeout
  [idle_conn_timeout: <duration> | default = 1m30s]

  # The amount of time the client will wait for a server's response headers.
  # CLI flag: -overrides.azure.response-header-timeout
  [response_header_timeout: <duration> | default = 2m]

  # Maximum time to wait for a TLS handshake. 0 means no limit.
  # CLI flag: -overrides.azure.tls-handshake-timeout
  [tls_handshake_timeout: <duration> | default = 10s]

  # The time to wait for a server's first response headers after fully writing
  # the request headers if the request has an Expect header. 0 to send the
  # request body immediately.
  # CLI flag: -overrides.azure.expect-continue-timeout
  [expect_continue_timeout: <duration> | default = 1s]

  # Maximum number of idle connections across all hosts. 0 means no limit.
  # CLI flag: -overrides.azure.max-idle-conns
  [max_idle_conns: <int> | default = 100]

  # Maximum number of idle connections per host. 0 means no limit.
  # CLI flag: -overrides.azure.max-idle-conns-per-host
  [max_idle_conns_per_host: <int> | default = 100]

swift:
  # OpenStack Swift authentication API version. 0 to autodetect.
  # CLI flag: -overrides.swift.auth-version
  [auth_version: <int> | default = 0]

  # OpenStack Swift authentication URL
  # CLI flag: -overrides.swift.auth-url
  [auth_url: <string> | default = ""]

  # OpenStack Swift username
  # CLI flag: -overrides.swift.username
  [username: <string> | default = ""]

  # OpenStack Swift user's domain name
  # CLI flag: -overrides.swift.user-domain-name
  [user_domain_name: <string> | default = ""]

  # OpenStack Swift user's domain ID
  # CLI flag: -overrides.swift.user-domain-id
  [user_domain_id: <string> | default = ""]

  # OpenStack Swift user ID
  # CLI flag: -overrides.swift.user-id
  [user_id: <string> | default = ""]

  # OpenStack Swift user's password
  # CLI flag: -overrides.swift.password
  [password: <string> | default = ""]

  # OpenStack Swift user's domain ID
  # CLI flag: -overrides.swift.domain-id
  [domain_id: <string> | default = ""]

  # OpenStack Swift domain name
  # CLI flag: -overrides.swift.domain-name
  [domain_name: <string> | default = ""]

  # OpenStack Swift project ID
  # CLI flag: -overrides.swift.project-id
  [project_id: <string> | default = ""]

  # OpenStack Swift project name
  # CLI flag: -overrides.swift.project-name
  [project_name: <string> | default = ""]

  # OpenStack Swift project domain ID
  # CLI flag: -overrides.swift.project-domain-id
  [project_domain_id: <string> | default = ""]

  # OpenStack Swift project domain name
  # CLI flag: -overrides.swift.project-domain-name
  [project_domain_name: <string> | default = ""]

  # OpenStack Swift region name
  # CLI flag: -overrides.swift.region-name
  [region_name: <string> | default = ""]

  # OpenStack Swift container name
  # CLI flag: -overrides.swift.container-name
  [container_name: <string> | default = ""]

  # OpenStack Swift max retry attempts
  # CLI flag: -overrides.swift.max-retries
  [max_retries: <int> | default = 3]

  # OpenStack Swift connect timeout
  # CLI flag: -overrides.swift.connect-timeout
  [connect_timeout: <duration> | default = 10s]

  # OpenStack Swift request timeout
  # CLI flag: -overrides.swift.request-timeout
  [request_timeout: <duration> | default = 5s]
```

### `ruler_storage_config`

The `ruler_storage_config` configures the Cortex ruler storage backend.
