Fix observability stack #239
Open
What was changed
✅ Loki 3.0 is running without errors and properly configured
✅ Prometheus successfully scrapes metrics from all Temporal services
✅ Grafana 12.2.1 is deployed with proper datasource configuration
✅ All 4 dashboards are provisioned and display Temporal metrics correctly
✅ Consistent YAML formatting across all provisioning files (camelCase field names)
✅ No compatibility or configuration errors in logs
Why?
Because of the breaking changes in Loki 3.0 mentioned in this issue. I started by fixing the Loki configuration, but that didn't help and I still couldn't see any dashboards in Grafana.
Checklist
Closes 215
How was this tested:
docker compose -f docker-compose-multirole.yaml up
and then go to Grafana to ensure dashboards are working
Any docs updates needed?
No
Detailed description of changes made and their explanations
1. Loki 3.0 Configuration Update
File: deployment/loki/local-config.yaml
Changes Made
1.1 Updated Schema Version (v9 → v13)
- Schema version changed from v9 to v13
1.2 Changed Index Store (boltdb → tsdb)
- Changed store: boltdb to store: tsdb in schema_config
1.3 Updated Storage Configuration
- Replaced the boltdb storage section (with directory: /tmp/loki/index) by a tsdb_shipper configuration with:
  - active_index_directory: /tmp/loki/tsdb
  - cache_location: /tmp/loki/index_cache
  - cache_ttl: 24h
1.4 Optimized Index Period
- Index period changed from 168h to 24h
1.5 Added Compactor Configuration
1.6 Removed Deprecated Fields
- enforce_metric_name from limits_config
- chunk_store_config section (with max_look_back_period)
- table_manager section (only used with DynamoDB)
2. Prometheus Configuration Fix
File: deployment/prometheus/config.yml
Changes Made
2.1 Updated Scrape Targets from host.docker.internal to Container Names
- Targeting host.docker.internal (which points to the host machine) was incorrect.
- Updated targets in the temporalmetrics job:
  - host.docker.internal:8000 → temporal-history:8000
  - host.docker.internal:8001 → temporal-matching:8001
  - host.docker.internal:8002 → temporal-frontend:8002
  - host.docker.internal:8003 → temporal-worker:8003
  - host.docker.internal:8004 → temporal-frontend2:8004
Result: Prometheus can now successfully scrape metrics from the Temporal services, and dashboards have access to the required metrics (service_requests, service_errors, etc.)
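As an illustration of the target change above, the temporalmetrics job might look roughly like the fragment below. This is a sketch rather than the actual file: the job name and the five targets come from this description, while the use of static_configs and the absence of other options (scrape_interval, labels, and so on) are assumptions.

```yaml
# Sketch of the temporalmetrics job in deployment/prometheus/config.yml.
# Assumption: targets are listed under static_configs; the real file may
# set further options that are not shown here.
scrape_configs:
  - job_name: temporalmetrics
    static_configs:
      - targets:
          # Container names instead of host.docker.internal, assuming
          # Prometheus and the Temporal services share a Compose network.
          - temporal-history:8000
          - temporal-matching:8001
          - temporal-frontend:8002
          - temporal-worker:8003
          - temporal-frontend2:8004
```

Once the stack is up, the Targets page in the Prometheus UI should list each of these endpoints, which is a quick way to confirm the scrape configuration.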
3. Grafana Datasource Configuration Fix
File: deployment/grafana/provisioning/datasources/all.yml
Changes Made
3.1 Fixed isDefault Field Format
- The correct field is isDefault: true (camelCase), not is_default: true (snake_case)
- Changed is_default: true to isDefault: true
3.2 Standardized Field Naming (org_id → orgId)
- Changed org_id: 1 to orgId: 1 for both Prometheus and Loki datasources
- Added isDefault: false to Loki datasource for clarity
Result: Prometheus is now properly set as the default datasource, allowing dashboards with the $datasource variable to work correctly. Both provisioning files now use consistent camelCase field naming.
4. Grafana Dashboard Provisioning Update
File: deployment/grafana/provisioning/dashboards/all.yml
Changes Made
4.1 Updated to New Dashboard Provisioning Format
- The new format uses an apiVersion and providers block:
  - Added apiVersion: 1 at the top
  - Added providers: block
  - Removed folder property (replaced with path)
  - Changed org_id to orgId
  - Set disableDeletion: false and editable: true
Result: Eliminates deprecation warnings and ensures proper dashboard provisioning in Grafana 12.2.1
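For readers who want to see the shape of the two provisioning files after the changes in sections 3 and 4, here is a hedged sketch. The camelCase field names (orgId, isDefault, disableDeletion, editable), apiVersion: 1, and the providers block come from this description. The provider name, the dashboards path, the orgId value in the dashboards file, the datasource URLs, the access mode, and the apiVersion line in the datasources file are hypothetical placeholders, not values taken from the PR.

```yaml
# --- deployment/grafana/provisioning/dashboards/all.yml (sketch) ---
apiVersion: 1
providers:
  - name: default                          # hypothetical provider name
    orgId: 1                               # assumed value
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /var/lib/grafana/dashboards    # hypothetical path
---
# --- deployment/grafana/provisioning/datasources/all.yml (sketch) ---
apiVersion: 1                              # assumed; not mentioned in the PR
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                          # assumed access mode
    orgId: 1
    url: http://prometheus:9090            # hypothetical URL
    isDefault: true
  - name: Loki
    type: loki
    access: proxy                          # assumed access mode
    orgId: 1
    url: http://loki:3100                  # hypothetical URL
    isDefault: false
```

The two documents are separated by --- only to fit in one block; in the repository they are separate files.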
5. Grafana Version Upgrade
File: docker-compose-multirole.yaml
Changes Made
5.1 Updated Grafana Image Version
- Changed grafana/grafana:7.5.16 to grafana/grafana:12.2.1
Result: Dashboards now load without compatibility errors and display data correctly
6. Grafana Volume Mounts Configuration
File: docker-compose-multirole.yaml
Changes Made
6.1 Added Missing Volume Mounts
Result: All 4 dashboards are now properly provisioned and visible in Grafana: