Add scrape_samples_scraped metric to telemetry when debug mode is enabled (#1055)

[comment]: # (Note that your PR title should follow the conventional
commit format: https://conventionalcommits.org/en/v1.0.0/#summary)
# PR Description
Add scrape_samples_scraped metric to telemetry when debug mode is
enabled
[comment]: # (The below checklist is for PRs adding new features. If a
box is not checked, add a reason why it's not needed.)
# New Feature Checklist

- [ ] List telemetry added about the feature.
- [ ] Link to the one-pager about the feature.
- [ ] List any tasks necessary for release (3P docs, AKS RP chart
changes, etc.) after merging the PR.
- [ ] Attach results of scale and perf testing.

**This is to add telemetry and it will go out with the next AKS RP
release.**

[comment]: # (The below checklist is for code changes. Not all boxes
necessarily need to be checked. Build, doc, and template changes do not
need to fill out the checklist.)
# Tests Checklist

- [ ] Have end-to-end Ginkgo tests been run on your cluster and passed?
To bootstrap your cluster to run the tests, follow [these
instructions](/otelcollector/test/README.md#bootstrap-a-dev-cluster-to-run-ginkgo-tests).
  - Labels used when running the tests on your cluster:
    - [ ] `operator`
    - [ ] `windows`
    - [ ] `arm64`
    - [ ] `arc-extension`
    - [ ] `fips`
- [ ] Have new tests been added? For features, have tests been added for
this feature? For fixes, is there a test that could have caught this
issue and could validate that the fix works?
  - [ ] Is a new scrape job needed?
    - [ ] The scrape job was added to the folder
      [test-cluster-yamls](/otelcollector/test/test-cluster-yamls/) in the
      correct configmap or as a CR.
  - [ ] Was a new test label added?
    - [ ] A string constant for the label was added to
      [constants.go](/otelcollector/test/utils/constants.go).
    - [ ] The label and description was added to the [test
      README](/otelcollector/test/README.md).
    - [ ] The label was added to this [PR
      checklist](/.github/pull_request_template).
    - [ ] The label was added as needed to
      [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).
  - [ ] Are additional API server permissions needed for the new tests?
    - [ ] These permissions have been added to
      [api-server-permissions.yaml](/otelcollector/test/testkube/api-server-permissions.yaml).
  - [ ] Was a new test suite (a new folder under `/tests`) added?
    - [ ] The new test suite is included in
      [testkube-test-crs.yaml](/otelcollector/test/testkube/testkube-test-crs.yaml).

---------

Co-authored-by: Grace Wehner <[email protected]>
3 people authored Feb 11, 2025
1 parent 98b5516 commit 4900ba6
Showing 18 changed files with 243 additions and 23 deletions.
13 changes: 12 additions & 1 deletion RELEASENOTES.md
@@ -1,9 +1,20 @@
# Azure Monitor Metrics for AKS clusters
## Release 01-16-2025
## Release TBD
* Linux image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:<tbd>`
* Windows image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:<tbd>-win`
* TA image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:<tbd>-targetallocator`
* cfg sidecar image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:<tbd>-cfg`
* AKS and Arc Container Images:
- Add scrape_samples_scraped metric to telemetry when debug mode is enabled (https://github.com/Azure/prometheus-collector/pull/1055)
* Arc Extension Chart:

* Pipeline/Docs/Templates Updates:

## Release 01-16-2025
* Linux image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:6.14.0-main-01-16-2025-8d52acfe`
* Windows image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:6.14.0-main-01-16-2025-8d52acfe-win`
* TA image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:6.14.0-main-01-16-2025-8d52acfe-targetallocator`
* cfg sidecar image - `mcr.microsoft.com/azuremonitor/containerinsights/ciprod/prometheus-collector/images:6.14.0-main-01-16-2025-8d52acfe-cfg`
* AKS and Arc Container Images:
* Add support for global settings (https://github.com/Azure/prometheus-collector/pull/1003)
* Sign flagged binaries for windows containers (https://github.com/Azure/prometheus-collector/pull/1001)
5 changes: 5 additions & 0 deletions otelcollector/configuration-reader-builder/main.go
@@ -44,6 +44,11 @@ type OtelConfig struct {
Processors interface{} `yaml:"processors"`
Receivers interface{} `yaml:"receivers"`
} `yaml:"metrics"`
MetricsTelemetry struct {
Exporters interface{} `yaml:"exporters,omitempty"`
Processors interface{} `yaml:"processors,omitempty"`
Receivers interface{} `yaml:"receivers,omitempty"`
} `yaml:"metrics/telemetry,omitempty"`
} `yaml:"pipelines"`
Telemetry struct {
Logs struct {
12 changes: 12 additions & 0 deletions otelcollector/fluent-bit/fluent-bit-daemonset.yaml
@@ -113,6 +113,18 @@ pipeline:
- name: labels
delete: transport

- name: prometheus_scrape
host: 127.0.0.1
port: 9095
tag: prometheus.metrics.volume
metrics_path: /metrics
scrape_interval: 1m
processors:
metrics:
- name: metrics_selector
metric_name: /scrape_samples_post_metric_relabeling/
action: include

filters:
- name: rewrite_tag
match: prometheus.metricsextension
11 changes: 11 additions & 0 deletions otelcollector/fluent-bit/fluent-bit.yaml
@@ -111,6 +111,17 @@ pipeline:
metric_name: /opentelemetry_allocator_targets|opentelemetry_allocator_collectors_discovered/
action: include

- name: prometheus_scrape
host: 127.0.0.1
port: 9095
tag: prometheus.metrics.volume
metrics_path: /metrics
scrape_interval: 1m
processors:
metrics:
- name: metrics_selector
metric_name: /scrape_samples_post_metric_relabeling/
action: include

filters:
- name: rewrite_tag
35 changes: 35 additions & 0 deletions otelcollector/fluent-bit/src/cmetrics_decoder.go
@@ -146,11 +146,45 @@ func (cm CMetrics) String() string {
return ret.String()
}

func SendPrometheusMetricsToAppInsightsAfterAgg(records []map[interface{}]interface{}, aggregationLabel string, telemetryPrefix string, metricName string) {
aggregatedValueByLabel := make(map[string]float64)

for _, record := range records {
cMetrics := ConvertRecordToCMetrics(record)
for _, metric := range cMetrics.Metrics {
for _, value := range metric.Values {
for i, labelName := range metric.Meta.Labels {
if strings.ToLower(labelName) == aggregationLabel {
if _, ok := aggregatedValueByLabel[value.Labels[i]]; !ok {
aggregatedValueByLabel[value.Labels[i]] = value.Value
} else {
aggregatedValueByLabel[value.Labels[i]] += value.Value
}
}
}
}
}
for labelName, value := range aggregatedValueByLabel {
metricTelemetryItem := appinsights.NewMetricTelemetry(
fmt.Sprintf("%s_%s", telemetryPrefix, metricName),
value,
)
metricTelemetryItem.Properties[aggregationLabel] = fmt.Sprintf("%s", labelName)
TelemetryClient.Track(metricTelemetryItem)
Log(fmt.Sprintf("Sent telemetry for %s_%s", telemetryPrefix, metricName))
}
}
}

func SendPrometheusMetricsToAppInsights(records []map[interface{}]interface{}, tag string) int {
telemetryPrefix := "prometheus"
if tag == "prometheus.metrics.targetallocator" {
telemetryPrefix = "target_allocator"
}

if tag == "prometheus.metrics.volume" {
SendPrometheusMetricsToAppInsightsAfterAgg(records, "job", telemetryPrefix, "scrape_samples_post_metric_relabeling")
} else {
for _, record := range records {
cMetrics := ConvertRecordToCMetrics(record)
for _, metric := range cMetrics.Metrics {
@@ -167,6 +201,7 @@ func SendPrometheusMetricsToAppInsights(records []map[interface{}]interface{}, t
}
}
}
}
return output.FLB_OK
}
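
The per-label aggregation added in `SendPrometheusMetricsToAppInsightsAfterAgg` can be sketched in isolation as below. This is a minimal illustration, not the repository's code: the `metric` and `sample` types and the `sumByLabel` function are hypothetical stand-ins for the decoded CMetrics structures, and the App Insights call is left out. Two choices here are assumptions rather than what the diff does: the sketch compares label names with `strings.EqualFold` instead of lower-casing one side, and it returns the totals only after all input has been consumed, whereas by brace count the diff's send loop appears to sit inside the per-record loop.

```go
package main

import (
	"fmt"
	"strings"
)

// sample is a hypothetical stand-in for one decoded metric value.
// labelValues is positionally aligned with the owning metric's labelNames.
type sample struct {
	labelValues []string
	value       float64
}

// metric is a hypothetical stand-in for one decoded CMetrics metric.
type metric struct {
	labelNames []string
	samples    []sample
}

// sumByLabel adds up sample values grouped by the value of the given label.
// Label names are matched case-insensitively.
func sumByLabel(metrics []metric, label string) map[string]float64 {
	totals := make(map[string]float64)
	for _, m := range metrics {
		for i, name := range m.labelNames {
			if !strings.EqualFold(name, label) {
				continue
			}
			for _, s := range m.samples {
				// Skip samples that carry fewer label values than names.
				if i < len(s.labelValues) {
					// A missing map key reads as 0, so no existence check is needed.
					totals[s.labelValues[i]] += s.value
				}
			}
		}
	}
	return totals
}

func main() {
	metrics := []metric{{
		labelNames: []string{"job", "instance"},
		samples: []sample{
			{labelValues: []string{"kubelet", "node-a"}, value: 100},
			{labelValues: []string{"kubelet", "node-b"}, value: 50},
			{labelValues: []string{"cadvisor", "node-a"}, value: 25},
		},
	}}
	for job, total := range sumByLabel(metrics, "job") {
		// Mirrors the "<prefix>_<metric>" naming used in the diff.
		fmt.Printf("prometheus_scrape_samples_post_metric_relabeling job=%s value=%v\n", job, total)
	}
}
```

Because a Go map returns the zero value for a missing key, the `+=` alone covers both the first and later occurrences of a label value.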

2 changes: 1 addition & 1 deletion otelcollector/fluent-bit/src/out_appinsights.go
@@ -96,7 +96,7 @@ func FLBPluginFlush(data unsafe.Pointer, length C.int, tag *C.char) int {
case fluentbitExportingFailedTag:
return RecordExportingFailed(records)
// Prometheus metrics from otelcollector, Prometheus UX, and targetallocator
case "prometheus.metrics.otelcollector", "prometheus.metrics.prometheus", "prometheus.metrics.targetallocator":
case "prometheus.metrics.otelcollector", "prometheus.metrics.prometheus", "prometheus.metrics.targetallocator", "prometheus.metrics.volume":
return SendPrometheusMetricsToAppInsights(records, incomingTag)
default:
// Error messages from metrics extension and otelcollector
@@ -3,6 +3,8 @@ exporters:
endpoint: "127.0.0.1:9091"
const_labels:
cluster: ${env:AZMON_CLUSTER_LABEL}
prometheus/telemetry:
endpoint: "127.0.0.1:9095"
otlp:
endpoint: 127.0.0.1:55680
tls:
@@ -27,6 +29,10 @@ processors:
- key: instance
from_attribute: service.instance.id
action: insert
filter/telemetry:
metrics:
metric:
- 'name != "scrape_samples_post_metric_relabeling"'
receivers:
prometheus:
config:
@@ -3,6 +3,8 @@ exporters:
endpoint: "127.0.0.1:9091"
const_labels:
cluster: ${env:AZMON_CLUSTER_LABEL}
prometheus/telemetry:
endpoint: "127.0.0.1:9095"
otlp:
endpoint: 127.0.0.1:55680
tls:
@@ -27,6 +29,10 @@ processors:
- key: instance
from_attribute: service.instance.id
action: insert
filter/telemetry:
metrics:
metric:
- 'name != "scrape_samples_post_metric_relabeling"'
receivers:
prometheus:
target_allocator:
@@ -3,6 +3,8 @@ exporters:
endpoint: "127.0.0.1:9091"
const_labels:
cluster: ${env:AZMON_CLUSTER_LABEL}
prometheus/telemetry:
endpoint: "127.0.0.1:9095"
otlp:
endpoint: 127.0.0.1:55680
tls:
@@ -27,6 +29,10 @@ processors:
- key: instance
from_attribute: service.instance.id
action: insert
filter/telemetry:
metrics:
metric:
- 'name != "scrape_samples_post_metric_relabeling"'
receivers:
prometheus:
config:
3 changes: 3 additions & 0 deletions otelcollector/opentelemetry-collector-builder/components.go
@@ -2,6 +2,7 @@ package main

import (
"github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/resourceprocessor"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver"
"go.opentelemetry.io/collector/exporter"
@@ -10,6 +11,7 @@ import (
"go.opentelemetry.io/collector/otelcol"
"go.opentelemetry.io/collector/processor"
"go.opentelemetry.io/collector/processor/batchprocessor"

"go.opentelemetry.io/collector/receiver"
)

@@ -40,6 +42,7 @@ func components() (otelcol.Factories, error) {
factories.Processors, err = processor.MakeFactoryMap(
batchprocessor.NewFactory(),
resourceprocessor.NewFactory(),
filterprocessor.NewFactory(),
)
if err != nil {
return otelcol.Factories{}, err
15 changes: 15 additions & 0 deletions otelcollector/opentelemetry-collector-builder/go.mod
@@ -6,6 +6,7 @@ replace github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prome

require (
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter v0.116.0
github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor v0.116.0
github.com/open-telemetry/opentelemetry-collector-contrib/processor/resourceprocessor v0.116.0
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver v0.116.0
go.opentelemetry.io/collector/component v0.116.0
@@ -33,7 +34,10 @@ require (
github.com/AzureAD/microsoft-authentication-library-for-go v1.2.2 // indirect
github.com/Code-Hex/go-generics-cache v1.5.1 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/alecthomas/participle/v2 v2.1.1 // indirect
github.com/alecthomas/units v0.0.0-20240626203959-61d1e3462e30 // indirect
github.com/antchfx/xmlquery v1.4.2 // indirect
github.com/antchfx/xpath v1.3.2 // indirect
github.com/armon/go-metrics v0.4.1 // indirect
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
github.com/aws/aws-sdk-go v1.54.19 // indirect
@@ -52,9 +56,12 @@ require (
github.com/docker/go-units v0.5.0 // indirect
github.com/ebitengine/purego v0.8.1 // indirect
github.com/edsrzf/mmap-go v1.1.0 // indirect
github.com/elastic/go-grok v0.3.1 // indirect
github.com/elastic/lunes v0.1.0 // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/envoyproxy/go-control-plane v0.13.0 // indirect
github.com/envoyproxy/protoc-gen-validate v1.1.0 // indirect
github.com/expr-lang/expr v1.16.9 // indirect
github.com/facette/natsort v0.0.0-20181210072756-2cd4dd1e2dcb // indirect
github.com/fatih/color v1.16.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
@@ -76,6 +83,8 @@ require (
github.com/go-resty/resty/v2 v2.13.1 // indirect
github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
github.com/go-zookeeper/zk v1.0.3 // indirect
github.com/gobwas/glob v0.2.3 // indirect
github.com/goccy/go-json v0.10.4 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang-jwt/jwt/v5 v5.2.1 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
@@ -104,9 +113,11 @@ require (
github.com/hashicorp/go-rootcerts v1.0.2 // indirect
github.com/hashicorp/go-version v1.7.0 // indirect
github.com/hashicorp/golang-lru v1.0.2 // indirect
github.com/hashicorp/golang-lru/v2 v2.0.7 // indirect
github.com/hashicorp/nomad/api v0.0.0-20240717122358-3d93bd3778f3 // indirect
github.com/hashicorp/serf v0.10.1 // indirect
github.com/hetznercloud/hcloud-go/v2 v2.10.2 // indirect
github.com/iancoleman/strcase v0.3.0 // indirect
github.com/imdario/mergo v0.3.16 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
github.com/ionos-cloud/sdk-go/v6 v6.1.11 // indirect
@@ -123,6 +134,7 @@ require (
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/linode/linodego v1.37.0 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/magefile/mage v1.15.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.20 // indirect
@@ -139,6 +151,8 @@ require (
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f // indirect
github.com/oklog/ulid v1.3.1 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/coreinternal v0.116.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/filter v0.116.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl v0.116.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil v0.116.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/resourcetotelemetry v0.116.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/prometheus v0.116.0 // indirect
@@ -169,6 +183,7 @@ require (
github.com/stretchr/testify v1.10.0 // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/ua-parser/uap-go v0.0.0-20240611065828-3a4781585db6 // indirect
github.com/vultr/govultr/v2 v2.17.2 // indirect
github.com/yusufpapurcu/wmi v1.2.4 // indirect
go.mongodb.org/mongo-driver v1.14.0 // indirect