support metrics v3 in fakeintake #51150

Open
atanzu wants to merge 5 commits into main from mark.kirichenko/support-v3-in-fakeintake

@atanzu atanzu commented May 21, 2026

What does this PR do?

Allows us to test V3 payloads with fakeintake.

Motivation

The current e2e test setup doesn't support the V3 payload format. We want to close this gap.

We add V3 metrics intake support, which relies on the V3 parser from the intake. We cannot import this module directly, so we use a copy of it.

Describe how you validated your changes

We add new e2e tests for our usual metrics and for OTel data which verify that we can send data to a V3 endpoint instead of V2.

Additional Notes

The V3 payload parser is a copy of the one in the intake, so any change must be applied to both in parallel.

@github-actions

@codex review

@github-actions github-actions Bot added the long review PR is complex, plan time to review it label May 21, 2026
datadog-prod-us1-5 Bot commented May 21, 2026

Pipelines

⚠️ Warnings

🚦 2 pipeline jobs failed

DataDog/datadog-agent | new-e2e-ha-agent-failover

This looks flaky and may succeed on retry. 2 failed tests in TestHAAgentFailoverSuite: expected agent status to be 'active' but got 'standby', with connection refused errors for metadata and status calls.

DataDog/datadog-agent | new-e2e-cws-windows

Test failures due to unexpected errors: 'no log found' and 'condition never satisfied' during TestAgentWindowsSuite.

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 10.28%
Overall Coverage: 50.45% (-0.04%)

🔗 Commit SHA: 4f0c860

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69873fff60

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

"/api/v1/check_run": getCheckRunPayLoadJSON,
"/api/v1/connections": getConnectionsPayLoadProtobuf,
"/api/beta/sketches": getSketchPayloadProtobuf,
"/api/intake/metrics/v3/series": getMetricV3SeriesPayload,

P2 Badge Register V3 sketches route alongside V3 series

This change enables serializer_experimental_use_v3_api.sketches.endpoints (host and Helm paths), so when V3 is turned on the agent sends sketch payloads to /api/intake/metrics/v3/sketches; however fakeintake only adds a parser for /api/intake/metrics/v3/series here, and FilterSketches() still reads /api/beta/sketches. In V3 mode, distribution/sketch assertions will therefore silently see no data even though payloads are being sent, which breaks end-to-end metric coverage for sketches.
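To make the gap described above concrete, here is a minimal sketch of a route-to-parser lookup. It assumes a simplified model: `parserFunc`, the placeholder parser bodies, and `parse` are hypothetical and are not fakeintake's actual code; only the route strings come from the review.

```go
package main

import "fmt"

// parserFunc is a hypothetical stand-in for fakeintake's payload parsers.
type parserFunc func(raw []byte) (string, error)

// parsers mirrors the shape of the route map quoted above. The function
// bodies are placeholders, not the real parsers.
var parsers = map[string]parserFunc{
	"/api/beta/sketches":            func([]byte) (string, error) { return "v2 sketches", nil },
	"/api/intake/metrics/v3/series": func([]byte) (string, error) { return "v3 series", nil },
}

// parse returns an error for routes without a registered parser, so a
// missing route fails loudly instead of producing an empty result.
func parse(route string, raw []byte) (string, error) {
	p, ok := parsers[route]
	if !ok {
		return "", fmt.Errorf("no parser registered for route %q", route)
	}
	return p(raw)
}

func main() {
	for _, route := range []string{
		"/api/intake/metrics/v3/series",
		"/api/intake/metrics/v3/sketches",
	} {
		out, err := parse(route, nil)
		fmt.Println(route, out, err)
	}
}
```

With a lookup of this shape, a payload sent to /api/intake/metrics/v3/sketches surfaces as an explicit error rather than as an empty query result.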


dd-octo-sts Bot commented May 21, 2026

Files inventory check summary

File checks results against ancestor faf049aa:

Results for datadog-agent_7.81.0~devel.git.368.4f0c860.pipeline.116250035-1_amd64.deb:

No change detected

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from 69873ff to 46d4a45 Compare May 21, 2026 11:35
@github-actions

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 46d4a4504b

"/api/v1/check_run": getCheckRunPayLoadJSON,
"/api/v1/connections": getConnectionsPayLoadProtobuf,
"/api/beta/sketches": getSketchPayloadProtobuf,
"/api/intake/metrics/v3/series": getMetricV3SeriesPayload,

P1 Badge Parse V3 sketches when V3 metrics mode is enabled

This map adds /api/intake/metrics/v3/series but leaves out /api/intake/metrics/v3/sketches, even though WithV3MetricsEnabled now configures both series and sketches V3 endpoints. In that configuration, sketch payloads are sent to the V3 sketches route (comp/forwarder/defaultforwarder/endpoints/endpoints.go) and fakeintake won’t decode them (and FilterSketches still reads /api/beta/sketches), so sketch-based assertions under V3 mode silently miss data. Either add V3-sketch parsing/client support or stop enabling the sketches V3 endpoint until that support exists.


@atanzu atanzu changed the title Mark.kirichenko/support v3 in fakeintake support v3 in fakeintake May 21, 2026
@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from 46d4a45 to b2b4d07 Compare May 21, 2026 12:17
@github-actions

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b2b4d07492

"/api/v1/check_run": getCheckRunPayLoadJSON,
"/api/v1/connections": getConnectionsPayLoadProtobuf,
"/api/beta/sketches": getSketchPayloadProtobuf,
"/api/intake/metrics/v3/series": getMetricV3SeriesPayload,

P2 Badge Add V3 sketches route parsing to fakeintake

When V3 is enabled, the new config paths set both series and sketches V3 endpoints to fakeintake, so sketch payloads are sent to /api/intake/metrics/v3/sketches. This parser map only adds /api/intake/metrics/v3/series, so V3 sketch payloads remain unparsed and FilterSketches() (which still reads /api/beta/sketches) will silently miss them. Any V3-enabled tests asserting distributions/sketches can therefore fail or give false negatives.


Member

I don't know if we have tests using sketches but I think it's worth a check?

Contributor Author

I agree with @vickenty:

We can leave sketches out entirely from the config, or separate them from series so we can enable it without sketches.

But within the scope of this PR I believe it is OK.

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from b2b4d07 to 8995a94 Compare May 21, 2026 13:32
@github-actions

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8995a94b17

Comment on lines +367 to +369
sketches:
  endpoints:
    - %[3]s://%[1]s:%[2]d

P1 Badge Stop rerouting sketches until fakeintake can read v3 sketches

This option now enables serializer_experimental_use_v3_api.sketches.endpoints, which redirects sketch/distribution traffic to /api/intake/metrics/v3/sketches, but fakeintake still has no corresponding v3-sketch retrieval path (FilterSketches still queries /api/beta/sketches, and there is no /api/intake/metrics/v3/sketches handler in serverstore/parser.go). As soon as a v3-enabled test needs to assert sketch metrics, it will get empty results despite sketches being sent.


Contributor

We can leave sketches out entirely from the config, or separate them from series so we can enable it without sketches.

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from 8995a94 to ca9624c Compare May 21, 2026 14:48
@github-actions

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca9624c4fe

Comment on lines +367 to +369
sketches:
  endpoints:
    - %[3]s://%[1]s:%[2]d

P1 Badge Don’t enable V3 sketches without fakeintake sketch support

This now configures serializer_experimental_use_v3_api.sketches.endpoints, so sketch/distribution payloads are sent to /api/intake/metrics/v3/sketches, but fakeintake still only exposes sketch parsing/querying on /api/beta/sketches (FilterSketches/GetSketchNames in test/fakeintake/client/client.go). In suites that enable V3 and assert on sketch metrics, those assertions will incorrectly see no data because sketches have been rerouted to an unsupported endpoint.


@atanzu atanzu added the changelog/no-changelog No changelog entry needed label May 22, 2026
@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from ca9624c to 3655bb2 Compare May 22, 2026 06:23
@github-actions

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3655bb2f3e

Comment on lines +367 to +369
sketches:
  endpoints:
    - %[3]s://%[1]s:%[2]d

P2 Badge Avoid pointing sketches to unsupported V3 fakeintake route

WithV3MetricsEnabled also rewrites sketch traffic to V3, but this change set only adds fakeintake support for /api/intake/metrics/v3/series (new parser/client path) and adds no parser/client support for /api/intake/metrics/v3/sketches. In E2E runs that emit distribution/sketch metrics with V3 enabled, sketch payloads are sent to an endpoint fakeintake cannot decode/query, so FilterSketches-style assertions will silently see no data and tests can miss or misreport regressions.


@atanzu atanzu added the qa/done QA done before merge and regressions are covered by tests label May 22, 2026
@atanzu atanzu marked this pull request as ready for review May 22, 2026 10:06
@atanzu atanzu requested review from a team as code owners May 22, 2026 10:06
@atanzu atanzu requested review from a team as code owners May 22, 2026 10:06
@atanzu atanzu requested review from agagniere and removed request for a team May 22, 2026 10:06
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3655bb2f3e

}
envVars := pulumi.Map{
"DD_SERIALIZER_EXPERIMENTAL_USE_V3_API_SERIES_ENDPOINTS": pulumi.Sprintf("%s", p.intakeURL),
"DD_SERIALIZER_EXPERIMENTAL_USE_V3_API_SKETCHES_ENDPOINTS": pulumi.Sprintf("%s", p.intakeURL),

P1 Badge Avoid rerouting sketches until fakeintake parses V3 sketches

WithV3MetricsEnabled now sets DD_SERIALIZER_EXPERIMENTAL_USE_V3_API_SKETCHES_ENDPOINTS, which moves distribution/sketch traffic to /api/intake/metrics/v3/sketches, but this change only adds V3 handling for /api/intake/metrics/v3/series (test/fakeintake/server/serverstore/parser.go) and the client still reads sketches from /api/beta/sketches (test/fakeintake/client/client.go). In V3-enabled E2E runs that assert sketch/distribution metrics, fakeintake will silently return no sketches even though the agent is sending them, causing false negatives.


@carlosroman carlosroman added this to the 7.81.0 milestone May 22, 2026
if p.intakeHostname == nil || p.intakePort == nil || p.intakeScheme == nil {
return fmt.Errorf("WithV3MetricsEnabled must be called after WithFakeintake or WithIntakeHostname")
}
v3Config := pulumi.Sprintf(`serializer_experimental_use_v3_api:
Contributor

This is my personal preference, but I try to avoid plain-text templating like this. I'd make a struct and serialize it with a proper serializer, just to avoid potential issues with quoting strings or counting spaces.

Contributor Author

I tried to play with that in rev2 but not sure if it has become significantly better.
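As an illustration of the struct-and-serializer approach suggested in this thread, here is a minimal sketch. It is not the code from rev2. The type names and `buildV3Config` are hypothetical, the URL is a placeholder, and it uses `encoding/json` only to stay within the standard library, on the assumption that the consumer accepts JSON (which YAML parsers generally do); a YAML marshaller would follow the same shape. Only the `serializer_experimental_use_v3_api`, `series`, `sketches`, and `endpoints` keys come from the discussion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// endpoints holds the list of intake URLs for one payload kind.
type endpoints struct {
	Endpoints []string `json:"endpoints"`
}

// v3API groups the per-kind settings. Nil blocks are omitted, so
// sketches can be left out of the generated config entirely.
type v3API struct {
	Series   *endpoints `json:"series,omitempty"`
	Sketches *endpoints `json:"sketches,omitempty"`
}

type agentConfig struct {
	V3 v3API `json:"serializer_experimental_use_v3_api"`
}

// buildV3Config (hypothetical helper) serializes the config instead of
// templating text, so quoting and nesting are handled by the encoder.
func buildV3Config(intakeURL string, withSketches bool) (string, error) {
	cfg := agentConfig{V3: v3API{Series: &endpoints{Endpoints: []string{intakeURL}}}}
	if withSketches {
		cfg.V3.Sketches = &endpoints{Endpoints: []string{intakeURL}}
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return "", fmt.Errorf("marshal v3 config: %w", err)
	}
	return string(out), nil
}

func main() {
	cfg, err := buildV3Config("http://fakeintake.example:80", false)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg)
}
```

Because the sketches block is a pointer with `omitempty`, enabling series without sketches is a matter of passing `false`, which matches the option of separating the two raised elsewhere in the review.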

// Store so that WithV3MetricsEnabled (applied after) can build V3 endpoints config.
p.intakeScheme = scheme
p.intakeHostname = hostname
p.intakePort = port
Contributor

With this, we format the URL twice. Would it make sense to construct it here, stash it, and use the complete string in both places?

Contributor

Now that we save the URL, do we still need to store intakeScheme and friends?
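A minimal sketch of what the two comments above point toward: build the URL once, keep only the finished string, and let the V3 option fail if it runs before the intake is known. It assumes plain strings rather than the Pulumi outputs used in the PR, and `params`, `option`, `withIntake`, `withV3Metrics`, and `newParams` are hypothetical names, not the repository's API.

```go
package main

import (
	"errors"
	"fmt"
)

// params keeps only the finished URL; scheme, host and port are not stored.
type params struct {
	intakeURL        string
	v3SeriesEndpoint string
}

type option func(*params) error

// withIntake formats the URL a single time and stashes it.
func withIntake(scheme, host string, port int) option {
	return func(p *params) error {
		p.intakeURL = fmt.Sprintf("%s://%s:%d", scheme, host, port)
		return nil
	}
}

// withV3Metrics reuses the stashed URL and rejects a wrong option order.
func withV3Metrics() option {
	return func(p *params) error {
		if p.intakeURL == "" {
			return errors.New("withV3Metrics must be applied after withIntake")
		}
		p.v3SeriesEndpoint = p.intakeURL
		return nil
	}
}

// newParams applies options in order and stops at the first error.
func newParams(opts ...option) (*params, error) {
	p := &params{}
	for _, o := range opts {
		if err := o(p); err != nil {
			return nil, err
		}
	}
	return p, nil
}

func main() {
	p, err := newParams(withIntake("http", "fakeintake.example", 80), withV3Metrics())
	if err != nil {
		panic(err)
	}
	fmt.Println(p.v3SeriesEndpoint)

	_, err = newParams(withV3Metrics())
	fmt.Println(err)
}
```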

return nil, fmt.Errorf("v3 next sketch point: %w", err)
}
}
continue
Contributor

We should return an error if a sketch is encountered in a series payload; it would be a serious bug in the agent.
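A minimal sketch of the stricter behaviour requested here, replacing a silent `continue` with an error. The `record` type, `metricKind` constants, and `decodeSeries` are hypothetical and do not reflect the real V3 decoder's types.

```go
package main

import (
	"errors"
	"fmt"
)

type metricKind int

const (
	kindSeries metricKind = iota
	kindSketch
)

// record is a hypothetical decoded entry from a V3 payload.
type record struct {
	name string
	kind metricKind
}

var errSketchInSeries = errors.New("sketch encountered in series payload")

// decodeSeries fails on the first sketch instead of skipping it, so an
// agent bug shows up as a parse error rather than as missing data.
func decodeSeries(records []record) ([]string, error) {
	names := make([]string, 0, len(records))
	for _, r := range records {
		if r.kind == kindSketch {
			return nil, fmt.Errorf("metric %q: %w", r.name, errSketchInSeries)
		}
		names = append(names, r.name)
	}
	return names, nil
}

func main() {
	_, err := decodeSeries([]record{
		{name: "system.cpu.user", kind: kindSeries},
		{name: "request.latency", kind: kindSketch},
	})
	fmt.Println(err)
}
```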

case encodingDeflate:
rc, err = zlib.NewReader(bytes.NewReader(payload))
case encodingZstd:
rc = zstd.NewReader(bytes.NewReader(payload))
Contributor

This should work with zstd with the latest head of DataDog/zstd (v1.5.8-0.20250922095318-5c504fb5d923).

Contributor Author

The problem is, I don't see a stable 1.5.8 tag. Would it be OK to pin that specific version?

Contributor

@vickenty vickenty May 22, 2026

I'll follow up on the release, but yes, it should be OK to pin a pre-release version.

Contributor

It was released as 1.5.7+patch1; +patch is how we version our changes on top of upstream.

Contributor Author

Reverted.
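For readers following the decompression switch quoted at the top of this thread, here is a minimal sketch of the same pattern using only the standard library. It deliberately leaves zstd out, since that requires a third-party package whose version is the subject of this thread. `decompress`, `compressZlib`, and the encoding strings are hypothetical and are not the PR's identifiers.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"compress/zlib"
	"fmt"
	"io"
)

// decompress picks a reader from the content encoding and returns the
// decoded payload. Unknown encodings are reported as errors.
func decompress(encoding string, payload []byte) ([]byte, error) {
	var (
		rc  io.ReadCloser
		err error
	)
	switch encoding {
	case "", "identity":
		return payload, nil
	case "deflate":
		rc, err = zlib.NewReader(bytes.NewReader(payload))
	case "gzip":
		rc, err = gzip.NewReader(bytes.NewReader(payload))
	default:
		return nil, fmt.Errorf("unsupported content encoding %q", encoding)
	}
	if err != nil {
		return nil, fmt.Errorf("open %s reader: %w", encoding, err)
	}
	defer rc.Close()
	return io.ReadAll(rc)
}

// compressZlib is a small helper for the demo below.
func compressZlib(data []byte) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write(data) // writes to a bytes.Buffer do not fail
	w.Close()
	return buf.Bytes()
}

func main() {
	out, err := decompress("deflate", compressZlib([]byte("hello")))
	fmt.Println(string(out), err)
}
```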

// MetricSeriesV3 represents a single time series decoded from a V3 metrics intake payload
// (/api/intake/metrics/v3/series). The V3 format is a column-oriented protobuf encoding
// described in pkg/proto/datadog/dogstatsdhttp/payload.proto.
type MetricSeriesV3 struct {
Contributor

Would it make sense to use the old metricAggregator for v3 as well? V3 is expected to be equivalent (at least until we turn on sketches), and should be a transparent change. Ideally, the existing tests that currently use the v2 protocol should upgrade when we enable v3 by default (and it would be nice if we could run the same suite in parallel for both versions). WDYT?

Contributor Author

My initial intent was to make this check as explicit as possible. You are right about using a single metricAggregator and keeping it transparent: after the flip, the modified fakeintake should work as is. However, for now we still need a clear distinction between v2 and v3 (and a check that we don't have mixed traffic), so I will update a couple of the existing test suites to have a v3 variant.

Contributor Author

Yep, with some intermediate glue we can process both v2 and v3 transparently. Still, I've added explicit test cases to know exactly which protocol we use.
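One way to picture the "intermediate glue" mentioned above is an aggregator that stores both payload versions in a common shape while remembering which protocol delivered each point. This is a sketch under stated assumptions, not the PR's implementation: `point`, `v2Series`, `v3Series`, `aggregator`, and its methods are hypothetical.

```go
package main

import "fmt"

type protocol string

const (
	protoV2 protocol = "v2"
	protoV3 protocol = "v3"
)

// point is a hypothetical protocol-neutral series shape.
type point struct {
	Name  string
	Tags  []string
	Proto protocol
}

// v2Series and v3Series stand in for the decoded payload types.
type v2Series struct {
	Metric string
	Tags   []string
}

type v3Series struct {
	Name string
	Tags []string
}

type aggregator struct {
	byName map[string][]point
}

func newAggregator() *aggregator {
	return &aggregator{byName: make(map[string][]point)}
}

func (a *aggregator) addV2(in []v2Series) {
	for _, s := range in {
		a.byName[s.Metric] = append(a.byName[s.Metric], point{Name: s.Metric, Tags: s.Tags, Proto: protoV2})
	}
}

func (a *aggregator) addV3(in []v3Series) {
	for _, s := range in {
		a.byName[s.Name] = append(a.byName[s.Name], point{Name: s.Name, Tags: s.Tags, Proto: protoV3})
	}
}

// protocols counts, per protocol, how many points were stored for a metric.
func (a *aggregator) protocols(name string) map[protocol]int {
	counts := make(map[protocol]int)
	for _, p := range a.byName[name] {
		counts[p.Proto]++
	}
	return counts
}

func main() {
	agg := newAggregator()
	agg.addV3([]v3Series{{Name: "system.cpu.user", Tags: []string{"env:test"}}})
	fmt.Println(agg.protocols("system.cpu.user"))
}
```

Existing queries can read from the shared store regardless of protocol, while a V3-specific test can assert that `protocols` reports a single entry to rule out mixed traffic.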

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from 3655bb2 to 07cb2cd Compare May 27, 2026 06:32
dd-octo-sts Bot commented May 27, 2026

Static quality checks

✅ Please find below the results from static quality gates
Comparison made with ancestor faf049a

Successful checks

Info

| Quality gate | Change | Size (prev → curr → max) |
|---|---|---|
| agent_deb_amd64_fips | +16.16 KiB (0.00% increase, -8.23% of buffer) | 704.108 → 704.124 → 704.300 |
| agent_msi | +7.5 KiB (0.00% increase, -0.05% of buffer) | 610.430 → 610.438 → 624.040 |
| agent_rpm_amd64_fips | +16.16 KiB (0.00% increase, -7.59% of buffer) | 704.092 → 704.108 → 704.300 |
| agent_rpm_arm64 | +8.16 KiB (0.00% increase, -1.65% of buffer) | 724.018 → 724.026 → 724.500 |
| agent_rpm_arm64_fips | +4.12 KiB (0.00% increase, -4.28% of buffer) | 684.776 → 684.780 → 684.870 |
| agent_suse_amd64_fips | +16.16 KiB (0.00% increase, -7.59% of buffer) | 704.092 → 704.108 → 704.300 |
| agent_suse_arm64 | +8.16 KiB (0.00% increase, -1.65% of buffer) | 724.018 → 724.026 → 724.500 |
| agent_suse_arm64_fips | +4.12 KiB (0.00% increase, -4.28% of buffer) | 684.776 → 684.780 → 684.870 |
| docker_agent_arm64 | +8.16 KiB (0.00% increase, -0.71% of buffer) | 808.998 → 809.006 → 810.120 |
| docker_agent_jmx_arm64 | +8.16 KiB (0.00% increase, -0.66% of buffer) | 988.591 → 988.599 → 989.800 |
| docker_dogstatsd_amd64 | +4.03 KiB (0.01% increase, -8.75% of buffer) | 39.515 → 39.519 → 39.560 |
| docker_dogstatsd_arm64 | +64.03 KiB (0.17% increase, -16.04% of buffer) | 37.690 → 37.753 → 38.080 |
| docker_host_profiler_arm64 | +64.3 KiB (0.02% increase, -0.45% of buffer) | 313.597 → 313.660 → 327.470 |
| iot_agent_deb_arm64 | +4.03 KiB (0.01% increase, -0.29% of buffer) | 41.437 → 41.441 → 42.800 |
| iot_agent_deb_armhf | +4.03 KiB (0.01% increase, -0.49% of buffer) | 42.150 → 42.154 → 42.960 |

18 successful checks with minimal change (< 2 KiB)

| Quality gate | Current Size |
|---|---|
| agent_deb_amd64 | 746.461 MiB |
| agent_heroku_amd64 | 310.785 MiB |
| agent_rpm_amd64 | 746.445 MiB |
| agent_suse_amd64 | 746.445 MiB |
| docker_agent_amd64 | 806.596 MiB |
| docker_agent_jmx_amd64 | 997.537 MiB |
| docker_cluster_agent_amd64 | 207.244 MiB |
| docker_cluster_agent_arm64 | 221.206 MiB |
| docker_cws_instrumentation_amd64 | 7.154 MiB |
| docker_cws_instrumentation_arm64 | 6.689 MiB |
| docker_host_profiler_amd64 | 302.135 MiB |
| dogstatsd_deb_amd64 | 30.174 MiB |
| dogstatsd_deb_arm64 | 28.296 MiB |
| dogstatsd_rpm_amd64 | 30.174 MiB |
| dogstatsd_suse_amd64 | 30.174 MiB |
| iot_agent_deb_amd64 | 44.472 MiB |
| iot_agent_rpm_amd64 | 44.473 MiB |
| iot_agent_suse_amd64 | 44.472 MiB |

cit-pr-commenter-54b7da Bot commented May 27, 2026

Regression Detector

Regression Detector Results

Run ID: 6315d6f7-225b-4304-92c7-516d68fb6dc9

Baseline: faf049a
Comparison: 4f0c860

Optimization Goals: ✅ No significant changes detected

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| | docker_containers_cpu | % cpu utilization | +0.03 | [-2.90, +2.97] | 1 | Logs |

Fine details of change detection per experiment

| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| | quality_gate_metrics_logs | memory utilization | +3.42 | [+3.15, +3.69] | 1 | Logs, bounds checks dashboard |
| | otlp_ingest_metrics | memory utilization | +2.03 | [+1.87, +2.19] | 1 | Logs |
| | otlp_ingest_logs | memory utilization | +1.24 | [+1.15, +1.34] | 1 | Logs |
| | tcp_syslog_to_blackhole | ingress throughput | +0.65 | [+0.46, +0.83] | 1 | Logs |
| | ddot_metrics_sum_delta | memory utilization | +0.51 | [+0.34, +0.69] | 1 | Logs |
| | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | +0.50 | [+0.45, +0.55] | 1 | Logs |
| | ddot_metrics | memory utilization | +0.45 | [+0.25, +0.65] | 1 | Logs |
| | file_tree | memory utilization | +0.43 | [+0.38, +0.48] | 1 | Logs |
| | ddot_logs | memory utilization | +0.25 | [+0.19, +0.31] | 1 | Logs |
| | docker_containers_memory | memory utilization | +0.13 | [+0.03, +0.23] | 1 | Logs |
| | quality_gate_logs | % cpu utilization | +0.10 | [-0.94, +1.13] | 1 | Logs, bounds checks dashboard |
| | quality_gate_idle | memory utilization | +0.07 | [+0.02, +0.12] | 1 | Logs, bounds checks dashboard |
| | docker_containers_cpu | % cpu utilization | +0.03 | [-2.90, +2.97] | 1 | Logs |
| | file_to_blackhole_1000ms_latency | egress throughput | +0.02 | [-0.42, +0.46] | 1 | Logs |
| | uds_dogstatsd_to_api | ingress throughput | +0.01 | [-0.20, +0.22] | 1 | Logs |
| | tcp_dd_logs_filter_exclude | ingress throughput | +0.01 | [-0.08, +0.10] | 1 | Logs |
| | uds_dogstatsd_to_api_v3 | ingress throughput | -0.00 | [-0.20, +0.20] | 1 | Logs |
| | file_to_blackhole_500ms_latency | egress throughput | -0.01 | [-0.42, +0.39] | 1 | Logs |
| | file_to_blackhole_100ms_latency | egress throughput | -0.05 | [-0.19, +0.10] | 1 | Logs |
| | file_to_blackhole_0ms_latency | egress throughput | -0.06 | [-0.54, +0.42] | 1 | Logs |
| | ddot_metrics_sum_cumulativetodelta_exporter | memory utilization | -0.19 | [-0.43, +0.05] | 1 | Logs |
| | ddot_metrics_sum_cumulative | memory utilization | -0.23 | [-0.39, -0.08] | 1 | Logs |
| | quality_gate_idle_all_features | memory utilization | -0.34 | [-0.37, -0.30] | 1 | Logs, bounds checks dashboard |

Bounds Checks: ✅ Passed

| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| | docker_containers_cpu | simple_check_run | 10/10 | 725 ≥ 26 | |
| | docker_containers_memory | memory_usage | 10/10 | 246.29MiB ≤ 370MiB | |
| | docker_containers_memory | simple_check_run | 10/10 | 697 ≥ 26 | |
| | file_to_blackhole_0ms_latency | memory_usage | 10/10 | 0.16GiB ≤ 1.20GiB | |
| | file_to_blackhole_0ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | 0.20GiB ≤ 1.20GiB | |
| | file_to_blackhole_1000ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| | file_to_blackhole_100ms_latency | memory_usage | 10/10 | 0.17GiB ≤ 1.20GiB | |
| | file_to_blackhole_100ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| | file_to_blackhole_500ms_latency | memory_usage | 10/10 | 0.18GiB ≤ 1.20GiB | |
| | file_to_blackhole_500ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| | quality_gate_idle | intake_connections | 10/10 | 3 ≤ 4 | bounds checks dashboard |
| | quality_gate_idle | memory_usage | 10/10 | 144.09MiB ≤ 147MiB | bounds checks dashboard |
| | quality_gate_idle | total_bytes_received | 10/10 | 740.76KiB ≤ 819.20KiB | bounds checks dashboard |
| | quality_gate_idle_all_features | intake_connections | 10/10 | 3 ≤ 4 | bounds checks dashboard |
| | quality_gate_idle_all_features | memory_usage | 10/10 | 475.09MiB ≤ 495MiB | bounds checks dashboard |
| | quality_gate_idle_all_features | total_bytes_received | 10/10 | 1.14MiB ≤ 1.25MiB | bounds checks dashboard |
| | quality_gate_logs | intake_connections | 10/10 | 4 ≤ 6 | bounds checks dashboard |
| | quality_gate_logs | memory_usage | 10/10 | 175.48MiB ≤ 195MiB | bounds checks dashboard |
| | quality_gate_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
| | quality_gate_logs | total_bytes_received | 10/10 | 264.33MiB ≤ 292MiB | bounds checks dashboard |
| | quality_gate_metrics_logs | cpu_usage | 10/10 | 359.19 ≤ 2000 | bounds checks dashboard |
| | quality_gate_metrics_logs | intake_connections | 10/10 | 4 ≤ 6 | bounds checks dashboard |
| | quality_gate_metrics_logs | memory_usage | 10/10 | 418.49MiB ≤ 430MiB | bounds checks dashboard |
| | quality_gate_metrics_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
| | quality_gate_metrics_logs | total_bytes_received | 10/10 | 0.93GiB ≤ 1.04GiB | bounds checks dashboard |

Explanation

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".
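The three criteria above can be read as a single predicate. The following is a minimal sketch of that rule, not the Regression Detector's actual code: the `experiment` type and `isRegression` are hypothetical, the 5.00% tolerance is taken from the report, and the third sample in `main` is an invented illustration rather than a reported result.

```go
package main

import (
	"fmt"
	"math"
)

// experiment holds the figures the report prints for each row.
type experiment struct {
	Name         string
	DeltaMeanPct float64 // Δ mean %
	CILow        float64 // lower bound of the 90% confidence interval
	CIHigh       float64 // upper bound of the 90% confidence interval
	Erratic      bool
}

// effectSizeTolerance is the |Δ mean %| threshold stated in the report.
const effectSizeTolerance = 5.0

// isRegression applies the three criteria: large enough effect,
// confidence interval excluding zero, and not marked erratic.
func isRegression(e experiment) bool {
	bigEnough := math.Abs(e.DeltaMeanPct) >= effectSizeTolerance
	excludesZero := e.CILow > 0 || e.CIHigh < 0
	return bigEnough && excludesZero && !e.Erratic
}

func main() {
	samples := []experiment{
		{Name: "quality_gate_metrics_logs", DeltaMeanPct: 3.42, CILow: 3.15, CIHigh: 3.69},
		{Name: "docker_containers_cpu", DeltaMeanPct: 0.03, CILow: -2.90, CIHigh: 2.97, Erratic: true},
		{Name: "hypothetical_large_change", DeltaMeanPct: 6.0, CILow: 5.1, CIHigh: 6.9},
	}
	for _, e := range samples {
		fmt.Printf("%s regression=%v\n", e.Name, isRegression(e))
	}
}
```

Under this reading, quality_gate_metrics_logs is not flagged even though its interval excludes zero, because +3.42% is below the 5.00% tolerance.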

CI Pass/Fail Decision

Passed. All Quality Gates passed.

  • quality_gate_metrics_logs, bounds check total_bytes_received: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check total_bytes_received: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
  • quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check total_bytes_received: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check memory_usage: 10/10 replicas passed. Gate passed.
  • quality_gate_idle, bounds check total_bytes_received: 10/10 replicas passed. Gate passed.

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch 2 times, most recently from 60b632e to 9b358c7 Compare May 27, 2026 08:28
@atanzu atanzu requested a review from vickenty May 27, 2026 11:23
@chouetz chouetz changed the title support v3 in fakeintake support yaml v3 in fakeintake May 29, 2026
@chouetz chouetz changed the title support yaml v3 in fakeintake support metrics v3 in fakeintake May 29, 2026
Member

@chouetz chouetz left a comment

Might deserve a double check on the last codex comment, I assume it's not a blocker though.

"/api/v1/check_run": getCheckRunPayLoadJSON,
"/api/v1/connections": getConnectionsPayLoadProtobuf,
"/api/beta/sketches": getSketchPayloadProtobuf,
"/api/intake/metrics/v3/series": getMetricV3SeriesPayload,
Member

I don't know if we have tests using sketches but I think it's worth a check?

// Store so that WithV3MetricsEnabled (applied after) can build V3 endpoints config.
p.intakeScheme = scheme
p.intakeHostname = hostname
p.intakePort = port
Contributor

Now that we save the URL, do we still need to store intakeScheme and friends?

// Only series are redirected; sketches V3 support is not yet implemented in fakeintake.
//
// Must be called after WithFakeintake (or WithIntakeHostname) so the intake URL is known.
func WithV3MetricsEnabled() func(*Params) error {
Contributor

Would it make sense to add this option to the list in the comment attached to Params?

Contributor Author

I wanted to make this part as explicit as possible.

Contributor

What part? What I meant was to add this method to the list of available options.

Contributor Author

Ah, just list it there. Ack. I misunderstood your initial comment: I thought it was about enabling v3 in some other, more implicit way.

// Only series are redirected; sketches V3 support is not yet implemented in fakeintake.
//
// Must be called after WithFakeintake or WithIntake so the intake URL is known.
func WithV3MetricsEnabled() func(*Params) error {
Contributor

Same here, would it make sense to add this option to the list in the comment attached to Params?

Contributor Author

Done

"slices"

agentpayload "github.com/DataDog/agent-payload/v5/gogen"
pb "github.com/DataDog/datadog-agent/pkg/proto/pbgo/dogstatsdhttp"
Contributor

Let's use this definition instead; that repo should contain a compiled version as well. It includes units, so we won't need to parse them from bytes.

Contributor Author

That's a nice one, thanks.

Contributor Author

There is no compiled version for v3, but that's not a blocker.

// // Accessors for metric data point can be called.
// }
// }
type MetricDataReader struct {
Contributor

Together with the schema, let's copy the implementation from the intake instead to use with the pb file from agent-payload. It covers top-level metadata, and it would make it easier to maintain in the future.

@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch 4 times, most recently from 5e9db01 to 434fdc6 Compare June 1, 2026 12:12
atanzu added 4 commits June 1, 2026 15:16
Support decoding of V3 payload in fakeintake, so we could test the new
payload type.

We use the intake's reader implementation to keep the code base
consistent.

Signed-off-by: Mark Kirichenko <mark.kirichenko@datadoghq.com>

Add V3 protocol support and create an explicit test case to verify that
we can operate in V2- or V3-only modes where all the metrics traffic
reaches the expected endpoint.

Signed-off-by: Mark Kirichenko <mark.kirichenko@datadoghq.com>

Add a test to verify that we can send OTel traffic via V3 protocol when
we configure the Agent to use V3.

Signed-off-by: Mark Kirichenko <mark.kirichenko@datadoghq.com>
@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from 9343df1 to ce03e7e Compare June 1, 2026 13:17
Signed-off-by: Mark Kirichenko <mark.kirichenko@datadoghq.com>
@atanzu atanzu force-pushed the mark.kirichenko/support-v3-in-fakeintake branch from ce03e7e to 4f0c860 Compare June 1, 2026 14:05

Labels

  • changelog/no-changelog: No changelog entry needed
  • internal: Identify a non-fork PR
  • long review: PR is complex, plan time to review it
  • qa/done: QA done before merge and regressions are covered by tests
  • team/agent-build
  • team/agent-devx
  • team/agent-metric-pipelines

6 participants