@andrewlock andrewlock commented Oct 28, 2025

Summary of changes

  • This is the big one
  • Update services to dynamically update when mutable settings or exporter settings change
  • Stop rebuilding everything when there's manual/remote configuration

Reason for change

This is the "endpoint" that we've been heading for - services only being disposed/rebuilt at the end of the app, and otherwise only rebuilding the necessary parts. For example - we don't need to tear down all the API factories when a customer changes a global tag via remote config; they only need to change if the ExporterSettings change.

The hope is that overall this reduces the overhead of using configuration in code and/or remote configuration, while also reducing the number of issues due to managing disposal of services.

Implementation details

Overall, this PR is kind of a pain. Moving from the "rebuild everything" to "reconfigure each service" couldn't be done piecemeal, so this is the one-shot PR. What's more, different services need different patterns (though we can probably consolidate some of them, this has taken a lot of work and I likely changed patterns unnecessarily in some places).

In general, there are a few patterns:

  • CI Vis doesn't let you change settings at runtime, so it never needs to respond to changes. It always just uses the "initial" settings
  • Debugger today doesn't respond to changes at runtime (except its own dynamic config), so for now we ignore Debugger too as it's not really a regression. I hope we can fix this soon though.
  • I've introduced the concept of Managed* versions of some services
    • These services generally "wrap" the existing type, delegating access to the underlying service, and handling settings changes
  • Many services only care about a sub-set of mutable settings, so they only update if they need to
  • Somewhat annoyingly, setting updates occur on a background thread, so we need to be careful about thread safety. Where necessary (most places), I've made sure that a now-mutable service is read using Volatile.Read() (to ensure changes are visible) and that the result is generally cached in a local variable (as the underlying field may be updated in the background).
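
As a rough illustration of the Managed* pattern described above, here is a minimal sketch. All type and member names (`MutableSettingsExample`, `TagWriter`, `ManagedTagWriter`, `OnSettingsChanged`) are hypothetical stand-ins and are not the types in this PR. It assumes updates are applied by a single background thread, so only visibility (not competing writers) needs handling.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the mutable settings; not the real type.
public sealed class MutableSettingsExample
{
    public MutableSettingsExample(string serviceName, string globalTag)
    {
        ServiceName = serviceName;
        GlobalTag = globalTag;
    }

    public string ServiceName { get; }
    public string GlobalTag { get; }
}

// Hypothetical immutable service that only depends on one setting.
public sealed class TagWriter
{
    public TagWriter(string globalTag) => GlobalTag = globalTag;

    public string GlobalTag { get; }

    public string Format(string message) => $"[{GlobalTag}] {message}";
}

// "Managed" wrapper: delegates to the current TagWriter and swaps it
// only when the subset of settings it cares about has changed.
public sealed class ManagedTagWriter
{
    private TagWriter _current;

    public ManagedTagWriter(MutableSettingsExample initial)
        => _current = new TagWriter(initial.GlobalTag);

    // Assumed to be called from the single background update thread.
    // Returns true if the underlying service was replaced.
    public bool OnSettingsChanged(MutableSettingsExample updated)
    {
        var previous = Volatile.Read(ref _current);
        if (previous.GlobalTag == updated.GlobalTag)
        {
            return false; // nothing this service cares about changed
        }

        Volatile.Write(ref _current, new TagWriter(updated.GlobalTag));
        return true;
    }

    public string Format(string message)
    {
        // Read once into a local; the field may be swapped in the background.
        var writer = Volatile.Read(ref _current);
        return writer.Format(message);
    }
}
```

Callers keep a reference to the wrapper for the lifetime of the app, so nothing upstream needs rebuilding when a setting changes.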

Test coverage

In the vast majority of places, this should be covered by existing tests.

I plan to add some additional integration tests around reconfiguring and a bunch of manual testing to make sure I'm confident.

Other details

I strongly recommend reviewing commit-by-commit. They're generally self-contained, and hopefully simple enough to understand one commit at a time.

https://datadoghq.atlassian.net/browse/LANGPLAT-819

Part of a config stack

This isn't the final PR in the stack, as there will be a bunch of cleaning up to do, but it's the final "implementation" PR

@andrewlock andrewlock force-pushed the andrew/settings/5-remove-mutablesettings branch 3 times, most recently from 2bc63f6 to 34f0d90 Compare October 28, 2025 15:02
@andrewlock andrewlock force-pushed the andrew/settings/5-move-mutable-settings-off-tracer-settings branch from e347879 to 8c472a5 Compare October 28, 2025 15:02
@andrewlock andrewlock force-pushed the andrew/settings/5-remove-mutablesettings branch from 34f0d90 to f1e1c7e Compare October 28, 2025 15:20
@andrewlock andrewlock force-pushed the andrew/settings/5-move-mutable-settings-off-tracer-settings branch from 8c472a5 to 8e19e3a Compare October 28, 2025 15:20
@andrewlock andrewlock force-pushed the andrew/settings/5-remove-mutablesettings branch from f1e1c7e to c2b6a1c Compare October 28, 2025 18:13
@andrewlock andrewlock requested review from a team as code owners October 28, 2025 18:13
@andrewlock andrewlock requested review from link04 and removed request for a team October 28, 2025 18:13
@andrewlock andrewlock force-pushed the andrew/settings/5-move-mutable-settings-off-tracer-settings branch from 8e19e3a to 7940c31 Compare October 28, 2025 18:13
@andrewlock andrewlock force-pushed the andrew/settings/5-remove-mutablesettings branch from c2b6a1c to 48c7644 Compare October 29, 2025 08:57
dd-trace-dotnet-ci-bot bot commented Oct 29, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Execution time (ms), shown as mean (interval) per chart section. The "Flagged" column marks results tagged `crit` in the chart data.

| Sample | Runtime | Section | This PR (7724) | master | Flagged |
|---|---|---|---|---|---|
| FakeDbCommand | .NET Framework 4.8 | Bailout | 83 (77-88) | 79 (76-82) | |
| FakeDbCommand | .NET Framework 4.8 | Baseline | 78 (72-83) | 74 (71-78) | |
| FakeDbCommand | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,200 (1152-1248) | 1,113 (1031-1195) | crit |
| FakeDbCommand | .NET Core 3.1 | Bailout | 123 (117-130) | 119 (116-122) | |
| FakeDbCommand | .NET Core 3.1 | Baseline | 122 (115-129) | 117 (113-121) | |
| FakeDbCommand | .NET Core 3.1 | CallTarget+Inlining+NGEN | 907 (864-949) | 795 (773-817) | crit |
| FakeDbCommand | .NET 6 | Bailout | 104 (100-108) | 105 (102-107) | |
| FakeDbCommand | .NET 6 | Baseline | 103 (99-107) | 102 (99-106) | |
| FakeDbCommand | .NET 6 | CallTarget+Inlining+NGEN | 800 (721-880) | 748 (719-776) | crit |
| FakeDbCommand | .NET 8 | Bailout | 103 (101-105) | 103 (100-106) | |
| FakeDbCommand | .NET 8 | Baseline | 102 (99-105) | 102 (99-105) | |
| FakeDbCommand | .NET 8 | CallTarget+Inlining+NGEN | 763 (724-802) | 709 (689-729) | crit |
| HttpMessageHandler | .NET Framework 4.8 | Bailout | 197 (195-200) | 198 (193-202) | |
| HttpMessageHandler | .NET Framework 4.8 | Baseline | 194 (190-197) | 195 (189-201) | |
| HttpMessageHandler | .NET Framework 4.8 | CallTarget+Inlining+NGEN | 1,261 (1219-1304) | 1,172 (1102-1243) | crit |
| HttpMessageHandler | .NET Core 3.1 | Bailout | 279 (275-282) | 281 (274-289) | |
| HttpMessageHandler | .NET Core 3.1 | Baseline | 281 (274-287) | 283 (269-296) | |
| HttpMessageHandler | .NET Core 3.1 | CallTarget+Inlining+NGEN | 1,004 (937-1071) | 954 (905-1003) | |
| HttpMessageHandler | .NET 6 | Bailout | 270 (266-273) | 270 (266-274) | |
| HttpMessageHandler | .NET 6 | Baseline | 270 (264-277) | 270 (263-276) | |
| HttpMessageHandler | .NET 6 | CallTarget+Inlining+NGEN | 985 (911-1058) | 940 (882-999) | |
| HttpMessageHandler | .NET 8 | Bailout | 270 (265-274) | 273 (263-282) | |
| HttpMessageHandler | .NET 8 | Baseline | 270 (264-275) | 270 (264-276) | |
| HttpMessageHandler | .NET 8 | CallTarget+Inlining+NGEN | 931 (853-1008) | 861 (837-884) | crit |

Also:
- slight refactor of LogFormatter to reduce some allocation
- ignore "previous" when creating DirectLogSubmissionManager (seeing as that won't be a thing soon)
…n't respond to changes

I left it like this because the debugger already doesn't respond to changes like other services do
- Move statsd instance creation to separate factory
- Create a StatsdManager to handle automatic updating in response to setting changes
- Always create a statsd instance, as it's hard to know if we're _ever_ going to need one, and reduces some of the complexity
This isn't necessary with the current design, and it causes issues today
This manifested in a different test where an empty string passed as the DogStatsD socket was causing us to use UDS even though it's not available and not _really_ set
Make sure we can't dispose a stats consumer that's in use (as it will throw)
Rework to use a "lease" mechanism to track usages
Make passing in a statsmanager required
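
The "lease" idea mentioned in the commit messages above can be sketched with simple reference counting. This is only an illustration of the general technique under stated assumptions, not the code in this PR: `LeasedDisposable<T>`, `Lease`, `TryAcquire`, `Retire`, and `FakeStatsClient` are all hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a stats client that must not be disposed while in use.
public sealed class FakeStatsClient : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

// Wraps a disposable and only disposes it once the owner has retired it
// AND every outstanding lease has been returned.
public sealed class LeasedDisposable<T>
    where T : class, IDisposable
{
    private readonly T _value;
    private int _refCount = 1; // starts at 1: the owner's own reference
    private int _retired;      // 0 = active, 1 = owner has released its reference

    public LeasedDisposable(T value) => _value = value;

    // Returns null if the value has already been disposed.
    public Lease TryAcquire()
    {
        while (true)
        {
            var current = Volatile.Read(ref _refCount);
            if (current == 0)
            {
                return null;
            }

            if (Interlocked.CompareExchange(ref _refCount, current + 1, current) == current)
            {
                return new Lease(this, _value);
            }
        }
    }

    // Called by the owner when the value is being replaced; safe to call more than once.
    public void Retire()
    {
        if (Interlocked.Exchange(ref _retired, 1) == 0)
        {
            Release();
        }
    }

    private void Release()
    {
        if (Interlocked.Decrement(ref _refCount) == 0)
        {
            _value.Dispose();
        }
    }

    public sealed class Lease : IDisposable
    {
        private LeasedDisposable<T> _owner;

        internal Lease(LeasedDisposable<T> owner, T value)
        {
            _owner = owner;
            Value = value;
        }

        public T Value { get; }

        // Returning a lease twice only releases once.
        public void Dispose() => Interlocked.Exchange(ref _owner, null)?.Release();
    }
}
```

With this shape, a settings update can retire the old instance immediately, and the actual `Dispose()` is deferred until the last in-flight user returns its lease.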
@andrewlock andrewlock force-pushed the andrew/settings/5-remove-mutablesettings branch from 1b1f494 to f94e51b Compare October 31, 2025 18:04
@andrewlock andrewlock requested review from a team as code owners October 31, 2025 18:04
@andrewlock andrewlock force-pushed the andrew/settings/5-move-mutable-settings-off-tracer-settings branch from 7940c31 to 2dfae3d Compare October 31, 2025 18:04
bouwkast pushed a commit that referenced this pull request Oct 31, 2025
## Summary of changes

- Trim whitespace in the exporter settings values before trying to use
them
- Fix a bug where having whitespace in the `DD_DOGSTATSD_SOCKET`
variable would cause incorrect (and invalid) UDS config

## Reason for change

This manifested in a different test in a branch, where an empty string
passed in `DD_DOGSTATSD_SOCKET` was causing us to use UDS for metrics
even though it's not available and not _really_ set.

## Implementation details

- `Trim()` the string variables (I don't think there's a good reason not
to?)
- As we're trimming, we can switch to `IsNullOrEmpty` instead of
`IsNullOrWhiteSpace`
- Treat empty `DD_DOGSTATSD_SOCKET` as not set

Alternatively, we could not trim, stick to using `IsNullOrWhiteSpace`, and
just update the check I fixed to use `IsNullOrWhiteSpace` 🤷‍♂️
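
A minimal sketch of the trim-then-check approach described in the list above. `ExporterSettingsExample` and `NormalizeSocketPath` are hypothetical names used only for illustration; the real change lives in the exporter settings code.

```csharp
using System;

public static class ExporterSettingsExample
{
    // Hypothetical helper: returns null when the value is unset, empty,
    // or whitespace only; otherwise returns the trimmed value.
    public static string NormalizeSocketPath(string raw)
    {
        var trimmed = raw?.Trim();

        // After trimming, a whitespace-only value is simply empty,
        // so IsNullOrEmpty is enough.
        return string.IsNullOrEmpty(trimmed) ? null : trimmed;
    }
}
```

A `null` result here stands for "not set", so a caller would fall back to its other transport options instead of attempting UDS.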

## Test coverage

Added a couple of unit tests. Only one test is for this specific issue,
the others were just part of my investigation, and seemed reasonable.

## Other details

Discovered as part of
#7724 and
https://datadoghq.atlassian.net/browse/LANGPLAT-819