-
Notifications
You must be signed in to change notification settings - Fork 392
[PROF-11524] Upgrade libdatadog dependency to version 18.1 #4577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PROF-11524] Upgrade libdatadog dependency to version 18.1 #4577
Conversation
Datadog ReportBranch report: ✅ 0 Failed, 21496 Passed, 1299 Skipped, 4m 6.7s Total Time |
…on every report Reusing an `encoded_profile` object will no longer be possible with libdatadog 17+. This does change what the benchmark was covering, so comparing with prior results is not particularly relevant. I did consider removing this benchmark (we haven't used it for a long time for anything in particular), but on the other hand these microbenchmarks that exercise libdatadog have been useful sometimes so why not.
This tag will be set automatically by libdatadog 17+ so we no longer need to set it. See DataDog/libdatadog#954 for the change on the libdatadog side.
Still missing: cancellation token changes.
The removed spec for the invalid sample no longer triggers on release builds of libdatadog (it's only for debug builds).
Still discussing with folks what the next steps are on this one.
With the move to handles, I'm unsure if our previous approach was correct, so I've tweaked our usage and added a bit more validation to catch any unexpected issues.
It took us a while to understand why crashtracking was not working correctly with modern versions of libdatadog. It turns out the problem was that we were setting a too low timeout for Ruby crash reports. I suspect this issue has been hidden for a while because before libdatadog 17, crashtracking was almost always giving up on stack symbolication ( DataDog/libdatadog#991 ), which was a much faster path, and thus we didn't notice that the timeout required adjusting until libdatadog 17 came around and fixed symbolization (making it slower). I first tried with the libdatadog default of 5 seconds and it still wasn't enough on my machine. Then I bumped it slightly higher and that seems to be enough. I'm surprised symbolication is this expensive...
e8bf7f4
to
a9a715c
Compare
This test is currently failing pending some work on the libdatadog side.
Update I've:
We have one last blocker (which I've listed in the PR notes), and we should be good to go. |
With the new disable/reconfigure APIs we can finally restore the full correct behavior to the crashtracker.
…#1066) # What does this PR do? This PR adds the following new APIs to the crashtracker: * `ddog_crasht_enable` * `ddog_crasht_disable` * `ddog_crasht_reconfigure` # Motivation Prior to #1000, it was possible to: * reconfigure the crashtracker by calling init again * stop the crashtracker Both features were removed as it could cause issues with signal handlers and chaining. But, we still need to be able to update the configuration as for instance the following can change at run-time for Ruby and need to be updated: * Endpoint * Tags This PR restores some of the behavior removed in #1000, without the unsafety of touching signal handlers: * `enable`/`disable` just skip the crashtracker signal handling code, but never uninstall it * `reconfigure` does the same as `init` BUT without touching the signal handlers # Additional Notes This PR supercedes #1017 . # How to test the change? I've validated this branch with DataDog/dd-trace-rb#4577 on the Ruby side, and I'm able to get a green test suite for the Ruby crashtracker. --------- Co-authored-by: Daniel Schwartz-Narbonne <[email protected]>
…ease **What does this PR do?** This PR bumps the libdatadog version from 18.0.0 to 18.1.0 in preparation for release. We're going from v18.0 to v18.1 as I'm not aware of any backwards-incompatible changes being merged to master. **Motivation:** Ruby has been stuck on an old libdatadog release due to a few API changes, and I want to finally move it to latest libdatadog. **Additional Notes:** I did a diff of the v18.0 to v18.1 headers to validate there's no backwards-incompatible changes using the `difft` tool: ``` library-config.h --- C++ No changes. blazesym.h --- C++ No changes. profiling.h --- C++ No changes. crashtracker.h --- 1/2 --- C++ 57 57 extern "C" { 58 58 #endif // __cplusplus .. 59 .. 60 /** .. 61 * Disables the crashtracker. .. 62 * Note that this does not restore the old signal handlers, but rather turns crash-tracking into a .. 63 * no-op, and then chains the old handlers. This means that handlers registered after the .. 64 * crashtracker will continue to work as expected. .. 65 * .. 66 * # Preconditions .. 67 * None .. 68 * # Safety .. 69 * None .. 70 * # Atomicity .. 71 * This function is atomic and idempotent. Calling it multiple times is allowed. .. 72 */ .. 73 DDOG_CHECK_RETURN struct ddog_VoidResult ddog_crasht_disable(void); .. 74 .. 75 /** .. 76 * Enables the crashtracker, if it had been previously disabled. .. 77 * If crashtracking has not been initialized, this function will have no effect. .. 78 * .. 79 * # Preconditions .. 80 * None .. 81 * # Safety .. 82 * None .. 83 * # Atomicity .. 84 * This function is atomic and idempotent. Calling it multiple times is allowed. .. 85 */ .. 86 DDOG_CHECK_RETURN struct ddog_VoidResult ddog_crasht_enable(void); 59 87 60 88 /** 61 89 * Reinitialize the crash-tracking infrastructure after a fork. crashtracker.h --- 2/2 --- C++ 98 126 struct ddog_crasht_ReceiverConfig receiver_config, 99 127 struct ddog_crasht_Metadata metadata); .. 128 .. 129 /** .. 130 * Reconfigure the crashtracker and re-enables it. .. 131 * If the crashtracker has not been initialized, this function will have no effect. .. 132 * .. 133 * # Preconditions .. 134 * None. .. 135 * # Safety .. 136 * Crash-tracking functions are not reentrant. .. 137 * No other crash-handler functions should be called concurrently. .. 138 * # Atomicity .. 139 * This function is not atomic. A crash during its execution may lead to .. 140 * unexpected crash-handling behaviour. .. 141 */ .. 142 DDOG_CHECK_RETURN .. 143 struct ddog_VoidResult ddog_crasht_reconfigure(struct ddog_crasht_Config config, .. 144 struct ddog_crasht_ReceiverConfig receiver_config, .. 145 struct ddog_crasht_Metadata metadata); 100 146 101 147 /** 102 148 * Initialize the crash-tracking infrastructure without launching the receiver. data-pipeline.h --- C++ No changes. telemetry.h --- C++ No changes. common.h --- C++ No changes. ``` **How to test the change?** I've tested the libdatadog 18.1 release locally using the Ruby test suite and DataDog/dd-trace-rb#4577 .
…ease (#1075) **What does this PR do?** This PR bumps the libdatadog version from 18.0.0 to 18.1.0 in preparation for release. We're going from v18.0 to v18.1 as I'm not aware of any backwards-incompatible changes being merged to master. **Motivation:** Ruby has been stuck on an old libdatadog release due to a few API changes, and I want to finally move it to latest libdatadog. **Additional Notes:** I did a diff of the v18.0 to v18.1 headers to validate there's no backwards-incompatible changes using the `difft` tool: ``` library-config.h --- C++ No changes. blazesym.h --- C++ No changes. profiling.h --- C++ No changes. crashtracker.h --- 1/2 --- C++ 57 57 extern "C" { 58 58 #endif // __cplusplus .. 59 .. 60 /** .. 61 * Disables the crashtracker. .. 62 * Note that this does not restore the old signal handlers, but rather turns crash-tracking into a .. 63 * no-op, and then chains the old handlers. This means that handlers registered after the .. 64 * crashtracker will continue to work as expected. .. 65 * .. 66 * # Preconditions .. 67 * None .. 68 * # Safety .. 69 * None .. 70 * # Atomicity .. 71 * This function is atomic and idempotent. Calling it multiple times is allowed. .. 72 */ .. 73 DDOG_CHECK_RETURN struct ddog_VoidResult ddog_crasht_disable(void); .. 74 .. 75 /** .. 76 * Enables the crashtracker, if it had been previously disabled. .. 77 * If crashtracking has not been initialized, this function will have no effect. .. 78 * .. 79 * # Preconditions .. 80 * None .. 81 * # Safety .. 82 * None .. 83 * # Atomicity .. 84 * This function is atomic and idempotent. Calling it multiple times is allowed. .. 85 */ .. 86 DDOG_CHECK_RETURN struct ddog_VoidResult ddog_crasht_enable(void); 59 87 60 88 /** 61 89 * Reinitialize the crash-tracking infrastructure after a fork. crashtracker.h --- 2/2 --- C++ 98 126 struct ddog_crasht_ReceiverConfig receiver_config, 99 127 struct ddog_crasht_Metadata metadata); .. 128 .. 129 /** .. 130 * Reconfigure the crashtracker and re-enables it. .. 131 * If the crashtracker has not been initialized, this function will have no effect. .. 132 * .. 133 * # Preconditions .. 134 * None. .. 135 * # Safety .. 136 * Crash-tracking functions are not reentrant. .. 137 * No other crash-handler functions should be called concurrently. .. 138 * # Atomicity .. 139 * This function is not atomic. A crash during its execution may lead to .. 140 * unexpected crash-handling behaviour. .. 141 */ .. 142 DDOG_CHECK_RETURN .. 143 struct ddog_VoidResult ddog_crasht_reconfigure(struct ddog_crasht_Config config, .. 144 struct ddog_crasht_ReceiverConfig receiver_config, .. 145 struct ddog_crasht_Metadata metadata); 100 146 101 147 /** 102 148 * Initialize the crash-tracking infrastructure without launching the receiver. data-pipeline.h --- C++ No changes. telemetry.h --- C++ No changes. common.h --- C++ No changes. ``` **How to test the change?** I've tested the libdatadog 18.1 release locally using the Ruby test suite and DataDog/dd-trace-rb#4577 .
This test started failing when I added a sleep(1) so I think we're very close to the limit (of 10 seconds), so let's create a bit of headroom to avoid future flakiness.
**What does this PR do?** This PR includes the changes documented in the "Releasing a new version to rubygems.org" part of the README: https://github.com/datadog/libdatadog/tree/main/ruby#releasing-a-new-version-to-rubygemsorg It also includes an updated to the `gem_packaging.rb` test to add the new `INTERNED_EMPTY_STRING` public symbol added in #917 . **Motivation:** Enable Ruby to use libdatadog v18.1.0. This includes A LOT of changes to profiling, crashtracking, configuration, process discovery etc that have been pending for Ruby for too long. **Additional Notes:** The `INTERNED_EMPTY_STRING` new symbol seems ok to have (e.g. I don't expect it to clash with other libraries), but just in case I will discuss with @danielsn prefixing it with `ddog_`. Also it's cool to see our tests working! A new symbol that's not prefixed with the expected strings showed up, and we caught it :) **How to test the change?** I've tested this release locally using the changes in DataDog/dd-trace-rb#4577 . As a reminder, new libdatadog releases don't get automatically picked up by dd-trace-rb, so the PR that bumps the dependency will also test this release against all supported Ruby versions.
**What does this PR do?** This PR includes the changes documented in the "Releasing a new version to rubygems.org" part of the README: https://github.com/datadog/libdatadog/tree/main/ruby#releasing-a-new-version-to-rubygemsorg It also includes an updated to the `gem_packaging.rb` test to add the new `INTERNED_EMPTY_STRING` public symbol added in #917 . **Motivation:** Enable Ruby to use libdatadog v18.1.0. This includes A LOT of changes to profiling, crashtracking, configuration, process discovery etc that have been pending for Ruby for too long. **Additional Notes:** The `INTERNED_EMPTY_STRING` new symbol seems ok to have (e.g. I don't expect it to clash with other libraries), but just in case I will discuss with @danielsn prefixing it with `ddog_`. Also it's cool to see our tests working! A new symbol that's not prefixed with the expected strings showed up, and we caught it :) **How to test the change?** I've tested this release locally using the changes in DataDog/dd-trace-rb#4577 . As a reminder, new libdatadog releases don't get automatically picked up by dd-trace-rb, so the PR that bumps the dependency will also test this release against all supported Ruby versions.
The latest release is up on rubygems.org so we're ready to bump these.
BenchmarksBenchmark execution time: 2025-05-30 13:30:54 Comparing candidate commit 2cf6513 in PR branch Found 5 performance improvements and 0 performance regressions! Performance is the same for 35 metrics, 5 unstable metrics. scenario:profiling - gvl benchmark samples
scenario:profiling - profiler gc
scenario:profiling - sample timeline=false
scenario:profiling - sample+serialize retain_every=10 heap_samples=false heap_size=false heap_sample_every=1 skip_end_gc=false
scenario:profiling - stack collector
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4577 +/- ##
==========================================
- Coverage 97.64% 97.63% -0.02%
==========================================
Files 1470 1470
Lines 87707 87711 +4
Branches 4547 4547
==========================================
- Hits 85644 85636 -8
- Misses 2063 2075 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
What does this PR do?
Updates the native bits in the gem (profiling, crashtracking) to be able to work with the latest libdatadog 18.
I'm opening it as a draft so we can use it to test the upcoming libdatadog 18.1 release. Once that release is out and on rubygems.org, I plan to:Push the actual gem dependency change on this PRDo a pass on the TODO/FIXME things that are still missingAll ready and in good shape!
Motivation:
There's a number of fixes and additions in libdatadog that we want to make use of.
Change log entry
Yes. Upgrade libdatadog dependency to 18.1
Additional Notes:
With this PR, I'm able to build both crashtracking AND profiling with libdatadog master as of 11th of April. I'm also able to get a green test suite for profiling, but not for crashtracking: I've reported the issue to the folks working on that, so they're aware.All green!
Blockers for this PR:
Being able to update libdatadog crashtracking configration (endpoint, tags)Fixed!How to test the change?
Once everything is in place, our existing test coverage will be enough to validate these changes. Until then it's expected that CI is going to be red.