Inject trace context into EventBridge detail #7613
Benchmarks
Candidate: 1.41.0-SNAPSHOT~41eed19728, baseline: 1.41.0-SNAPSHOT~85b316b1c0

Startup
Summary: Found 0 performance improvements and 0 performance regressions. Performance is the same for 52 metrics; 11 metrics are unstable.
[Gantt charts: startup time reports for insecure-bank and petclinic (global startup overhead and breakdown per module)]

Load
Summary: Found 0 performance improvements and 0 performance regressions. Performance is the same for 12 metrics; 16 metrics are unstable.
[Gantt charts: request duration reports [CI 0.99] for insecure-bank and petclinic]

Dacapo
Summary: Found 0 performance improvements and 0 performance regressions. Performance is the same for 11 metrics; 1 metric is unstable.
[Gantt charts: execution time [CI 0.99] for biojava and tomcat]
public static final String BUS_TAG = "bus";
private static final DDCache<String, String> BUS_TAG_CACHE = DDCaches.newFixedSizeCache(32);
private static final Function<String, String> BUS_TAG_PREFIX = new StringPrefix("bus:");
I'm actually not 100% sure what this does, but I was copying what the implementation for SNS did. Are these tags to be used by users for querying purposes, or does it do something else?
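For readers unfamiliar with the pattern in the snippet above, here is a rough sketch of what a fixed-size cache in front of a prefixing function does mechanically: it avoids rebuilding the `"bus:" + name` string on every call. This is an illustration only. `BusTagCacheSketch`, `busTag`, and `MAX_ENTRIES` are hypothetical names, and the sketch uses `ConcurrentHashMap` from the standard library rather than Datadog's `DDCache`, whose eviction behavior may differ. It does not answer what the resulting tag is used for downstream.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for the DDCache + StringPrefix pair; not the tracer's code. */
final class BusTagCacheSketch {
  // Assumption: mirrors the size passed to newFixedSizeCache(32) in the snippet.
  private static final int MAX_ENTRIES = 32;

  private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

  // Builds the tag string, e.g. "orders" -> "bus:orders".
  private static final Function<String, String> PREFIX = name -> "bus:" + name;

  private BusTagCacheSketch() {}

  /** Returns "bus:" + busName, reusing a previously built string when one is cached. */
  static String busTag(String busName) {
    String cached = CACHE.get(busName);
    if (cached != null) {
      return cached;
    }
    String tag = PREFIX.apply(busName);
    // Simplification: once the cache is full, new names are simply not cached.
    // A real fixed-size cache would typically evict or overwrite entries instead.
    if (CACHE.size() < MAX_ENTRIES) {
      CACHE.putIfAbsent(busName, tag);
    }
    return tag;
  }
}
```

With this sketch, repeated calls such as `BusTagCacheSketch.busTag("orders")` return the same `"bus:orders"` string instance after the first call, so a hot path that tags every request does not allocate a new string each time.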
… env var. Add async client tests.
This looks good to me!
Quick question about the new label: does EventBridge fall under AWS SDK?
LGTM
Yes, it does.
…ext (#6096)

## Summary of changes

This creates a new instrumentation for EventBridge and intercepts `PutEvents` and `PutEventsAsync` to inject trace context. This allows the agent to combine spans from a distributed (serverless) architecture into a single trace.

This PR only injects trace context. I'm working on [PR 1](DataDog/datadog-agent#29414) and [PR 2](DataDog/datadog-agent#29551) to update the Lambda extension to use this trace context to create EventBridge spans.

I am also working on a similar PR in [dd-trace-java](DataDog/dd-trace-java#7613) and dd-trace-go.

## Reason for change

SNS and SQS are already supported, and the tracer currently injects trace context into message attributes fields for them. However, EventBridge wasn't supported, and this PR aims to fix this problem.

## Implementation details

I followed the [documentation](https://github.com/DataDog/dd-trace-dotnet/blob/master/docs/development/AutomaticInstrumentation.md) to create an instrumentation. Much of the logic was mirrored from the [existing implementation](https://github.com/DataDog/dd-trace-dotnet/tree/master/tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/AWS/SNS) of SNS, since EventBridge and SNS are extremely similar.

Overall, AWS's EventBridge API is lacking some features, so we have to do some hacky solutions.

- SNS and SQS call their custom input field messageAttributes, and EventBridge calls it detail
- Unlike SNS and SQS, the detail field is given as a raw string. Therefore, we have to manually modify the detail string using StringBuilder.
- The agent has no reliable way of getting the start time of the EventBridge span, so the tracer has to put the current time into `detail[_datadog]` under the header `x-datadog-start-time`
- The EventBridge API has no way of getting the EventBridge bus name, so the tracer has to put the bus name (which is used to create the span resource name) into `detail[_datadog]` under the header `x-datadog-resource-name`

## Test coverage

I added system tests for SNS/SQS: DataDog/system-tests#3204

I added [unit tests](d05eb4c) and [integration tests](5ccd8b7). Unit tests can be run with:

```
cd tracer
dotnet test ./test/Datadog.Trace.ClrProfiler.Managed.Tests
```

Integration tests can be run with these commands:

```
cd tracer

# Start docker localstack
docker run --rm -it -p 4566:4566 -p 4571:4571 -e SERVICES=events localstack/localstack

# Run integration tests
./build.sh BuildAndRunOSxIntegrationTests -buildConfiguration Debug -framework net6.0 -Filter AwsEventBridgeTests -SampleName Samples.AWS.EventBridge
```

I also did manual testing:

<img width="505" alt="Screenshot 2024-09-30 at 11 00 47 AM" src="https://github.com/user-attachments/assets/bdf5d516-8b46-4138-ac25-c45d1822dc56">

## Other details

There are lots of diffs and files changed. I recommend that reviewers review the PR commit by commit. All the autogenerated files were added in a single commit, which should make the review process less overwhelming.

---------

Co-authored-by: Steven Bouwkamp
What Does This Do
This creates a new instrumentation for EventBridge and intercepts `PutEventsRequest` to inject trace context. This allows the agent to combine spans from a distributed (serverless) architecture into a single trace.
This PR only injects trace context. I'm working on PR 1 and PR 2 to update the Lambda extension to use this trace context to create EventBridge spans.
I am working on similar PRs in dd-trace-dotnet and dd-trace-go.
Motivation
SNS and SQS are already supported, and the tracer currently injects trace context into message attributes fields for them. However, EventBridge wasn't supported, and this PR aims to fix this problem.
Additional Notes
Overall, AWS's EventBridge API is lacking some features, so we have to do some hacky solutions.
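- SNS and SQS call their custom input field `messageAttributes`, and EventBridge calls it `detail`.
- Unlike SNS and SQS, the `detail` field is given as a raw string. Therefore, we have to manually modify the `detail` string using StringBuilder.
- The agent has no reliable way of getting the start time of the EventBridge span, so the tracer has to put the current time into `detail[_datadog]` as `x-datadog-start-time`.
- The EventBridge API has no way of getting the EventBridge bus name, so the tracer has to put the bus name (which is used to create the span resource name) into `detail[_datadog]` as `x-datadog-resource-name`.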
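The string manipulation described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `EventBridgeInterceptor`: `DetailInjector`, `inject`, and `escape` are hypothetical names, the context is passed in as a plain map, and the sketch assumes `detail` is a single JSON object with no `_datadog` key already present. Only the `_datadog` key and the two header names come from the description.

```java
import java.util.Map;

/** Hypothetical helper illustrating the approach; not the PR's actual implementation. */
final class DetailInjector {
  private DetailInjector() {}

  /**
   * Appends a "_datadog" member holding the given context to a JSON object
   * supplied as a raw string. Input that does not look like a JSON object
   * is returned unchanged so the event is never corrupted.
   */
  static String inject(String detail, Map<String, String> context) {
    if (detail == null) {
      return null;
    }
    String trimmed = detail.trim();
    int last = trimmed.length() - 1;
    if (last < 1 || trimmed.charAt(0) != '{' || trimmed.charAt(last) != '}') {
      return detail;
    }

    StringBuilder sb = new StringBuilder(trimmed.length() + 128);
    // Copy everything up to, but not including, the closing brace.
    sb.append(trimmed, 0, last);
    // "{}" has no members yet, so no separating comma is needed.
    boolean emptyObject = trimmed.substring(1, last).trim().isEmpty();
    if (!emptyObject) {
      sb.append(',');
    }

    sb.append("\"_datadog\":{");
    boolean first = true;
    for (Map.Entry<String, String> entry : context.entrySet()) {
      if (!first) {
        sb.append(',');
      }
      first = false;
      sb.append('"').append(escape(entry.getKey())).append("\":\"")
          .append(escape(entry.getValue())).append('"');
    }
    // Close the _datadog object, then the original detail object.
    sb.append("}}");
    return sb.toString();
  }

  // Simplification: escapes only backslashes and double quotes, not control characters.
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

For example, calling `DetailInjector.inject` with the detail `{"foo":"bar"}` and a map containing `x-datadog-start-time` and `x-datadog-resource-name` yields the original members followed by a `_datadog` object carrying those two entries. Returning the input untouched when it is not a JSON object is a deliberately conservative choice in this sketch: a malformed `detail` still reaches EventBridge as the caller wrote it, just without trace context.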
Traces before these changes
Lambda --> EventBridge --> Lambda
Two different traces. Second trace is missing an EventBridge span
Lambda --> EventBridge --> SQS --> Lambda
Missing EventBridge span
Lambda --> EventBridge --> SNS --> Lambda
Missing EventBridge span
Traces after these (and agent's) changes
Lambda --> EventBridge --> Lambda
Lambda --> EventBridge --> SQS --> Lambda
Lambda --> EventBridge --> SNS --> Lambda
Contributor Checklist
- Assign the `type:` and (`comp:` or `inst:`) labels in addition to any useful labels
- Don't use `close`, `fix` or any linking keywords when referencing an issue. Use `solves` instead, and assign the PR milestone to the issue

Jira ticket: SVLS-5666