Skip to content

Conversation

anahatAWS
Copy link
Contributor

@anahatAWS anahatAWS commented Jun 3, 2025

Issue

The current ADOT Java SDK implementation relies on a combination of OpenTelemetry SPI and Git patches to extend the OTel SDK functionality. This approach presents several challenges:

  • Reduced modularity and maintainability
  • Increased risk of errors during OTel SDK version upgrades
  • Manual intervention required for patch management
  • Limited ecosystem compatibility with upstream OpenTelemetry
  • Difficulty in extending functionality for users

Description of changes:

This PR introduces a new SPI-based implementation for AWS SDK v2.2 instrumentation that:

  1. Leverages OpenTelemetry's InstrumentationModule extension mechanism
  2. Replaces Git patches with proper SPI implementation
  3. Ensures AWS-specific instrumentation runs after core OpenTelemetry instrumentation
  4. Maintains all existing functionality while improving maintainability

The SPI instrumentation module is under 'instrumentation/aws-sdk.' There is also a custom 'TracingExecutionInterceptor' for this implementation for ensure correct instrumentation for files that required patching previously.

Testing

To validate the SPI implementation, I utilized the existing AppSignals contract tests. These tests were originally designed to verify AWS SDK v2 instrumentation behaviour. The tests specifically assert that AWS-specific span attributes are correctly captured in the mock collector (i.e. tests git patching logic).

By removing the AWS SDK v2.2 git patching logic from the patching script and running these same contract tests, I could verify that the SPI implementation maintains identical instrumentation behaviour without requiring patches. The successful passing of all AWS SDK v2 contract tests confirms that the SPI implementation correctly generates the expected spans and attributes, effectively replacing the previous patching mechanism.

Benefits

  • Improved Maintainability: Clear separation between OTel core and AWS-specific instrumentation
  • Better Extensibility: Users can more easily extend or modify AWS-specific behavior
  • Reduced Risk: Eliminates manual patching during OTel upgrades
  • Enhanced Compatibility: Better alignment with OpenTelemetry's extension mechanisms
  • Clearer Code Organization: More intuitive structure for future contributions

Next Steps

  • Add similar SPI implementation for AWS SDK v1.1
  • Update documentation to reflect new extension mechanism

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@anahatAWS anahatAWS requested a review from a team as a code owner June 3, 2025 18:42
@anahatAWS
Copy link
Contributor Author

I have updated my branch with changes to account for the feedback given. This implementation of the aws-sdk v2.2 instrumentation passes the v2 contract tests in ADOT as well.

dependencies {

compileOnly("software.amazon.awssdk:json-utils:2.17.0")
compileOnly("com.google.code.findbugs:jsr305:3.0.2")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, lines 26 and 27 are needed for AwsJsonProtocolFactoryAccess and BedrockJsonParser


private final FieldMapper fieldMapper = new FieldMapper();

// This is the latest point we can obtain the Sdk Request after it is modified by the upstream
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this may not be the right understanding and we can double check with Ping.

My understanding: The upstream instrumentation precedes our patching, as defined by the sequence in AdotAwsSdkInstrumentationModule. Within the AWS SDK request lifecycle, our patching intercepts the beforeTransmission and afterUnmarshalling events to inject AWS attributes according to the established instrumentation order.

request(AWS_STATE_MACHINE_ARN.getKey(), "stateMachineArn"),
request(AWS_STEP_FUNCTIONS_ACTIVITY_ARN.getKey(), "activityArn")),

// SNS(request(AWS_SNS_TOPIC_ARN.getKey(), "TopicArn")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help remove it? As it is moved to line 68. Thanks.

// This is the earliest point we can obtain the Sdk Response before it is modified by the upstream
// interceptor. This ensures the execution attribute (AWS_SDK_REQUEST_ATTRIBUTE) added in by the
// interceptor is handled only by this interceptor, and not the upstream interceptor.
@Override
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this action to afterExecution as done by upstream so that our changes/updates come after upstream? So far, we don't modify attributes set by OTel, so the order doesn't matter too much. But if we did, based on this implementation, the modified attributes in afterUnmarshalling will be overwritten by OTel in afterExecution step.

Update:
In Otel interceptor, it clearsAWS_SDK_REQUEST_ATTRIBUTE in executionAttributes duringafterExecution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had tried to instrument afterExecution initially but the upstream clears all the execution attributes. I think using afterUnmarshalling ensures that the interceptor that put in the execution attributes last (i.e. AdotTracingExecutionInterceptor) should be the first to handle the execution attributes. This also ensures the otel span is available to modify at this point of interception.

## ADOT AWS SDK Instrumentation

### Overview
This instrumentation is an SPI-based implementation, in place of ADOT's git patches, to extend the AWS SDK within the upstream OpenTelemetry Instrumentation for Java.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing detailed explanation. It helps reviewing the PR easier. However, for people who have no context on the patching mechanism, especially after we implemented SPI, the term "git patches" might be unclear. Let's remove those references and focus on describing the specific patches being added.

compileOnly("software.amazon.awssdk:lambda:2.2.0")
compileOnly("software.amazon.awssdk:aws-json-protocol:2.2.0")
compileOnly("software.amazon.awssdk:sfn:2.2.0")
compileOnly("software.amazon.awssdk:lambda:2.2.0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated from line 39

testImplementation("org.assertj:assertj-core:3.24.2")
testImplementation("org.mockito:mockito-junit-jupiter:3.12.4")

// AWS SDK test dependencies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove them as they are not used for BedrockJsonParserTest


testImplementation("io.opentelemetry.javaagent:opentelemetry-javaagent-extension-api")
testImplementation("io.opentelemetry.instrumentation:opentelemetry-instrumentation-api")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review the dependencies to remove the non-effective ones. Thanks.

@anahatAWS anahatAWS closed this Jun 27, 2025
@anahatAWS anahatAWS deleted the spi branch August 7, 2025 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants