Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added segment EndTime check on close to support VirtualSegments… #322

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

feat: added segment EndTime check on close to support VirtualSegments… #322

wants to merge 1 commit into from

Conversation

tsypuk
Copy link

@tsypuk tsypuk commented Jul 23, 2021

…(SNS-SQS fanout trace recover)

Issue #, if available:
#321

Description of changes:
When event is sent to SNS from awslib wrapped with xray using WithContext, XRAY adds TraceHeader X-Amzn-Trace-Id.
Current it is not supported by AWS Lambda and xray library to automatically recover the TraceHeader and assign it as Parent SegmentID for the current Runtime (Lambda), etc. It is possible to recover trace header manually, create new Segment and assign TraceHeader SegmentID as ParentID for current segment. This will work for SQS integration and xray propagation scenario.

But in case if you are using SNS - to - SQS fanout: SNS propagates xray TraceHeader, but nor SNS nor SQS create Segment is XRAY. For such scenario it is possible to receive TraceHeader in SQS event. This TraceHeader is linked to SNS Service and its segments. SQS event contains information that can be used to create SQS Virtual segment (there is information in SQSEvent with metrics about both SNS and SQS timing, and based on them we can create exact startTime and endTime for SQS Virtual segment), map it to SNS pre-fanout Segment and fully recreate the trace chain. All further Segments in the current context will be mapped to SQS VirtualSegment.

The problem is that Xray lib automatically updates EndTime on Segment Close operation, which will make Time of VirtualSegment incorrect (with endTime set to NOW always).

Propose to automaticaly udpate EndTime only when it is not set, allowing to create Virtual segments for scenarios like SNS-SQS VirtualSegment fanout, etc.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@bhautikpip
Copy link
Contributor

I feel explicitly setting end time when we are closing the segment would be more appropriate and avoid cases of stale end time values. However, I am not able to understand why you can not use the now time set by the SDK even in SNS -> SQS scenario? Do you have an example which I can reproduce? I feel the condition of only using now time when end time is zero is a little bit hacky and does not display correct behavior of setting the end time.

@tsypuk
Copy link
Author

tsypuk commented Aug 10, 2021

@bhautikpip To reproduce please try to consume any message from SQS or SNS and you will see that they are not traced in XRAY AWS Console at all. You will not see SQS or SNS node, you will see node named "Client".

There is workaround manually read infromation about traceID from the header, create artificial segment and map its parentID to traceID from header. I have described it in details in upper message.

AWS still can not support rendering of traces lambda consumer side from SNS and SQS.

But to make it work properly you need to update Lambda runtime codebase (Lambda right now creates new Segment and does not read metadata from header of XRAY, that is why parent SegmentID is always lost.)

Unfortunately Lambda runtime repo is not public available. So the only possibiity for end-use of AWS is manually create Segment in Lambda, read traceID from header and map it as parent. Following this approach we can also render segments for SQS and SNS, but if we will use endTime = NOW we will loose the timing informations - SQS or SNS node will represent not actual timing of message, but time when lambda was executed. By the way if you check the headers of SNS message or SQS event that are sent from Lambda that has xray, they contain not only XRAY header but all needed details about timing of message in every messaging system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants