-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUM-8042 Replace Upload Quality with Batch Blocked telemetry #2230
base: develop
Are you sure you want to change the base?
Conversation
229b5b9
to
09a608a
Compare
Datadog ReportBranch report: ✅ 0 Failed, 1033 Passed, 2853 Skipped, 1m 16.08s Total duration (1m 31.33s time saved) |
09a608a
to
f51cb4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🏅
/// Each track reports its own upload quality metrics. | ||
let uploadQuality: [String: UploadQuality] | ||
/// Information about the upload cycles during the session. | ||
/// The upload cycles is splitting between upload track name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\nit /// The upload cycles are splitting between upload track name.
5204914
aa1b8d9
to
31ae520
Compare
static let sampleRate: Float = 1.5 // 1.5% | ||
/// The key for uploader's current delay. | ||
static let uploaderDelayKey = "uploader_delay.current" | ||
/// The key for count of bacthes being blocked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The key for the number of batches being blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of small questions, otherwise LGMT!
DD.logger.debug("⏳ (\(self.featureName)) Uploading batches...") | ||
self.backgroundTaskCoordinator?.beginBackgroundTask() | ||
self.uploadFile(from: files.reversed(), context: context) | ||
sendUploadCycleMetric() | ||
} else { | ||
let batchLabel = files?.isEmpty == false ? "YES" : (isSystemReady ? "NO" : "NOT CHECKED") | ||
let batchLabel = files.isEmpty ? "NO" : "YES" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine the answer is no, but does not sending "NOT CHECKED" anymore have any consequences on our metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it doesn't, it was just for console printing. The drawback of this change is that we will check for batches even if the upload is blocked, but the impact is minimal as we only list the files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read this correctly, we use to send sendUploadQualityMetric
both in cases of success and failure. Now we send sendUploadCycleMetric
before every upload, and track failures with sendBatchBlockedMetric
. So we don't explicitly send a telemetry in case of success. I imagine we can still graph the success rate, but is this intentional?
let files = fileReader.readFiles(limit: maxBatchesPerUpload) | ||
|
||
if !files.isEmpty && isSystemReady { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to asses if this change might have a negative performance impact ?
I don't think it would because contentsOfDirectory
(urderlying Filesystem method used) only get URLs, and not the content of each file, but maybe it's worth checking using maxBatchesPerUpload
files everytime ?
Since we only use files
to get their count for the telemetry, it would be bad if this also had a performance impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The files
with their InputStream
will be needed for the self.uploadFile(from: files.reversed(), context: context)
31ae520
to
564152b
Compare
What and why?
Specifications of
Upload Quality
attributes inSession Ended
metric telemetry are not suitable for graphing.Instead, I propose adding a new Batch metric event to report batches blocked during upload. Currently, we have the
Batch Deleted
telemetry metric, from which we can compute the count of batches deleted due tointake-code-202
(successful upload). We can add aBatch Blocked
metric to report when a batch is blocked due to local blockers (low power, offline, etc.) or network failures. This will provide visibility into why batches are not uploaded and allow us to compute the ratio of successful uploads versus failures. This new telemetry event should be sampled at the same rate asBatch Deleted
: 1.5%.The complex
rse.upload_quality
attribute can then be replaced by a simplerse.upload_cycle.<logs|rum|*>
that track upload cycle per track. In this new definition, an Upload Cycle will include multiple batch upload.How?
Similar to
BatchDeletedMetric
, theBatchBlockedMetric
will report reasons for blocking batch upload, or failures to upload (network or http)Review checklist
make api-surface
)