Change the usec_per_call operation from float to long long division #1888
base: unstable
Conversation
The usec_per_call is a simple average metric that divides the total execution time by the number of calls. However, it was performing the calculation by converting long long to float. usec_per_call does not require float-level precision, for two reasons:
1. There is little reason to know the average value in units smaller than a microsecond.
2. It currently only displays two digits after the decimal point anyway.
Therefore, I propose changing the implementation to a simple long long division, without the floating-point conversion and computation.
This change will benefit users who run a variety of commands in valkey and consume high-resolution metrics from commandstats.
Signed-off-by: Jungwoo Song <[email protected]>
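A minimal standalone sketch of the difference (illustrative counter values, not taken from the codebase):

#include <stdio.h>

int main(void) {
    /* Example counters: 36 microseconds of execution time across 2 calls. */
    long long usec = 36, calls = 2;

    /* Current formatting: convert to float and print two decimals. */
    printf("usec_per_call=%.2f\n", (float)usec / calls); /* usec_per_call=18.00 */

    /* Proposed formatting: plain long long division, no decimals. */
    printf("usec_per_call=%lld\n", usec / calls);        /* usec_per_call=18 */

    return 0;
}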
Force-pushed from 2f9cb74 to 017fbe6.
Codecov Report: All modified and coverable lines are covered by tests ✅

@@             Coverage Diff              @@
##           unstable    #1888      +/-   ##
============================================
- Coverage     71.06%   71.02%    -0.04%
============================================
  Files           123      123
  Lines         65671    65671
============================================
- Hits          46669    46646       -23
- Misses        19002    19025       +23
============================================
Please mention in the PR title or top comment which command is affected by this change.
A possible workaround: you can call INFO with a specific section (e.g. INFO server) to avoid computing the commandstats section at all.
@@ -5490,10 +5490,10 @@ sds genValkeyInfoStringCommandStats(sds info, hashtable *commands) {
     char *tmpsafe;
     if (c->calls || c->failed_calls || c->rejected_calls) {
         info = sdscatprintf(info,
-                            "cmdstat_%s:calls=%lld,usec=%lld,usec_per_call=%.2f"
+                            "cmdstat_%s:calls=%lld,usec=%lld,usec_per_call=%lld"
This will change the format slightly, from 1.00 to 1.
I think it's right to remove the sub-microsecond part. We only measure execution time in microseconds anyway, so the value is already not exact at the sub-microsecond level.
We should probably label it as a potentially breaking change though, in case some tool is parsing this field and expects the decimal point to be there.
I think sub-microsecond latency is useful. Why not modify getMonotonicUs to return nanoseconds instead (getMonotonicNs)? Both __rdtsc and __cntvct can provide sub-microsecond precision.
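For illustration, a minimal sketch of what such a nanosecond variant could look like on a POSIX backend (getMonotonicNs is hypothetical here; the real monotonic clock could also be built on the __rdtsc/__cntvct backends mentioned above):

#include <stdint.h>
#include <time.h>

typedef uint64_t monotime;

/* Hypothetical getMonotonicNs(): same shape as getMonotonicUs(),
 * but returning nanoseconds from the POSIX monotonic clock. */
static monotime getMonotonicNs(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}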
I have some concerns about this change. It breaks existing behavior: commands that take <1µs will now just show 0, which looks off, and we lose useful precision, especially when profiling fast commands. I still don't quite understand why this change is needed; could you provide an example where the current float-based calculation causes issues?
UPDATE: I saw that the measurement precision is in µs, which means the reported usec_per_call is already not accurate. So we either fix the accuracy or we remove these decimal digits (which is what this change is doing).
Affected command by this change
INFO commandstats
Changed results example
cmdstat_set:calls=2,usec=36,usec_per_call=18,rejected_calls=1,failed_calls=0
cmdstat_get:calls=3,usec=13,usec_per_call=4,rejected_calls=1,failed_calls=0
cmdstat_info:calls=6,usec=444,usec_per_call=74,rejected_calls=0,failed_calls=0
cmdstat_command|docs:calls=2,usec=3951,usec_per_call=1975,rejected_calls=0,failed_calls=0
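For comparison, the same counters under the current %.2f formatting would read as follows (derived from the counters above, not captured output):

cmdstat_set:calls=2,usec=36,usec_per_call=18.00,rejected_calls=1,failed_calls=0
cmdstat_get:calls=3,usec=13,usec_per_call=4.33,rejected_calls=1,failed_calls=0
cmdstat_info:calls=6,usec=444,usec_per_call=74.00,rejected_calls=0,failed_calls=0
cmdstat_command|docs:calls=2,usec=3951,usec_per_call=1975.50,rejected_calls=0,failed_calls=0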