Apply low selectivity matchers lazily in ingester #7063
It looks promising. Can you share some benchmarks?
    "no optimization needed with .* regex returns all as select": {
        input: []*labels.Matcher{
            labels.MustNewMatcher(labels.MatchRegexp, "job", ".*"),
            labels.MustNewMatcher(labels.MatchEqual, "instance", "server1"),
        },
        expectedSelect: []*labels.Matcher{
            labels.MustNewMatcher(labels.MatchRegexp, "job", ".*"),
            labels.MustNewMatcher(labels.MatchEqual, "instance", "server1"),
        },
        expectedLazy: nil,
In this case, do we need to keep the .*?
We can keep .* because PostingsForMatcher will ignore it anyway.
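For context on why keeping `.*` is harmless: Prometheus-style regex matchers are fully anchored, and a series that lacks a label is treated as having an empty value for it. So `.*` accepts every series, while `.+` rejects series where the label is absent or empty. Below is a minimal sketch using only Go's standard `regexp` package. The `matches` helper is hypothetical and is not Cortex or Prometheus code.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether value fully matches pattern, mimicking the
// anchoring that Prometheus-style regex matchers apply.
// Hypothetical helper, for illustration only.
func matches(pattern, value string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(value)
}

func main() {
	// A series without the "job" label is seen as job="".
	fmt.Println(matches(".*", ""))    // true: .* keeps series without the label
	fmt.Println(matches(".+", ""))    // false: .+ drops series without the label
	fmt.Println(matches(".+", "api")) // true
}
```

Because `.*` cannot exclude any series, it makes no difference whether it is evaluated eagerly or deferred.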
@SungJin1212 Let me add a benchmark test and share the results.
Benchmark results: I ran each test case with and without the optimization. For the sparse label test case, performance becomes worse, but that is expected and rare. In most cases, if there are no empty values, this optimization helps.
@yeya24
Signed-off-by: yeya24 <[email protected]>
Signed-off-by: Ben Ye <[email protected]>
68e6e11 to a564dd2
What this PR does:
This PR brings functionality similar to the Store Gateway's lazy postings to the Ingester.
We have found that in the Ingester, a low selectivity matcher like `=~".+"` in most cases cannot filter out anything, yet it can be extremely expensive, because it requires fetching the postings for all values of the label and intersecting them with the other postings.
Here is one example CPU profile where we spend 50% of CPU on `PostingsForAllLabelValues`, which is used to evaluate the `=~.+` matcher.
Because of the low selectivity of such a matcher, it seems cheaper to apply the label matcher lazily on the selected series set instead. This PR adds a new config `enable_matcher_optimization` to the ingester to enable this matcher optimization.
Which issue(s) this PR fixes:
Fixes #
Checklist
`CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`
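The lazy matcher idea described under "What this PR does" can be sketched roughly as follows. This is a simplified illustration, not the code in this PR. The `Matcher` and `Series` types, `splitMatchers`, and `query` are hypothetical stand-ins. The rule used here (defer only `=~".+"`, and only when another matcher remains to select series) is an assumption. A linear scan stands in for the index postings lookup.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matcher is a simplified stand-in for a label matcher (hypothetical type).
type Matcher struct {
	Name  string
	Regex bool // true for =~, false for =
	Value string
}

// Matches applies the matcher to a label value; a missing label is "".
func (m Matcher) Matches(v string) bool {
	if !m.Regex {
		return m.Value == v
	}
	return regexp.MustCompile("^(?:" + m.Value + ")$").MatchString(v)
}

// Series is a label set; looking up a missing label yields "".
type Series map[string]string

// splitMatchers separates matchers evaluated up front (sel) from matchers
// deferred until after selection (lazy). Assumed rule: only =~".+" is deferred.
func splitMatchers(ms []Matcher) (sel, lazy []Matcher) {
	for _, m := range ms {
		if m.Regex && m.Value == ".+" {
			lazy = append(lazy, m)
		} else {
			sel = append(sel, m)
		}
	}
	if len(sel) == 0 {
		// Nothing selective remains, so evaluate everything eagerly.
		return ms, nil
	}
	return sel, lazy
}

// filter keeps the series that satisfy every matcher in ms.
func filter(in []Series, ms []Matcher) []Series {
	var out []Series
	for _, s := range in {
		ok := true
		for _, m := range ms {
			if !m.Matches(s[m.Name]) {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, s)
		}
	}
	return out
}

// query selects series with the selective matchers first, then applies the
// lazy matchers only to that (usually much smaller) selected set.
func query(all []Series, ms []Matcher) []Series {
	sel, lazy := splitMatchers(ms)
	selected := filter(all, sel) // stands in for the postings lookup
	return filter(selected, lazy)
}

func main() {
	all := []Series{
		{"job": "api", "instance": "server1"},
		{"instance": "server1"}, // no job label
		{"job": "api", "instance": "server2"},
	}
	ms := []Matcher{
		{Name: "job", Regex: true, Value: ".+"},
		{Name: "instance", Value: "server1"},
	}
	sel, lazy := splitMatchers(ms)
	fmt.Println(len(sel), len(lazy), len(query(all, ms)))
}
```

In this sketch the `job=~".+"` check runs only against the two series selected by `instance="server1"`, rather than requiring the postings of every `job` value to be fetched and intersected first.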