Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ServiceSelection: add support for aggregated metric #713

Merged
merged 118 commits into from
Sep 19, 2024

Conversation

gtk-grafana
Copy link
Contributor

@gtk-grafana gtk-grafana commented Aug 21, 2024

Fixes: #711

Provides experimental support for aggregated metrics within explore logs with the exploreLogsAggregatedMetrics feature flag in Grafana.

Aggregated metrics trade off resolution for much more performant queries.
Using the gdev-loki docker image in this repo:

  • For sum of durations of top 8 queries in the service selection:
    • without aggregated metrics: ~1000ms
    • with aggregated metrics: ~130ms

Using the Grafana dev cluster on a 5 minute query (sum of duration of top 8 queries in service selection):

  • without aggregated metrics: ~5400ms
  • with aggregated metrics: ~600ms

For a 12hr query (sum of duration of top 2 queries in service selection, each broken up into 6 2hr chunks):

  • without aggregated metrics: ~47s
    • Slowest query was over 9s
  • with aggregated metrics: ~450ms
    • slowest query was under 80ms

Here's it in action on a 24hr query:
https://github.com/user-attachments/assets/a3e24f6e-268a-45ea-aa91-2e805d95f0ab

Notes for reviewer:

  • Added new query options to controls that is only visible on the service selection scene, allows users to toggle the state of aggregated metrics.
  • e2e tests are currently passing with and without the ff set

gtk-grafana and others added 30 commits August 6, 2024 11:11
Copy link
Contributor

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good! curious what the labels call is for now that we've gone back to the regular /volume call using service_name=~".+"? I'm fine leaving the resource type and logic to run a labels query (as we need it for __stream_shard__ and I'm sure for other things in the future), but I don't think we need to add the state to the ServiceSelection scene unless I'm missing something?

package.json Show resolved Hide resolved
const aggregatedMetricsEnabled: boolean | undefined = config.featureToggles.exploreLogsAggregatedMetrics;
// Don't export AGGREGATED_SERVICE_NAME, we want to rename things so the rest of the application is agnostic to how we got the services
const AGGREGATED_SERVICE_NAME = '__aggregated_metric__';
const AGGREGATED_METRIC_START_DATE = dateTime('2024-08-30', 'YYYY-MM-DD');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventually we may want this to be configurable (ie. OSS users), but I'm totally fine with this for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, @svennergr is there a way we can configure this in deployment tools with an env variable or something so OSS users can set their own date?

@gtk-grafana gtk-grafana marked this pull request as ready for review September 17, 2024 17:50
src/Components/IndexScene/ToolbarScene.tsx Outdated Show resolved Hide resolved
src/Components/IndexScene/ToolbarScene.tsx Outdated Show resolved Hide resolved
tests/exploreServices.spec.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@matyax matyax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go!

@gtk-grafana gtk-grafana enabled auto-merge (squash) September 19, 2024 13:04
@gtk-grafana gtk-grafana merged commit 7fc099d into main Sep 19, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ServiceSelection: use __aggregated_metric__
4 participants