[Feature] Approximate count distinct #1682

Open
3 tasks done
Duncannn opened this issue Feb 28, 2025 · 0 comments
Labels
enhancement (New feature or request), triage (Tasks that need to be triaged)

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing metricflow functionality, rather than a Big Idea better suited to a discussion

Describe the feature

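An exact count distinct can be too computationally heavy on large datasets, resulting in slow queries and high costs. Approximate distinct counts, which several warehouses expose through functions such as APPROX_COUNT_DISTINCT, can be much faster while staying close to the exact value. Proposed measure configuration: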

name: unique_events
description: Unique Events
expr: event_id
agg: count_distinct
agg_params:
  use_approximate: True
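
To make the intent of the proposed flag concrete, here is a minimal sketch of how a renderer might translate it into SQL. Everything in it is an assumption for illustration: `render_count_distinct` and `APPROX_TEMPLATES` are hypothetical names, not MetricFlow APIs, and the dialect table only lists functions those engines are known to provide.

```python
# Hypothetical sketch: none of these names come from MetricFlow itself.
# Approximate distinct-count syntax per SQL dialect.
APPROX_TEMPLATES = {
    "bigquery": "APPROX_COUNT_DISTINCT({expr})",
    "snowflake": "APPROX_COUNT_DISTINCT({expr})",
    "trino": "approx_distinct({expr})",
    "redshift": "APPROXIMATE COUNT(DISTINCT {expr})",
}


def render_count_distinct(measure: dict, dialect: str) -> str:
    """Render a count_distinct measure, honoring the proposed use_approximate flag."""
    if measure.get("agg") != "count_distinct":
        raise ValueError("only count_distinct measures are handled in this sketch")
    expr = measure["expr"]
    agg_params = measure.get("agg_params") or {}
    use_approximate = bool(agg_params.get("use_approximate", False))
    template = APPROX_TEMPLATES.get(dialect)
    if use_approximate and template is not None:
        return template.format(expr=expr)
    # Fall back to the exact aggregation when the flag is off
    # or the dialect has no known approximate function.
    return f"COUNT(DISTINCT {expr})"


# The measure from the proposal above, expressed as a plain dict.
measure = {
    "name": "unique_events",
    "expr": "event_id",
    "agg": "count_distinct",
    "agg_params": {"use_approximate": True},
}
print(render_count_distinct(measure, "bigquery"))
print(render_count_distinct(measure, "postgres"))
```

Falling back to the exact aggregation for engines without an approximate function is one possible design choice; raising an error instead would make the approximation explicit to the user.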

Describe alternatives you've considered

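Plain count distincts aren't an option for us due to performance. Building semantic models on top of pre-aggregated models would avoid the expensive query, but distinct counts cannot be re-aggregated across other dimensions or time grains, so we would lose too much flexibility.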
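
The flexibility concern comes from distinct counts not being additive. A small sketch with made-up rows (the dates and event IDs are purely illustrative) shows how rolling up pre-aggregated daily counts overstates the total:

```python
# Hypothetical event rows: (day, event_id). Event "e2" occurs on both days.
events = [
    ("2025-02-27", "e1"),
    ("2025-02-27", "e2"),
    ("2025-02-28", "e2"),
    ("2025-02-28", "e3"),
]

# Pre-aggregate: one distinct count per day, as a daily rollup model would store.
daily_ids = {}
for day, event_id in events:
    daily_ids.setdefault(day, set()).add(event_id)
daily_distinct = {day: len(ids) for day, ids in daily_ids.items()}

# Re-aggregating the rollup by summing double-counts "e2".
sum_of_daily = sum(daily_distinct.values())

# The correct answer requires going back to the raw rows.
true_distinct = len({event_id for _, event_id in events})

print(daily_distinct, sum_of_daily, true_distinct)
```

Summing the daily rollup gives 4, while only 3 distinct events exist, so any grain or dimension not baked into the pre-aggregated model needs the raw data again.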

Who will this benefit?

No response

Are you interested in contributing this feature?

Yes

Anything else?

Approximate percentiles already exist in MetricFlow:
#395

Similar measure in Looker:

measure: apx_unique_count {
  type: count_distinct
  approximate: yes   # default value is no
  sql: ${id} ;;
}

Duncannn added the enhancement and triage labels on Feb 28, 2025