[Feature] Approximate count distinct #1682

Duncannn · 2025-02-28T12:23:01Z

Is this your first time submitting a feature request?

I have read the expectations for open source contributors
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing metricflow functionality, rather than a Big Idea better suited to a discussion

Describe the feature

There are cases where an exact count distinct is too computationally heavy, resulting in slow queries and high costs. Approximate counts can be much faster while remaining accurate enough for most analytical use cases. Example measure:

name: unique_events
description: Unique Events
expr: event_id
agg: count_distinct
agg_params:
  use_approximate: True

Describe alternatives you've considered

Plain count distincts aren't an option for us because of their performance. Building semantic models on top of pre-aggregated models would avoid that cost, but we would lose too much flexibility.

Who will this benefit?

No response

Are you interested in contributing this feature?

Yes

Anything else?

Approximate percentiles already exist in MetricFlow:
#395

Similar measure in Looker:

measure: apx_unique_count {
  type: count_distinct
  approximate: yes   # default value is no
  sql: ${id} ;;
}

Duncannn added the enhancement (New feature or request) and triage (Tasks that need to be triaged) labels on Feb 28, 2025
