update approx_topk page and create /sql-functions folder #91

DebashisBorgohainO2 · 2025-07-11T11:39:22Z

No description provided.

…dividual function pages under /sql-functions folder

hengfeiyang · 2025-07-11T13:59:14Z

docs/sql-functions/aggregate.md

+### `histogram`
+**Syntax**: `histogram(field, 'duration')`
+**Description:** <br>
+Use the `histogram` function to divide your time-based log data into time buckets of a fixed duration and then apply aggregate functions such as `COUNT()` or `SUM()` to those intervals.
+This helps in visualizing time-series trends and performing meaningful comparisons over time. <br><br>


`histogram` also supports `histogram(field)` without an interval; in that case the backend calculates the interval automatically.

We commonly call this parameter the interval, not the duration.
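
For readers of this thread, here is a minimal Python sketch of what bucketing by a fixed interval means. It is an illustration only, not the backend's implementation, and it does not show how the backend picks the automatic interval, which this thread does not describe. The helper names `bucket_start` and `count_per_bucket` are hypothetical.

```python
from collections import Counter


def bucket_start(timestamp_s: int, interval_s: int) -> int:
    """Floor a Unix timestamp (in seconds) to the start of its bucket."""
    return timestamp_s - (timestamp_s % interval_s)


def count_per_bucket(timestamps_s, interval_s):
    """Count events per fixed-width bucket, similar to COUNT() grouped by bucket."""
    counts = Counter(bucket_start(t, interval_s) for t in timestamps_s)
    return dict(sorted(counts.items()))


# Hypothetical event timestamps, grouped into 30-second buckets.
events = [0, 10, 29, 30, 61, 89, 90]
counts = count_per_bucket(events, 30)
print(counts)
```

Each key in the result is the start of a bucket, and each value is the number of events that fall inside it.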

hengfeiyang · 2025-07-11T14:01:21Z

docs/sql-functions/approximate-aggregate/approx-topk.md

+    Consider the following scenario:
+
+    - Dataset contains `3 million` unique client IPs.
+    - Query runs using `60` CPU cores.


This line should say that the query runs using 60 querier nodes.

Each node has 60 CPU cores, and each core handles one partition, so the total is:

3M * 60 (nodes) * 60 (cores/partitions)
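
The arithmetic in this comment can be spelled out as a short Python sketch. It assumes the worst case that the formula seems to imply: every partition may see all 3 million distinct client IPs. The variable names are illustrative and do not come from the docs.

```python
unique_ips = 3_000_000   # distinct client IPs in the dataset
querier_nodes = 60       # nodes running the query
cores_per_node = 60      # one partition per core

# Total number of partitions working on the query.
partitions = querier_nodes * cores_per_node

# Assumed worst case: each partition tracks every distinct IP.
worst_case_entries = unique_ips * partitions

print(partitions)          # 3600
print(worst_case_entries)  # 10800000000
```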

hengfeiyang · 2025-07-11T14:04:03Z

docs/sql-functions/approximate-aggregate/approx-topk.md

+When you run a query using `approx_topk()`, each query node processes a subset of the dataset and computes its local approximate top K values. These local top K values are sent to the leader node. The leader node merges them to generate the final approximate result. 
+
+Because each node sends only its local top K values, the final result may miss values that are frequent across the entire dataset but do not appear in the top K list of any single node. 


Each node does not send only its top K values to the leader. It actually sends
`max(k * 10, 1000)` values.

That means when you ask for the top 10, each node sends 1000 values to the leader.

Why not only the top 10?

Because an item may fall outside the top 10 on one node but be in the top 10 on another node, so we need some extra capacity to correct the final value.
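
The effect of that extra capacity can be shown with a small Python sketch. It uses exact counters and hypothetical helper names (`candidates_per_node`, `merge_top_k`); only the `max(k * 10, 1000)` rule comes from the comment above, and the real implementation may differ in every other respect.

```python
from collections import Counter


def candidates_per_node(k: int) -> int:
    """Number of values each node sends to the leader, per the rule above."""
    return max(k * 10, 1000)


def merge_top_k(node_counts, k, per_node):
    """Merge each node's local top `per_node` items and return the global top k."""
    merged = Counter()
    for counts in node_counts:
        for item, n in counts.most_common(per_node):
            merged[item] += n
    return [item for item, _ in merged.most_common(k)]


# "c" is never first on any single node but is the most frequent overall.
nodes = [
    Counter({"a": 10, "c": 9, "x": 1}),
    Counter({"b": 10, "c": 9, "y": 1}),
]

narrow = merge_top_k(nodes, k=1, per_node=1)                     # only local top 1
wide = merge_top_k(nodes, k=1, per_node=candidates_per_node(1))  # extra capacity
print(narrow, wide)
```

With only the local top 1 from each node, the leader never sees `"c"`. With the larger candidate list, the leader can add up the counts for `"c"` from both nodes and rank it first.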

hengfeiyang · 2025-07-11T14:06:30Z

docs/sql-functions/full-text-search.md

+### `str_match`
+
+**Syntax**: `str_match(field, 'value')` <br>
+**Alias**: `match_field(field, 'value')` 


`str_match` has an alias, `match_field`, and `str_match_ignore_case` has an alias, `match_field_ignore_case`.

update approx_topk page and restructure SQL References by creating in…

69aeccd

…dividual function pages under /sql-functions folder

DebashisBorgohainO2 requested a review from hengfeiyang July 11, 2025 11:40

hengfeiyang requested changes Jul 11, 2025
