eBPF-based format profiling for Vortex #8337
m7kss1
started this conversation in
Infrastructure
Replies: 3 comments
-
|
That sounds like a great idea. The observability story in Vortex is currently somewhat neglected, but this might be a good feature to motivate us to make it better. |
-
|
actually, i've already vibecoded a PoC implementation for the datafusion engine using aya
note: the numbers below are illustrative and were collected on a debug build:
$ sudo -E vx profile query lineitem_7.vortex --sql "select sum(l_extendedprice) from data where l_shipdate < date '1995-01-01'"
...
vx profile: attached 12 markers to pid 3060932
{
"engine": "datafusion",
"metrics": {
"cold.major_faults": 0,
"cold.page_cache_hit_ratio": 1.0,
"cold.storage_read_bytes": 0,
"decode.total_calls": 91,
"decode.total_ms": 1088.0898,
"decode.vortex.binary.bytes": 0,
"decode.vortex.binary.calls": 2,
"decode.vortex.binary.mb_per_sec": 0.0,
"decode.vortex.binary.ms": 0.0966,
"decode.vortex.binary.rows": 2,
"decode.vortex.binary.rows_per_sec": 20705.8629,
"decode.vortex.constant.bytes": 74,
"decode.vortex.constant.calls": 19,
"decode.vortex.constant.mb_per_sec": 0.2018,
"decode.vortex.constant.ms": 0.3667,
"decode.vortex.constant.rows": 19,
"decode.vortex.constant.rows_per_sec": 51818.5587,
"decode.vortex.filter.bytes": 0,
"decode.vortex.filter.calls": 34,
"decode.vortex.filter.mb_per_sec": 0.0,
"decode.vortex.filter.ms": 93.7518,
"decode.vortex.filter.rows": 1354792,
"decode.vortex.filter.rows_per_sec": 14450839.5227,
"decode.vortex.pco.bytes": 11003328,
"decode.vortex.pco.calls": 35,
"decode.vortex.pco.mb_per_sec": 11.0714,
"decode.vortex.pco.ms": 993.8516,
"decode.vortex.pco.rows": 3152593,
"decode.vortex.pco.rows_per_sec": 3172096.4255,
"filter.conjunct.0.ms": 1057.7027,
"filter.conjunct.0.rows_in": 1576200,
"filter.conjunct.0.rows_kept": 677396,
"filter.conjunct.0.selectivity": 0.4298,
"filter.rows_in": 1576200,
"filter.rows_kept": 677396,
"filter.selectivity": 0.4298,
"io.coalescing_factor_avg": 4.2,
"io.logical_segment_bytes": 6412096,
"io.physical_read_bytes": 6722036,
"io.physical_reads": 5,
"io.read_amplification": 1.0483,
"io.read_size_p50_bytes": 64.0,
"io.read_size_p99_bytes": 2097152.0,
"io.read_syscall_bytes": 6796154,
"io.read_syscalls": 59,
"io.segment_requests": 21,
"io.segment_size_p50_bytes": 262144.0,
"io.segment_size_p99_bytes": 262144.0,
"io.tiny_reads": 54,
"memory.rss_peak_bytes": 1428934656,
"memory.rss_peak_delta_bytes": 1363107840,
"metadata.footer_bytes": 65535,
"metadata.footer_reads": 1,
"pruning.conjunct.0.ms": 36.4686,
"pruning.conjunct.0.rows_in": 1510664,
"pruning.conjunct.0.rows_kept": 1510664,
"pruning.pruned_ratio": 0.0,
"pruning.rows_in": 1576200,
"pruning.rows_kept": 1576200,
"pushdown_fallback.total": 17,
"pushdown_fallback.vortex.filter=>vortex.pco.count": 17,
"scan.count": 1,
"scan.rows_out": 677396,
"scan.split_duration_p50_ms": 67.1089,
"scan.split_duration_p99_ms": 67.1089,
"scan.split_peak_concurrent": 17,
"scan.splits": 3,
"scan.splits_pruned": 0
},
"query": "select sum(l_extendedprice) from data where l_shipdate < date '1995-01-01'",
"target": "lineitem_7.vortex",
"wall_ms": 249.2569
} |
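for readers who want to sanity-check the output: several of the reported fields look like simple functions of the raw counters next to them. the sketch below recomputes a few of them in python. the formulas (and the helper names `rate_per_sec` and `ratio`) are assumptions inferred from the numbers, not taken from the PoC source; in particular it assumes `mb_per_sec` uses decimal megabytes.

```python
def rate_per_sec(count: float, elapsed_ms: float) -> float:
    """Events per second for a count observed over elapsed_ms milliseconds."""
    return count / (elapsed_ms / 1000.0) if elapsed_ms > 0 else 0.0


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio that returns 0.0 instead of dividing by zero."""
    return numerator / denominator if denominator else 0.0


# Raw counters copied from the profile output above.
raw = {
    "decode.vortex.pco.bytes": 11003328,
    "decode.vortex.pco.ms": 993.8516,
    "decode.vortex.pco.rows": 3152593,
    "filter.rows_in": 1576200,
    "filter.rows_kept": 677396,
    "io.logical_segment_bytes": 6412096,
    "io.physical_read_bytes": 6722036,
}

# Assumed formulas; they are inferred from the numbers, not from the PoC code.
derived = {
    "decode.vortex.pco.rows_per_sec": rate_per_sec(
        raw["decode.vortex.pco.rows"], raw["decode.vortex.pco.ms"]
    ),
    "decode.vortex.pco.mb_per_sec": rate_per_sec(
        raw["decode.vortex.pco.bytes"], raw["decode.vortex.pco.ms"]
    ) / 1e6,  # assumes decimal megabytes (10**6 bytes)
    "filter.selectivity": ratio(raw["filter.rows_kept"], raw["filter.rows_in"]),
    "io.read_amplification": ratio(
        raw["io.physical_read_bytes"], raw["io.logical_segment_bytes"]
    ),
}

for name, value in derived.items():
    print(f"{name}: {value:.4f}")
```

the recomputed values land close to the reported ones. small differences are expected because the `ms` fields in the output are already rounded.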
-
|
That looks really promising, do you have a branch you can share? |
-
the idea is to expose a small number of semantic instrumentation points in vortex: scan boundaries, segment requests, coalesced reads, pruning/filter results, decode spans, and canonicalization fallbacks. eBPF programs can attach to these points with uprobes, aggregate counters in kernel maps, and combine them with kernel signals such as read syscalls, read-size histograms, and optional PMU counters.
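to make the "decode spans aggregated in maps" part concrete, here is a minimal user-space model in python. it is not eBPF code and not the PoC: `SpanAggregator`, `DecodeStats`, and the begin/end handler names are hypothetical, and the two dicts only stand in for the kind of maps an eBPF program could keep (one for in-flight spans keyed by thread id, one for per-encoding totals).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DecodeStats:
    """Per-encoding totals; models the value type of an aggregation map."""
    calls: int = 0
    rows: int = 0
    nbytes: int = 0
    total_ns: int = 0


class SpanAggregator:
    """Hypothetical model of begin/end probe handlers sharing two maps."""

    def __init__(self) -> None:
        # thread id -> (encoding, start timestamp); models a map keyed by tid
        self.in_flight: dict[int, tuple[str, int]] = {}
        # encoding name -> running totals
        self.stats: defaultdict[str, DecodeStats] = defaultdict(DecodeStats)

    def on_decode_begin(self, tid: int, encoding: str, now_ns: int) -> None:
        # What a probe at the start of a decode span would record.
        # Simplification: a nested decode on the same thread overwrites the entry.
        self.in_flight[tid] = (encoding, now_ns)

    def on_decode_end(self, tid: int, rows: int, nbytes: int, now_ns: int) -> None:
        # What a probe at the end of a decode span would record.
        entry = self.in_flight.pop(tid, None)
        if entry is None:
            return  # end without a matching begin (e.g. attached mid-call): drop it
        encoding, start_ns = entry
        totals = self.stats[encoding]
        totals.calls += 1
        totals.rows += rows
        totals.nbytes += nbytes
        totals.total_ns += now_ns - start_ns


agg = SpanAggregator()
agg.on_decode_begin(tid=1, encoding="vortex.pco", now_ns=1_000)
agg.on_decode_begin(tid=2, encoding="vortex.pco", now_ns=2_000)
agg.on_decode_end(tid=2, rows=4096, nbytes=32768, now_ns=4_500)
agg.on_decode_end(tid=1, rows=4096, nbytes=32768, now_ns=5_000)
agg.on_decode_end(tid=3, rows=1, nbytes=8, now_ns=9_000)  # unmatched, ignored
print(agg.stats["vortex.pco"])
```

the point of the shape is that the instrumented code only emits begin/end events; pairing, timing, and summation all happen in the handlers, which is the work an eBPF program would do outside the reader's hot path.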
note: this is not meant to replace perf/flamegraphs etc. those tools are good at showing where cpu time is spent. they do not directly explain format-level behavior: whether a layout caused more physical reads, whether pruning became less effective, which encoding dominated decode cost etc
the kind of statistics this could collect:
the main advantage over manual instrumentation is that most aggregation stays outside the hot path. vortex only exposes stable semantic events; eBPF collects and aggregates them without threading metric state through every reader, executor, and engine integration. but most importantly, this approach makes the same counters usable across the different workloads and query engines that vortex integrates with
my view is that this would complement existing benchmarks and profilers: benchmarks tell us that something changed, flamegraphs tell us where cpu went, but format profiling could explain what changed in the format behavior
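one way "explain what changed" could look in practice is a diff over two profile runs. the sketch below is only an illustration: `diff_metrics` and its 10% threshold are hypothetical, the baseline values are copied from the profile posted in this thread, and the candidate values are made up to stand in for an imagined layout change.

```python
def diff_metrics(baseline: dict, candidate: dict, threshold: float = 0.10) -> list:
    """Return (name, before, after, relative_change) for metrics present in
    both runs whose relative change is at least `threshold`, largest first."""
    changes = []
    for name in sorted(baseline.keys() & candidate.keys()):
        before, after = baseline[name], candidate[name]
        if before == after:
            continue
        # A metric that moves away from zero is reported as an infinite change.
        rel = (after - before) / abs(before) if before else float("inf")
        if abs(rel) >= threshold:
            changes.append((name, before, after, rel))
    changes.sort(key=lambda change: abs(change[3]), reverse=True)
    return changes


# Baseline: values taken from the profile output posted in this thread.
baseline = {
    "io.read_amplification": 1.0483,
    "io.tiny_reads": 54,
    "pruning.pruned_ratio": 0.0,
    "decode.vortex.pco.ms": 993.8516,
}
# Candidate: HYPOTHETICAL values, invented purely for illustration.
candidate = {
    "io.read_amplification": 1.9,
    "io.tiny_reads": 55,
    "pruning.pruned_ratio": 0.0,
    "decode.vortex.pco.ms": 1010.0,
}

for name, before, after, rel in diff_metrics(baseline, candidate):
    print(f"{name}: {before} -> {after} ({rel:+.1%})")
```

with these inputs only the read amplification change crosses the threshold, which is the kind of format-level signal a flamegraph would not surface directly.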