Profiling Notes (Samply + DuckDB)

This doc captures what we learned while profiling json_extract_columns so we can repeat it without surprises.

Build (symbols)

Use a symbol-rich build for useful stacks:

make reldebug

make release works, but stacks may show raw addresses if symbols are missing.

CPU sampling with Samply (no local server)

To avoid launching Samply's local UI server, use --save-only.

mkdir -p bench/results/samply
samply record --save-only --output bench/results/samply/<case>.json.gz -- \
  uv run python bench/run_benchmarks.py --filter <case>

Example:

samply record --save-only \
  --output bench/results/samply/json_extract_columns-100k-many_patterns.json.gz -- \
  uv run python bench/run_benchmarks.py --filter json_extract_columns/100k/many_patterns
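When recording many cases, it can help to build the command from the case name instead of typing it out. The following is a minimal sketch, not part of the repo: `samply_record_command` is a hypothetical helper, and the rule of replacing `/` with `-` in the output filename is inferred from the example above.

```python
from pathlib import Path


def samply_record_command(case: str, out_dir: str = "bench/results/samply") -> list[str]:
    """Build the `samply record --save-only` argv for one benchmark case."""
    # Slashes in the case name would create nested directories, so flatten
    # them the way the example above does (a/b/c -> a-b-c). This naming rule
    # is an assumption inferred from that example.
    profile = Path(out_dir) / (case.replace("/", "-") + ".json.gz")
    return [
        "samply", "record", "--save-only",
        "--output", profile.as_posix(),
        "--",
        "uv", "run", "python", "bench/run_benchmarks.py", "--filter", case,
    ]


cmd = samply_record_command("json_extract_columns/100k/many_patterns")
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` after creating the output directory.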

Notes:

--save-only prevents starting the local web server.
--no-open only avoids opening the UI; it can still start the server.

Offline symbolization (optional)

If you want symbols available later (even without the original binaries), add:

samply record --save-only --unstable-presymbolicate \
  --output bench/results/samply/<case>.json.gz -- \
  uv run python bench/run_benchmarks.py --filter <case>

This emits a sidecar file next to the profile:

bench/results/samply/<case>.json.syms.json

--unstable-presymbolicate is marked unstable by Samply, but it is useful when you need symbols after moving the profile.
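To check whether a profile still has its sidecar after being moved, the expected path can be derived from the profile path. This is a small sketch based on the naming shown above (`<case>.json.gz` pairs with `<case>.json.syms.json`); `sidecar_path` and `has_sidecar` are hypothetical helpers, and `example` is a placeholder case name.

```python
from pathlib import Path


def sidecar_path(profile: str) -> Path:
    """Return the expected .syms.json path for a .json.gz profile."""
    # with_suffix swaps only the last suffix: x.json.gz -> x.json.syms.json
    return Path(profile).with_suffix(".syms.json")


def has_sidecar(profile: str) -> bool:
    """True if the sidecar file sits next to the profile."""
    return sidecar_path(profile).is_file()


print(sidecar_path("bench/results/samply/example.json.gz").as_posix())
# prints: bench/results/samply/example.json.syms.json
```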

Viewing a saved profile

Start the server without auto-opening a browser:

samply load --no-open bench/results/samply/<case>.json.gz

Then open the Firefox Profiler URL that samply prints. (The local server behind it usually listens on http://127.0.0.1:3000.)

If a .syms.json sidecar exists in the same directory, Samply uses it for symbolization.

Analyzing profiles programmatically

Use bench/analyze_profile.py to extract function timings from profiles. The profile must have been recorded with --unstable-presymbolicate so that the .syms.json sidecar exists.

Basic usage

python3 bench/analyze_profile.py bench/results/samply/<case>.json.gz

Options

| Option | Description |
| --- | --- |
| `--top N` | Show top N functions (default: 30) |
| `--filter STRING` | Filter functions containing STRING (case-insensitive) |
| `--thread NAME` | Analyze specific thread (default: thread with most samples) |
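For orientation, a parser with these options could look like the sketch below. It is an illustration only: the real bench/analyze_profile.py may define its arguments differently, and `build_parser` and `matches` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented options; defaults follow the table above.
    parser = argparse.ArgumentParser(description="Summarize a samply profile")
    parser.add_argument("profile", help="path to a .json.gz profile")
    parser.add_argument("--top", type=int, default=30, metavar="N")
    parser.add_argument("--filter", default=None, metavar="STRING")
    parser.add_argument("--thread", default=None, metavar="NAME")
    return parser


def matches(name: str, needle) -> bool:
    """Case-insensitive substring match; no filter means everything matches."""
    return needle is None or needle.lower() in name.lower()


args = build_parser().parse_args(["profile.json.gz", "--filter", "JSON", "--top", "20"])
print(args.top, args.filter, args.thread)
print(matches("duckdb::JsonGroupMergeApplyPatchInternal", args.filter))
```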

Examples

# Basic analysis - shows all threads, then top functions by self/inclusive time
python3 bench/analyze_profile.py bench/results/samply/json_group_merge.json.gz

# Filter for json-related functions only
python3 bench/analyze_profile.py <profile> --filter json --top 20

# Analyze a specific thread (useful when multiple workers)
python3 bench/analyze_profile.py <profile> --thread python3

# Show more results
python3 bench/analyze_profile.py <profile> --top 50

Output format

The script outputs two sections:

Self time: Time spent directly in each function (excluding callees). Useful for finding CPU-intensive functions.

=== Self time (top 30) ===
 30.3%   2335  duckdb::JsonGroupMergeApplyPatchInternal
 26.2%   2015  duckdb::yyjson_mut_obj_iter_next
 13.6%   1046  _platform_memcmp

Inclusive time: Time spent in each function including all callees. Useful for finding hot call paths.

=== Inclusive time (top 30) ===
 79.0%   6078  duckdb::AggregateFunction::UnaryScatterUpdate
 40.8%   3143  duckdb::JsonGroupMergeApplyPatchInternal
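The difference between the two sections can be sketched with a few hand-written stacks. This is not the script's actual code: the function name and sample data are hypothetical, and stacks are assumed to be listed root frame first.

```python
from collections import Counter


def self_and_inclusive(stacks):
    """Count self and inclusive samples from stacks listed root-first."""
    self_counts, inclusive_counts = Counter(), Counter()
    for stack in stacks:
        if not stack:
            continue
        # Only the innermost (leaf) frame was executing when the sample fired.
        self_counts[stack[-1]] += 1
        # Every distinct function on the stack is charged once per sample,
        # so recursion does not inflate inclusive counts.
        for name in set(stack):
            inclusive_counts[name] += 1
    return self_counts, inclusive_counts


# Hypothetical samples, root frame first.
stacks = [
    ["main", "update", "apply_patch"],
    ["main", "update", "apply_patch", "memcmp"],
    ["main", "update"],
    ["main", "other"],
]
self_c, incl_c = self_and_inclusive(stacks)
total = len(stacks)
for name, n in incl_c.most_common():
    print(f"{100 * n / total:5.1f}%  {n:5d}  {name}")
```

Self percentages sum to 100% across all functions, while inclusive percentages can sum to far more because each sample is charged to every function on its stack.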

Symbol file structure

The .syms.json sidecar (generated by --unstable-presymbolicate) has this layout:

{
  "string_table": ["symbol1", "symbol2", ...],
  "data": [
    {
      "debug_name": "duckdb",
      "symbol_table": [
        {"rva": 8960, "size": 624, "symbol": 2}
      ]
    }
  ]
}

string_table: function names, referenced by index from symbol_table entries
data[].debug_name: library name (e.g., "duckdb", "libc")
data[].symbol_table: maps RVA ranges (rva, size) to indices into string_table
The profile's frameTable.address values are the RVAs to look up in symbol_table
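Resolving an address against this layout amounts to a range lookup. The sketch below uses binary search over one library's symbol_table; it assumes each entry covers the half-open range [rva, rva + size) and that ranges do not overlap, neither of which is confirmed here. `build_index`, `lookup`, and the sample data are hypothetical.

```python
import bisect


def build_index(lib):
    """Sort a library's symbol_table by RVA for binary search."""
    entries = sorted(lib["symbol_table"], key=lambda e: e["rva"])
    starts = [e["rva"] for e in entries]
    return starts, entries


def lookup(syms, index, rva):
    """Return the symbol name covering `rva`, or None if no range matches."""
    starts, entries = index
    i = bisect.bisect_right(starts, rva) - 1  # last entry starting at or before rva
    if i < 0:
        return None
    entry = entries[i]
    if rva >= entry["rva"] + entry["size"]:
        return None  # rva falls in a gap between symbols
    return syms["string_table"][entry["symbol"]]


# Hypothetical sidecar contents following the layout above.
syms = {
    "string_table": ["symbol0", "symbol1", "duckdb::Example"],
    "data": [{"debug_name": "duckdb",
              "symbol_table": [{"rva": 8960, "size": 624, "symbol": 2}]}],
}
index = build_index(syms["data"][0])
print(lookup(syms, index, 9000))  # inside [8960, 9584)
print(lookup(syms, index, 9584))  # one past the end of the range
```

Building the index once per library keeps each lookup logarithmic, which matters when a profile references many distinct frame addresses.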

Troubleshooting

"Error: syms file not found"

Re-record with --unstable-presymbolicate:

samply record --save-only --unstable-presymbolicate --output <file>.json.gz -- <cmd>

Functions showing as <frame:N> or fun_XXXXXX

Symbols not found. Possible causes:

Build without debug symbols (use make reldebug)
System libraries without debug packages
Binary stripped after recording

Attaching to an existing process

On Linux, you can attach by PID:

samply record -p <pid>

On macOS, attaching to a running process requires:

samply setup

(This codesigns the samply binary so it can attach to other processes.)

DuckDB query profiles (not CPU sampling)

To collect DuckDB's JSON query profile:

uv run python bench/run_benchmarks.py --profile --filter <case>

This writes:

bench/results/profiles/<case>/query_profile.json

Benchmark outputs

run_benchmarks.py always writes timing results to:

bench/results/latest.json
