Profiling Notes (Samply + DuckDB)

This doc captures what we learned while profiling json_extract_columns so we can repeat it without surprises.

Build (symbols)

Use a symbol-rich build for useful stacks:

make reldebug

make release works, but stacks may show raw addresses if symbols are missing.

CPU sampling with Samply (no local server)

To avoid launching Samply's local UI server, use --save-only.

mkdir -p bench/results/samply
samply record --save-only --output bench/results/samply/<case>.json.gz -- \
  uv run python bench/run_benchmarks.py --filter <case>

Example:

samply record --save-only \
  --output bench/results/samply/json_extract_columns-100k-many_patterns.json.gz -- \
  uv run python bench/run_benchmarks.py --filter json_extract_columns/100k/many_patterns

Notes:

  • --save-only prevents starting the local web server.
  • --no-open only avoids opening the UI; it can still start the server.

Offline symbolization (optional)

If you want symbols available later (even without the original binaries), add --unstable-presymbolicate:

samply record --save-only --unstable-presymbolicate \
  --output bench/results/samply/<case>.json.gz -- \
  uv run python bench/run_benchmarks.py --filter <case>

This emits a sidecar file next to the profile:

bench/results/samply/<case>.json.syms.json

--unstable-presymbolicate is marked unstable by Samply, but it is useful when you need symbols after moving the profile.
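The naming pattern above (the profile's .gz suffix replaced by .syms.json) can be sketched as a small helper. The function name sidecar_path is hypothetical, and the sketch assumes the profile path ends in .json.gz as in the commands in this doc:

```python
from pathlib import Path


def sidecar_path(profile_path: str) -> Path:
    """Return the expected .syms.json path for a <case>.json.gz profile.

    Hypothetical helper: it only mirrors the naming shown in this doc.
    """
    p = Path(profile_path)
    if p.suffix != ".gz":
        raise ValueError(f"expected a .json.gz profile, got: {profile_path}")
    # with_suffix swaps only the final suffix (".gz"), so
    # "<case>.json.gz" becomes "<case>.json.syms.json".
    return p.with_suffix(".syms.json")
```

A check like sidecar_path(profile).exists() before analysis gives an early hint that the profile was recorded without --unstable-presymbolicate.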

Viewing a saved profile

Start the server without auto-opening a browser:

samply load --no-open bench/results/samply/<case>.json.gz

Then open http://127.0.0.1:3000 manually (or the Firefox Profiler URL printed by samply).

If a .syms.json sidecar exists in the same directory, Samply uses it for symbolization.

Analyzing profiles programmatically

Use bench/analyze_profile.py to extract function timings from a profile. The script needs the .syms.json sidecar, so record the profile with --unstable-presymbolicate.

Basic usage

python3 bench/analyze_profile.py bench/results/samply/<case>.json.gz

Options

| Option | Description |
| --- | --- |
| --top N | Show top N functions (default: 30) |
| --filter STRING | Filter functions containing STRING (case-insensitive) |
| --thread NAME | Analyze specific thread (default: thread with most samples) |
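As a rough illustration of the --thread default (the thread with the most samples), here is a minimal sketch of loading a saved profile and picking that thread. It is not the code in bench/analyze_profile.py. The function names are hypothetical, and it assumes each entry in the profile's threads list has a samples table with a length field, as in the Firefox Profiler processed format:

```python
import gzip
import json


def load_profile(path):
    """Load a gzip-compressed JSON profile into a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def busiest_thread(profile):
    """Return the thread dict that recorded the most samples.

    Assumption: every thread has a "samples" table with a "length" field.
    """
    return max(profile["threads"], key=lambda t: t["samples"]["length"])
```

With these helpers, busiest_thread(load_profile(path))["name"] names the thread that a default run would most likely analyze.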

Examples

# Basic analysis - shows all threads, then top functions by self/inclusive time
python3 bench/analyze_profile.py bench/results/samply/json_group_merge.json.gz

# Filter for json-related functions only
python3 bench/analyze_profile.py <profile> --filter json --top 20

# Analyze a specific thread (useful when there are multiple workers)
python3 bench/analyze_profile.py <profile> --thread python3

# Show more results
python3 bench/analyze_profile.py <profile> --top 50

Output format

The script outputs two sections:

Self time: Time spent directly in each function (excluding callees). Useful for finding CPU-intensive functions.

=== Self time (top 30) ===
 30.3%   2335  duckdb::JsonGroupMergeApplyPatchInternal
 26.2%   2015  duckdb::yyjson_mut_obj_iter_next
 13.6%   1046  _platform_memcmp

Inclusive time: Time spent in each function including all callees. Useful for finding hot call paths.

=== Inclusive time (top 30) ===
 79.0%   6078  duckdb::AggregateFunction::UnaryScatterUpdate
 40.8%   3143  duckdb::JsonGroupMergeApplyPatchInternal
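The difference between the two sections can be made concrete with a small sketch. It assumes each sample has already been expanded into a list of function names ordered from outermost caller to leaf. That input shape, the function name, and the example stacks are all illustrative, not the script's actual data model:

```python
from collections import Counter


def self_and_inclusive(stacks):
    """Count self and inclusive samples per function.

    `stacks` holds one call stack per sample, ordered from the
    outermost caller to the innermost (leaf) function.
    """
    self_counts = Counter()
    inclusive_counts = Counter()
    for stack in stacks:
        if not stack:
            continue
        # Self time: only the leaf frame was executing when sampled.
        self_counts[stack[-1]] += 1
        # Inclusive time: every function on the stack gets credit,
        # counted once per sample even if it appears recursively.
        for func in set(stack):
            inclusive_counts[func] += 1
    return self_counts, inclusive_counts


# Made-up stacks for illustration only.
stacks = [
    ["main", "update", "apply_patch"],
    ["main", "update", "apply_patch", "memcmp"],
    ["main", "update"],
]
self_counts, inclusive_counts = self_and_inclusive(stacks)
```

Here apply_patch has one self sample but two inclusive samples, because one of its samples was spent inside memcmp. That is the same pattern as JsonGroupMergeApplyPatchInternal showing a higher inclusive percentage than self percentage in the output above.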

Symbol file structure

The .syms.json sidecar (generated by --unstable-presymbolicate):

{
  "string_table": ["symbol1", "symbol2", ...],
  "data": [
    {
      "debug_name": "duckdb",
      "symbol_table": [
        {"rva": 8960, "size": 624, "symbol": 2}
      ]
    }
  ]
}
  • string_table: function names indexed by symbol_table entries
  • data[].debug_name: library name (e.g., "duckdb", "libc")
  • data[].symbol_table: maps RVA ranges to symbol indices
  • The profile's frameTable.address values are RVAs; look each one up in the symbol_table of the matching library
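The lookup described by the fields above can be sketched as follows. This is an illustration under assumptions, not the code in bench/analyze_profile.py: the function names are hypothetical, and each symbol is assumed to cover the half-open range [rva, rva + size):

```python
import bisect


def build_lookup(lib):
    """Return (rva, size, symbol_index) tuples for one data[] entry, sorted by rva."""
    return sorted((s["rva"], s["size"], s["symbol"]) for s in lib["symbol_table"])


def resolve(syms, debug_name, rva):
    """Map an RVA in the named library to a function name, or None if unmapped."""
    for lib in syms["data"]:
        if lib["debug_name"] != debug_name:
            continue
        table = build_lookup(lib)
        starts = [row[0] for row in table]
        # Find the last symbol that starts at or before the RVA.
        i = bisect.bisect_right(starts, rva) - 1
        if i >= 0:
            start, size, sym = table[i]
            # Assumption: a symbol covers [start, start + size).
            if start <= rva < start + size:
                return syms["string_table"][sym]
    return None


# Tiny sidecar in the shape shown above; the symbol names are placeholders.
syms = {
    "string_table": ["symbol0", "symbol1", "symbol2"],
    "data": [
        {
            "debug_name": "duckdb",
            "symbol_table": [{"rva": 8960, "size": 624, "symbol": 2}],
        }
    ],
}
```

For repeated lookups it would be cheaper to build the sorted table once per library instead of on every call. The sketch rebuilds it each time to stay short.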

Troubleshooting

"Error: syms file not found"

Re-record with --unstable-presymbolicate:

samply record --save-only --unstable-presymbolicate --output <file>.json.gz -- <cmd>

Functions showing as <frame:N> or fun_XXXXXX

The symbols could not be found. Possible causes:

  • Build without debug symbols (use make reldebug)
  • System libraries without debug packages
  • Binary stripped after recording

Attaching to an existing process

On Linux, you can attach by PID:

samply record -p <pid>

On macOS, attaching to a running process requires a one-time setup step first:

samply setup

(This codesigns the samply binary so that it can attach to running processes.)

DuckDB query profiles (not CPU sampling)

To collect DuckDB's JSON query profile:

uv run python bench/run_benchmarks.py --profile --filter <case>

This writes:

bench/results/profiles/<case>/query_profile.json

Benchmark outputs

run_benchmarks.py always writes timing results to:

bench/results/latest.json