Instrumentation Modes

FindHao edited this page Aug 31, 2025 · 5 revisions

opcode_only (lightweight) 🪶

  • Emits: opcode ID, warp ID, PC, kernel_launch_id, CTA IDs
  • Use for: Proton instruction histogram; lowest overhead
  • May be auto-enabled by analyses (see below)

reg_trace (medium) 🔬

  • Emits: per-thread register values (plus unified registers), opcode ID, PC
  • Use for: register value tracing and dataflow inspection

mem_trace (heavy) 🧠

  • Emits: the memory address accessed by each of the 32 lanes in a warp, for every instruction that references memory
  • Use for: memory access pattern analysis
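As an illustration of the kind of access-pattern analysis these per-lane addresses allow, the sketch below counts how many cache lines one warp-level access touches and checks for a uniform stride between lanes. It is not part of CUTracer: the function names, the 128-byte line size, and the assumption that the addresses arrive as a plain list of integers for the active lanes are all hypothetical, and the actual trace format may differ.

```python
# Hypothetical post-processing sketch; not CUTracer's API or trace format.
# Input: the addresses of the active lanes of one warp-level memory access.

def cache_lines_touched(addresses, line_bytes=128):
    """Count distinct cache lines covered by the lane addresses.

    line_bytes=128 is an assumed line size; adjust it for the target GPU.
    """
    return len({addr // line_bytes for addr in addresses})


def lane_stride(addresses):
    """Return the constant byte stride between consecutive lanes, or None."""
    strides = {b - a for a, b in zip(addresses, addresses[1:])}
    return strides.pop() if len(strides) == 1 else None


# Example inputs: 32 lanes reading consecutive 4-byte words,
# and 32 lanes reading one word from each of 32 different lines.
coalesced = [0x1000 + 4 * lane for lane in range(32)]
strided = [0x1000 + 128 * lane for lane in range(32)]
```

A fully coalesced access of this shape falls within a single line, while the strided one spreads across 32 lines, which is the sort of difference a memory access pattern analysis would typically flag.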

Combining modes ➕

  • CUTRACER_INSTRUMENT accepts a comma-separated list of modes (for example, reg_trace,mem_trace). Analyses may also implicitly enable the modes they require:
  • proton_instr_histogram auto-enables opcode_only.
  • deadlock_detection auto-enables reg_trace.
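The rules above can be sketched as a small helper that builds the smallest CUTRACER_INSTRUMENT value for a given set of modes and analyses. The mode and analysis names come from this page; the helper itself (`instrument_value`, `AUTO_ENABLED`, `KNOWN_MODES`) is hypothetical and only illustrates the logic. How analyses are selected is not described here, so the sketch takes their names as a plain argument.

```python
import os

# Modes documented on this page, from lightest to heaviest.
KNOWN_MODES = ("opcode_only", "reg_trace", "mem_trace")

# Analysis -> mode it enables implicitly (from the list above).
AUTO_ENABLED = {
    "proton_instr_histogram": "opcode_only",
    "deadlock_detection": "reg_trace",
}


def instrument_value(wanted_modes, analyses=()):
    """Build a comma-separated CUTRACER_INSTRUMENT value.

    Modes already implied by the requested analyses are left out,
    since they do not need to be repeated explicitly.
    """
    unknown = set(wanted_modes) - set(KNOWN_MODES)
    if unknown:
        raise ValueError(f"unknown instrumentation modes: {sorted(unknown)}")
    implied = {AUTO_ENABLED[a] for a in analyses if a in AUTO_ENABLED}
    explicit = [m for m in KNOWN_MODES if m in wanted_modes and m not in implied]
    return ",".join(explicit)


# Environment for a traced run; only CUTRACER_INSTRUMENT is set here.
env = dict(os.environ)
env["CUTRACER_INSTRUMENT"] = instrument_value(
    {"reg_trace", "mem_trace"}, analyses=["deadlock_detection"]
)
```

In this example reg_trace is dropped from the explicit list because deadlock_detection already enables it, leaving only mem_trace in the variable.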

Notes 📝

  • When an analysis auto-enables a mode, you do not need to repeat it in CUTRACER_INSTRUMENT.
  • Enabling additional modes increases overhead and output volume; prefer the minimal set that satisfies your analysis.
