[Feature] Add volatile_load primitive for spin-wait patterns #648

@alanray-tech

Summary

Quadrants has no volatile load or atomic load primitive. This makes it impossible to correctly and efficiently implement spin-wait patterns (e.g. decoupled-look-back scans) that require repeatedly reading a memory location until it changes.

Problem

A spin-wait loop like:

while flags[prev] == STATE_INVALID:
    pass

relies on the compiler re-reading flags[prev] from global memory on each iteration. Without a volatile load, the compiler may hoist the load out of the loop or cache the value in a register, turning the spin into an infinite loop.
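The failure mode can be modelled on the CPU with plain Python. This is only an illustrative sketch, not Quadrants code: `FlipAfterReads`, `spin_rereading`, and `spin_hoisted` are hypothetical names, and the "other thread" is simulated by a cell that changes value after a fixed number of reads. The loops are bounded so the hoisted variant terminates instead of hanging.

```python
STATE_INVALID = 0
STATE_READY = 1


class FlipAfterReads:
    """Simulated memory cell that a writer flips to STATE_READY after N reads."""

    def __init__(self, flips_after):
        self.reads = 0
        self.flips_after = flips_after

    def load(self):
        # Every call models one real load from global memory.
        self.reads += 1
        return STATE_READY if self.reads > self.flips_after else STATE_INVALID


def spin_rereading(cell, max_iters):
    """Intended behaviour: one load from memory per loop iteration."""
    for _ in range(max_iters):
        if cell.load() != STATE_INVALID:
            return True  # the write became visible
    return False


def spin_hoisted(cell, max_iters):
    """Models the optimized loop: the load is hoisted and kept in a register."""
    cached = cell.load()  # single load, performed before the loop
    for _ in range(max_iters):
        if cached != STATE_INVALID:
            return True
    return False  # the later write is never observed


observed_with_reload = spin_rereading(FlipAfterReads(5), max_iters=100)
observed_when_hoisted = spin_hoisted(FlipAfterReads(5), max_iters=100)
```

Here `observed_with_reload` is `True` and `observed_when_hoisted` is `False`: the hoisted loop only ever sees the first value it read, which is the behaviour a volatile load is meant to rule out.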

Current workarounds are all suboptimal:

| Workaround | Correct? | Performance |
| --- | --- | --- |
| grid.memfence() inside the loop | Yes (acts as compiler barrier) | Bad: full device-scope cache drain per iteration |
| atomic_add(flags[prev], 0) | Yes (forces memory round-trip) | Bad: read-modify-write overhead, contention |
| Do nothing and hope the compiler doesn't optimize | Fragile | N/A |

Proposed solution

Add a qd.volatile_load(target, *indices) primitive (or equivalent) that guarantees the load goes to memory on every call. The implementation maps cleanly to existing backend primitives:

  • LLVM IR (CUDA / AMDGPU): emit load volatile instead of load — LLVM guarantees it cannot be eliminated, merged, or hoisted. On CUDA, LLVM lowers this to ld.volatile.global in PTX.
  • SPIR-V (Vulkan / Metal): emit OpLoad with the Volatile Memory Access bit (0x1) — prevents expression forwarding and value caching.

No new hardware capability is needed; every backend already supports this at the instruction level.
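To make the mapping above concrete, here is a minimal sketch that prints the textual form of each instruction with and without the volatile qualifier. The helpers `llvm_load` and `spirv_load` are hypothetical and exist only for illustration; they are not part of Quadrants' code generator. The sketch also assumes LLVM's opaque-pointer syntax, a global-memory pointer in address space 1, and a 4-byte aligned 32-bit value.

```python
# Value of the Volatile bit in SPIR-V's Memory Operands mask (as stated above).
SPIRV_MEMORY_ACCESS_VOLATILE = 0x1


def llvm_load(result, ty, ptr, volatile=False, align=4):
    """Return LLVM IR text for a load from address space 1 (assumed: global)."""
    qualifier = "volatile " if volatile else ""
    return f"%{result} = load {qualifier}{ty}, ptr addrspace(1) %{ptr}, align {align}"


def spirv_load(result, ty, ptr, volatile=False):
    """Return SPIR-V assembly text for OpLoad, optionally with the Volatile operand."""
    operand = " Volatile" if volatile else ""
    return f"%{result} = OpLoad %{ty} %{ptr}{operand}"


plain_ir = llvm_load("flag", "i32", "addr")
volatile_ir = llvm_load("flag", "i32", "addr", volatile=True)
volatile_spv = spirv_load("flag", "uint", "addr", volatile=True)

print(plain_ir)      # ordinary load the optimizer is free to hoist or merge
print(volatile_ir)   # same load with the volatile qualifier
print(volatile_spv)  # SPIR-V counterpart
```

The only difference between the two LLVM forms is the `volatile` keyword, which suggests the change in each backend could be limited to a flag on the existing load path.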

Use cases

  • Decoupled-look-back scans (Onesweep-style) — spin on a flag array
  • Producer-consumer patterns between blocks via global memory
  • Any kernel that polls a global-memory location written by another thread/block
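As an illustration of the first use case, the following is a CPU-only model of a decoupled-look-back scan in which Python threads stand in for GPU blocks. It is a sketch under stated assumptions, not the lookback_scan example from the docs: `volatile_load` here is a pure-Python stand-in that mirrors the proposed `qd.volatile_load(target, *indices)` signature, and `lookback_scan`, `STATE_AGGREGATE`, and `STATE_PREFIX` are hypothetical names. CPython re-evaluates every read, so the stand-in is trivially correct; on a GPU the same spin would depend on the real primitive.

```python
import threading
import time

STATE_INVALID, STATE_AGGREGATE, STATE_PREFIX = 0, 1, 2


def volatile_load(target, *indices):
    """Hypothetical stand-in for the proposed qd.volatile_load(target, *indices)."""
    for index in indices:
        target = target[index]
    return target


def lookback_scan(tiles):
    """Return the inclusive prefix sum of per-tile totals, one thread per tile."""
    n = len(tiles)
    flags = [STATE_INVALID] * n
    aggregates = [0] * n
    prefixes = [0] * n

    def block(i):
        local = sum(tiles[i])
        if i == 0:
            prefixes[0] = local          # publish the value first ...
            flags[0] = STATE_PREFIX      # ... then the flag that guards it
            return
        aggregates[i] = local
        flags[i] = STATE_AGGREGATE
        exclusive = 0
        prev = i - 1
        while True:
            # Spin-wait: every iteration must re-read the flag from memory.
            state = volatile_load(flags, prev)
            while state == STATE_INVALID:
                time.sleep(0)            # yield so the producer thread can run
                state = volatile_load(flags, prev)
            if state == STATE_PREFIX:
                exclusive += prefixes[prev]
                break
            exclusive += aggregates[prev]
            prev -= 1                    # keep looking back; tile 0 always ends the walk
        prefixes[i] = exclusive + local
        flags[i] = STATE_PREFIX

    threads = [threading.Thread(target=block, args=(i,)) for i in range(n)]
    for t in reversed(threads):          # start later tiles first to force spinning
        t.start()
    for t in threads:
        t.join()
    return prefixes


tile_prefixes = lookback_scan([[1, 2], [3, 4], [5], [6, 7, 8]])
print(tile_prefixes)  # [3, 10, 15, 36]
```

The model writes each value before the flag that guards it. A GPU kernel would additionally need a memory fence (or a packed flag-and-value word) to make that ordering visible to other blocks, which this sketch does not attempt to show.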

Found during

Review of #641 (docs for qd.simt.grid.*), where the lookback_scan example relies on re-reading flags[prev] in a spin loop but has no way to guarantee it.

Labels: enhancement
