Introduction of CUDA Programmatic Dependent Launch to Llama.cpp #15480

agray3 · 2025-08-21T15:38:20Z

Make sure to read the contributing guidelines before submitting a PR

yeahdongcn · 2025-08-22T06:37:36Z

ggml/src/ggml-cuda/acc.cu

@@ -3,6 +3,9 @@
 static __global__ void acc_f32(const float * x, const float * y, float * dst, const int64_t ne,
        const int64_t ne10, const int64_t ne11, const int64_t ne12, const int64_t ne13,
        const int64_t s11, const int64_t s12, const int64_t s13, const int64_t offset) {
+#if !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER


It might be better to define a dedicated macro and use it wherever needed. For example:

#if !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
#define XXX_AVAILABLE
#endif // !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
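To make the suggestion concrete, here is a minimal sketch of how such a macro might be defined once and then consumed by a small helper, so that kernels like acc_f32 would not need to repeat the architecture check. All names (GGML_CUDA_PDL_AVAILABLE, PDL_DEVICE_INLINE, ggml_cuda_pdl_sync, pdl_compiled_in) are hypothetical and not taken from the PR, and the fallback value 900 for GGML_CUDA_CC_HOPPER is an assumption for standalone compilation. The diff excerpt does not show what the guarded code does; the sketch assumes it calls the CUDA device-side function cudaGridDependencySynchronize().

```cpp
// Sketch only. GGML_CUDA_PDL_AVAILABLE, PDL_DEVICE_INLINE, ggml_cuda_pdl_sync
// and pdl_compiled_in are hypothetical names, not taken from the PR.

#ifndef GGML_CUDA_CC_HOPPER
#define GGML_CUDA_CC_HOPPER 900 // assumed value (compute capability 9.0)
#endif

// Evaluate the architecture check in one place. In a host-only pass
// __CUDA_ARCH__ is undefined and evaluates to 0 inside #if, so the macro
// stays undefined there.
#if !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER
#define GGML_CUDA_PDL_AVAILABLE
#endif // !defined(GGML_USE_HIP) && __CUDA_ARCH__ >= GGML_CUDA_CC_HOPPER

// Lets the same file compile with nvcc and with a plain C++ compiler.
#ifdef __CUDACC__
#define PDL_DEVICE_INLINE static __device__ __forceinline__
#else
#define PDL_DEVICE_INLINE static inline
#endif

// Waits for the preceding kernel in the stream when PDL is compiled in;
// compiles to an empty function otherwise.
PDL_DEVICE_INLINE void ggml_cuda_pdl_sync() {
#ifdef GGML_CUDA_PDL_AVAILABLE
    cudaGridDependencySynchronize();
#endif
}

// Records whether the current compilation pass enabled the PDL path.
#ifdef GGML_CUDA_PDL_AVAILABLE
static const bool pdl_compiled_in = true;
#else
static const bool pdl_compiled_in = false;
#endif
```

With a helper like this, a kernel body would call ggml_cuda_pdl_sync() unconditionally, and the HIP and pre-Hopper builds would see a no-op. Whether to wrap the call in a helper or to test the macro directly at each call site is a design choice left to the PR author.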

Introduction of CUDA Programmatic Dependent Launch to Llama.cpp

614dee0

See ggml-org#15479

agray3 mentioned this pull request Aug 21, 2025

NVIDIA Programmatic Dependent Launch for Llama.cpp #15479

Open

4 tasks
