UPSTREAM PR #19433: Add a build target to generate ROCm artifacts using ROCm 7.2 #1160

Open
loci-dev wants to merge 1 commit into main from loci/pr-19433-superm1-rocm-github-action

loci-dev commented Feb 9, 2026

Note

Source pull request: ggml-org/llama.cpp#19433

This builds the following targets:

  • gfx1151
  • gfx1150
  • gfx1200
  • gfx1201
  • gfx1100
  • gfx1101
  • gfx1102
  • gfx1030
  • gfx1031
  • gfx1032

Notably, it doesn't include any CDNA targets, and thus avoids ggml-org/llama.cpp#19269.
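
The "no CDNA targets" claim can be checked mechanically. The sketch below is illustrative only and is not part of the PR: it assumes that gfx908, gfx90a and gfx942 are the CDNA parts of interest, and the helper names `cdna_targets` and `cmake_target_list` are hypothetical. Which CMake variable the workflow passes the list to is not stated in this thread, so the sketch only builds the semicolon-separated value.

```python
# Targets copied from the PR description above.
PR_TARGETS = [
    "gfx1151", "gfx1150", "gfx1200", "gfx1201", "gfx1100",
    "gfx1101", "gfx1102", "gfx1030", "gfx1031", "gfx1032",
]

# Assumption: these are the CDNA (data-center) architectures of interest.
CDNA_TARGETS = {"gfx908", "gfx90a", "gfx942"}


def cdna_targets(targets):
    """Return the targets that fall in the assumed CDNA set, in input order."""
    return [t for t in targets if t in CDNA_TARGETS]


def cmake_target_list(targets):
    """Join targets with ';', the separator CMake uses for list values."""
    return ";".join(targets)


if __name__ == "__main__":
    found = cdna_targets(PR_TARGETS)
    print("CDNA targets:", found if found else "none")
    print("target list:", cmake_target_list(PR_TARGETS))
```

Running the same check against any later revision of the target list shows whether the CDNA exclusion still holds.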

loci-review bot commented Feb 9, 2026

No meaningful performance changes were detected across 115630 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libllama.so, build.bin.libmtmd.so, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.libggml-base.so, build.bin.llama-bench, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-tokenize, build.bin.llama-qwen2vl-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@noamshiovits-Aurora

@loci-dev how many functions were modified in this PR, and how many files were edited or added?

loci-review bot commented Feb 9, 2026

Based on the repository analysis:

PR Statistics

Files Changed

  • 110 files modified
  • 6 files added
  • 2 files deleted
  • Total: 118 files changed
  • 43 commits

Functions Analyzed

According to the performance review, 115,630 functions were analyzed across all binaries, with no meaningful performance changes detected in any of them.

This is a substantial PR focused on:

  • GPU backend optimizations (Metal, Vulkan, CUDA, SYCL, WebGPU)
  • New model architecture support (Step3.5-Flash, Kimi-Linear, Qwen3next)
  • Core infrastructure refactoring (sampling, batch processing)
  • Testing improvements and dependency updates

That 115K+ functions were analyzed without any meaningful performance regression is a positive signal: despite the extensive changes across 118 files, performance remained stable. The changes appear to focus on correctness, feature additions, and code quality rather than on performance-sensitive code paths.

loci-dev force-pushed the main branch 3 times, most recently from ef7afbe to d4c3480 on February 14, 2026 02:16
loci-dev force-pushed the loci/pr-19433-superm1-rocm-github-action branch from 8aeb553 to d961293 on February 14, 2026 02:16

loci-review bot commented Feb 14, 2026

No meaningful performance changes were detected across 115001 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.libggml-cpu.so, build.bin.libmtmd.so, build.bin.libllama.so, build.bin.libggml-base.so, build.bin.llama-bench, build.bin.llama-quantize, build.bin.llama-tokenize, build.bin.llama-gguf-split, build.bin.libggml.so, build.bin.llama-qwen2vl-cli, build.bin.llama-gemma3-cli, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

loci-dev force-pushed the main branch 6 times, most recently from 073bd79 to 823244c on February 18, 2026 02:17
This builds the following targets:
 * gfx1151
 * gfx1150
 * gfx1200
 * gfx1201
 * gfx1100
 * gfx1101
 * gfx908
 * gfx90a
 * gfx942
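
The target list directly above differs from the one in the PR description. A minimal sketch for comparing the two lists follows; the helper name `diff_targets` is hypothetical, and which list reflects the final state of the upstream PR is not stated in this thread.

```python
# Target list from the PR description.
DESCRIPTION_TARGETS = [
    "gfx1151", "gfx1150", "gfx1200", "gfx1201", "gfx1100",
    "gfx1101", "gfx1102", "gfx1030", "gfx1031", "gfx1032",
]

# Target list that appears later in the timeline.
LATER_TARGETS = [
    "gfx1151", "gfx1150", "gfx1200", "gfx1201", "gfx1100",
    "gfx1101", "gfx908", "gfx90a", "gfx942",
]


def diff_targets(old, new):
    """Return (added, removed) between two target lists, preserving order."""
    old_set, new_set = set(old), set(new)
    added = [t for t in new if t not in old_set]
    removed = [t for t in old if t not in new_set]
    return added, removed


if __name__ == "__main__":
    added, removed = diff_targets(DESCRIPTION_TARGETS, LATER_TARGETS)
    print("added:  ", added)    # ['gfx908', 'gfx90a', 'gfx942']
    print("removed:", removed)  # ['gfx1102', 'gfx1030', 'gfx1031', 'gfx1032']
```

The later list adds gfx908, gfx90a and gfx942 and drops gfx1102, gfx1030, gfx1031 and gfx1032, so the description's statement about excluding CDNA targets applies only to the earlier list.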
loci-dev force-pushed the loci/pr-19433-superm1-rocm-github-action branch from d961293 to 4a1f236 on February 19, 2026 03:06

loci-review bot commented Feb 19, 2026

No meaningful performance changes were detected across 111508 analyzed functions in the following binaries: build.bin.llama-cvector-generator, build.bin.libllama.so, build.bin.libmtmd.so, build.bin.llama-tts, build.bin.llama-tokenize, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-bench, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

loci-dev force-pushed the main branch 2 times, most recently from 10f8f26 to a6ecec6 on February 20, 2026 02:17

3 participants

Comments