[hotswap] Fixes to use projects/hotswap libhsa_hotswap.so and resolve header paths #7629
Merged
The ROCr v1 tool-load carve-out (runtime.cpp) and the CLR hotswap enable-gate (clr/rocclr/device/hotswap.hpp) matched the old comgr-hosted tool name "libamd_comgr_hotswap_tool.so". After the HSA tool moved to rocm-systems' libhsa-hotswap.so, the name no longer matches: on a rocprofiler-register (v3) box allow_v1_registration stays false, so ROCr never invokes the tool's OnLoad and no rewrite is applied, with no error or warning to indicate it. Re-key both constants to "libhsa-hotswap.so".
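To illustrate why a stale name constant fails silently, here is a minimal sketch of a name-keyed gate. The constant name `kHotswapToolName`, the helpers `Basename` and `IsHotswapTool`, and the exact-basename matching rule are all assumptions made for illustration; the source only states that both constants are re-keyed to "libhsa-hotswap.so", not how the comparison is written.

```cpp
#include <string_view>

// Hypothetical constant name; only the value comes from the PR text.
inline constexpr std::string_view kHotswapToolName = "libhsa-hotswap.so";

// Returns the file-name component of a path (the text after the last '/').
constexpr std::string_view Basename(std::string_view path) {
  const auto slash = path.rfind('/');
  return slash == std::string_view::npos ? path : path.substr(slash + 1);
}

// Assumed matching rule: exact comparison of the basename. A tool whose
// file name differs in any character is simply not recognized, and nothing
// is reported, which is how a renamed library can be skipped unnoticed.
constexpr bool IsHotswapTool(std::string_view tool_path) {
  return Basename(tool_path) == kHotswapToolName;
}
```

With this rule, `IsHotswapTool("/opt/rocm/lib/libhsa-hotswap.so")` is true, while the old name "libamd_comgr_hotswap_tool.so" is rejected without any diagnostic.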
Installed comgr ships its public header at include/amd_comgr/amd_comgr.h and its CMake package exposes the include root (-Iinclude), so the flat #include <amd_comgr.h> does not resolve when building against an installed comgr (the path TheRock builds the tool against). Use "amd_comgr/amd_comgr.h" to match every other comgr consumer in rocm-systems (clr comgrctx.hpp/devkernel.hpp/devprogram.hpp, rocdbgapi, rocprofiler-sdk).
Fold in the install rule and HSA runtime linkage previously provided by ROCm#7577 so this branch no longer depends on it: add GNUInstallDirs, find_package(hsa-runtime64), link hsa-runtime64::hsa-runtime64, and install(TARGETS hsa-hotswap).
b27da72 to
8fa66e4
harsh-amd
approved these changes
Jun 23, 2026
b011b8f to
f4571ca
harsh-amd
reviewed
Jun 23, 2026
harsh-amd
reviewed
Jun 23, 2026
1406d54 to
e6b00f2
The hand-rolled metadata-note parser (find_isa_in_metadata) over-read the msgpack-encoded ISA string, spilling past the gfx target into the next note key, so COMGR rejected the malformed source ISA (rc=2) and every rewrite fell back to the original code object. Read the source ISA from the code object via COMGR's amd_comgr_get_data_isa_name instead. It uses LLVM's canonical parser, so it is correct and tracks triple normalization automatically, and it removes the fragile hand-rolled ELF/note parsing. The target ISA stays the running GPU's ISA (from the HSA agent), so amd_comgr_hotswap_rewrite still receives source (code object) and target (GPU) as designed. Also drops the now-dead note-parsing helpers and updates the unit test.
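The source does not show find_isa_in_metadata, so the following is only a sketch of the general class of bug: msgpack strings carry a length prefix, and a reader that ignores it runs on into the next key. The helper names `ReadMsgpackStr` and `SampleMetadata`, and the choice of "amdhsa.version" as the following key, are illustrative assumptions. The PR itself avoids this parsing entirely by calling amd_comgr_get_data_isa_name.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Decodes one msgpack string starting at `pos`, honoring its length prefix.
// Only fixstr (0xa0..0xbf) and str8 (0xd9) are handled in this sketch.
std::optional<std::string> ReadMsgpackStr(std::string_view buf, std::size_t pos) {
  if (pos >= buf.size()) return std::nullopt;
  const auto tag = static_cast<std::uint8_t>(buf[pos]);
  std::size_t len = 0;
  std::size_t start = 0;
  if ((tag & 0xe0) == 0xa0) {          // fixstr: low 5 bits hold the length
    len = tag & 0x1f;
    start = pos + 1;
  } else if (tag == 0xd9) {            // str8: next byte holds the length
    if (pos + 1 >= buf.size()) return std::nullopt;
    len = static_cast<std::uint8_t>(buf[pos + 1]);
    start = pos + 2;
  } else {
    return std::nullopt;               // not a string encoding we handle
  }
  if (start + len > buf.size()) return std::nullopt;  // truncated input
  return std::string(buf.substr(start, len));
}

// Illustrative buffer: an ISA string immediately followed by another key.
std::string SampleMetadata() {
  std::string buf;
  buf.push_back(static_cast<char>(0xba));  // fixstr tag, length 26
  buf += "amdgcn-amd-amdhsa--gfx1250";
  buf.push_back(static_cast<char>(0xae));  // fixstr tag, length 14
  buf += "amdhsa.version";
  return buf;
}
```

Reading at offset 0 stops after the 26-byte ISA name; a reader that scanned for a terminator instead would continue into the tag byte and text of the next key.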
harsh-amd
reviewed
Jun 23, 2026
harsh-amd
reviewed
Jun 23, 2026
harsh-amd
reviewed
Jun 23, 2026
Move code-object ISA discovery into GetCodeObjectIsaName (still via amd_comgr_get_data_isa_name) and have the HSA tool derive source_isa from the code object and target_isa from the agent, passing both into RetargetCodeObject. This keeps RetargetCodeObject a pure rewrite primitive whose source/target can be overridden (e.g. cross-gen).
harsh-amd
reviewed
Jun 23, 2026
Add a minimal gfx1250 fixture (tests/fixtures/gfx1250_min.hsaco), embedded into the test binary at build time, plus tests exercising GetCodeObjectIsaName and RetargetCodeObject end-to-end. GPU-free and file-free, so it runs on any CI runner.
f524b17 to
a13f50e
dayatsin-amd
approved these changes
Jun 23, 2026
This was referenced Jun 24, 2026
OnUnload is driven by libamdhip64's atexit handler (hsa_shut_down -> UnloadTools), which runs AFTER this library's C++ static destructors because the tool is dlopen'd by hsa_init (its __cxa_atexit dtors are LIFO-earlier). With ordinary file-scope statics, the reader-map / rewritten-ELF containers and their mutexes were destroyed first, then OnUnload's clear()/lock touched freed memory -> SIGSEGV during process exit (confirmed on gfx1250 A0: ~unique_ptr calling a tcache-corrupted deleter). Make these four objects never-destroyed heap singletons so their lifetime spans past static destruction. OnUnload still clear()s the containers and frees every malloc'd ELF buffer; only the small container shells persist to exit. Not a payload leak.
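The never-destroyed heap singleton pattern described in this commit can be sketched as follows. The type alias `ReaderMap` and the functions `GetReaderMap`, `GetReaderMutex`, and `ClearReaders` are hypothetical stand-ins; the actual container types and names in the tool are not shown in the source.

```cpp
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the tool's reader map.
using ReaderMap = std::unordered_map<std::uint64_t, std::vector<char>>;

// Allocated on first use and intentionally never deleted, so no static
// destructor runs for it and it stays valid for late teardown callers.
ReaderMap& GetReaderMap() {
  static auto* map = new ReaderMap();
  return *map;
}

// The mutex gets the same treatment so that locking it late is also safe.
std::mutex& GetReaderMutex() {
  static auto* mu = new std::mutex();
  return *mu;
}

// Teardown hook: releases the stored payload. Only the empty container
// shell and the mutex persist until the process exits.
void ClearReaders() {
  std::lock_guard<std::mutex> lock(GetReaderMutex());
  GetReaderMap().clear();
}
```

The trade-off is that the container shells are never freed, which is why the commit message notes that this is not a payload leak: the contents are still cleared explicitly.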
Add a runtime-gated HOTSWAP_LOG(...) macro (driven by HSA_HOTSWAP_VERBOSE, off by default) that traces each intercepted reader-create, load_agent_code_object (with has_bytes), the gate decision, and each rewrite (src/tgt ISA, rc, in/out size, changed). No rebuild needed to toggle; when off the cost is one cached bool load per site. Used to confirm on real gfx1250 A0 that torch/hipBLASLt kernels are intercepted and rewritten.
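A runtime-gated logging macro of this kind could look roughly like the sketch below. Only the macro name HOTSWAP_LOG and the environment variable HSA_HOTSWAP_VERBOSE come from the commit message; the helpers `ParseVerboseFlag` and `HotswapVerbose`, the rule for which values enable logging, the "[hotswap]" prefix, and the use of stderr are assumptions.

```cpp
#include <cstdio>
#include <cstdlib>

// Assumed parsing rule: unset, empty, or a value starting with '0' means off.
inline bool ParseVerboseFlag(const char* value) {
  return value != nullptr && value[0] != '\0' && value[0] != '0';
}

// Reads the environment once; later calls only load the cached bool.
inline bool HotswapVerbose() {
  static const bool enabled = ParseVerboseFlag(std::getenv("HSA_HOTSWAP_VERBOSE"));
  return enabled;
}

// printf-style trace macro; expands to a cheap branch when logging is off.
#define HOTSWAP_LOG(...)                     \
  do {                                       \
    if (HotswapVerbose()) {                  \
      std::fprintf(stderr, "[hotswap] ");    \
      std::fprintf(stderr, __VA_ARGS__);     \
      std::fputc('\n', stderr);              \
    }                                        \
  } while (0)
```

A call site would then read, for example, `HOTSWAP_LOG("rewrite rc=%d in=%zu out=%zu", rc, in_size, out_size);`, where the variable names are placeholders.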
…etons) The OnUnload use-after-free is better avoided by NOT touching heap-owning state at teardown than by making that state immortal. OnUnload now only restores the HSA API-table function pointers (POD writes); it no longer locks the mutexes or clear()s g_reader_map / g_rewritten_elfs. Those revert to ordinary file-scope statics, freed once by their normal static destructors at process exit. Mirrors the original comgr hotswap tool (llvm-project ROCm#2936), whose OnUnload likewise only restored the table and never crashed at teardown. Safe because ROCr dereferences the retained ELF bytes during the executable's life (debugger/profiler queries), not during teardown. Supersedes the immortal-singleton approach from 56ed59c.
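The "restore pointers only" teardown can be sketched with a miniature, hypothetical API table. `ApiTable`, `LoadFn`, `RealLoad`, `Intercept`, `SketchOnLoad`, and `SketchOnUnload` are invented for illustration and do not reflect the real HSA API-table layout or the tool's actual OnLoad/OnUnload signatures.

```cpp
// Hypothetical one-entry API table; the real HSA table is far larger.
using LoadFn = int (*)(int);
struct ApiTable {
  LoadFn load;
};

// Stand-in for the runtime's original entry point.
int RealLoad(int value) { return value; }

// Plain pointers: trivially destructible, so touching them during process
// teardown cannot hit an already-destroyed object.
static ApiTable* g_table = nullptr;
static LoadFn g_saved_load = nullptr;

// Interceptor that forwards to the saved original and marks the result.
int Intercept(int value) { return g_saved_load(value) + 1000; }

// Load hook: remember the original entry and install the interceptor.
void SketchOnLoad(ApiTable* table) {
  g_table = table;
  g_saved_load = table->load;
  table->load = &Intercept;
}

// Unload hook: only writes the saved pointer back. It takes no locks and
// clears no containers, so it does not depend on static-destruction order.
void SketchOnUnload() {
  if (g_table != nullptr && g_saved_load != nullptr) {
    g_table->load = g_saved_load;
  }
}
```

Any heap-owning state the interceptors use is left to its ordinary static destructors, which is the design choice this commit describes.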
gandryey
approved these changes
Jun 25, 2026
Motivation
This PR has two components. The first switches to the tooling bundled in projects/hotswap instead of the deprecated tool hosted in comgr. The second resolves amd_comgr header issues during the build. Additionally, CMakeLists.txt is changed so that the hotswap HSA tool links against hsa-runtime64::hsa-runtime64.
Technical Details
Builds on top of Harsh's PR #7577 (deprecated; the #7577 code is included directly in this PR).
Relevant PRs: amd-llvm: ROCm/llvm-project#3007
JIRA ID
N/A
Test Plan
Self-contained build of the tool (verifies that linking resolves)
Tool-name testing must be done with a full build
Test Result
Ran:
ls -l /tmp/build-hotswap/libhsa-hotswap.so
nm -D /tmp/build-hotswap/libhsa-hotswap.so | grep hotswap_rewrite
Submission Checklist