
Conversation

@JaeseungYeom JaeseungYeom commented Apr 8, 2025

This PR enables building ExaEpi on HIP-enabled (AMD GPU) platforms.

For example, to build on tuolumne.llnl.gov, which has four AMD MI300A APUs (each combining a CDNA3 GPU with 4th-gen EPYC CPU cores) per compute node, do the following:

  • Load the necessary modules, including ROCm, MPI, and a C++ compiler: module load PrgEnv-gnu-amd/8.6.0 rocm/6.3.1hangfix cray-mpich/8.1.32
  • Create and enter a build directory: mkdir build; cd build
  • Find the device architecture: rocminfo | awk '((NF==2) && ($1=="Name:") && ($2 ~ /^gfx/)) {print $2}' | uniq
  • Set an environment variable to the HIP device architecture reported above: export AMD_ARCH=gfx942
  • Run cmake: cmake -DAMReX_GPU_BACKEND=HIP -DAMReX_AMD_ARCH=${AMD_ARCH} -DCMAKE_INSTALL_PREFIX=`realpath ..`/install ..
  • Compile: make -j 16
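The architecture-detection step above can be sanity-checked offline by feeding the same awk filter some sample rocminfo-style output (the sample lines below are illustrative, not captured from tuolumne):

```shell
# Verify that the awk filter from the build steps extracts only the gfx
# target names from rocminfo-style "Name:" lines. Sample text is illustrative.
sample='Name:                    AMD EPYC Processor
Name:                    gfx942
Name:                    gfx942'

AMD_ARCH=$(printf '%s\n' "$sample" \
  | awk '((NF==2) && ($1=="Name:") && ($2 ~ /^gfx/)) {print $2}' \
  | uniq)
echo "$AMD_ARCH"   # on an MI300A node this yields gfx942
```

The CPU line is filtered out because it has more than two fields, and uniq collapses the repeated per-device entries to a single architecture name.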

@debog debog requested review from atmyers and tannguyen153 April 8, 2025 11:22
 params.ic_type = ICType::UrbanPop;
 pp.get("urbanpop_filename", params.urbanpop_filename);
-#ifdef AMREX_USE_CUDA
+#if defined(AMREX_USE_CUDA) || defined(AMREX_USE_HIP)
Collaborator

@JaeseungYeom Can you try #if defined(AMREX_USE_GPU) instead of #if defined(AMREX_USE_CUDA) || defined(AMREX_USE_HIP) here?

Collaborator

I think @stevenhofmeyr set the box size to 500 for NVIDIA GPUs only, hence the use of AMREX_USE_CUDA. If we change this to AMREX_USE_GPU, the value will apply to all other GPUs, including Intel and AMD ones.

Collaborator

@tannguyen153 I think that's how it should be. When using GPUs, whether AMD, Intel, or NVIDIA, the box size should be larger to minimize MPI communication between boxes on the same GPU, right?

Collaborator

Right, I think the box size should be large enough for all GPU backends activated by AMREX_USE_GPU. We can also tune the box size for specific GPUs and enumerate the initial values with AMREX_USE_CUDA, AMREX_USE_HIP, AMREX_USE_SYCL, etc.

Author

I am currently testing with that value. I am getting an OOM error with the value set to 100. I will experiment with it and let you know.

[yeom2@tuolumne1038:bin]$ srun -N 4 -n 16 --exclusive ./agent inputs.ca 
Initializing AMReX (25.04-9-g30a9768150c4)...
MPI initialized with 16 MPI processes
MPI initialized with thread support level 0
Initializing HIP...
HIP initialized with 16 devices.
2130.442s: flux-shell[1]: ERROR: oom: Memory cgroup out of memory: killed 1 task on tuolumne1042.
2130.442s: flux-shell[1]: ERROR: oom: memory.peak = 240.32831G

Collaborator

@JaeseungYeom : I just checked the code again - this is a runtime parameter, agent.max_box_size.
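Since it is a ParmParse runtime parameter, it can be changed without rebuilding, either in the inputs file or (following the usual AMReX convention) as a trailing key=value override on the command line; the node and task counts below are only examples:

```
# in the inputs file
agent.max_box_size = 500

# or appended on the command line (AMReX ParmParse override convention)
srun -N 8 -n 32 --exclusive ./agent inputs.ca agent.max_box_size=500
```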

Collaborator

@tannguyen153 : this is specifically for the UrbanPop code, and it defaults to 100 for CPUs and 500 for GPUs. For the census code, it defaults to 16. It is so much larger for UrbanPop because the underlying grid is lat/lng, so many grid points have no communities, unlike the packed allocation for the census code, where the underlying grid does not correspond to physical lat/lng.

Collaborator

@JaeseungYeom : you're likely getting OOM for smaller box sizes because you'll have too many boxes (most of which will be empty).

Author

I confirm that I can avoid the OOM error by using 8 nodes instead of 4. It looks like memory is limited on tuolumne because it is shared between the GPU and the CPU. So, what is the final suggestion? Just remove the HIP flag, or separate it from CUDA? Do you want me to try the UrbanPop data and experiment with agent.max_box_size?

Collaborator

You could experiment with agent.max_box_size to find the best settings. Ideally the code in Utils.cpp will set usable defaults for every common situation.
The defaults should also be described in examples/inputs.defaults, e.g.:

# if ic_type is census
# agent.max_box_size = 16
# if ic_type is urbanpop and using GPUs
# agent.max_box_size = 500
# if ic_type is urbanpop and not using GPUs
# agent.max_box_size = 100

We should add extra info there for any new default cases. These parameters are also described in the docs.
