[Issue]: Is maxBlocksPerMultiProcessor value wrong on MI210/MI250? #121

@fxmarty-amd

Problem Description

Hi,

To reproduce, run:

#include <stdio.h>
#include <hip/hip_runtime.h>
#include <iostream>

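// Evaluate the call once so a failing HIP call is not executed a second time
// just to fetch its error string.
#define HIP_WARN(XXX) \
    do { hipError_t err_ = (XXX); if (err_ != hipSuccess) std::cerr << "HIP Error: " << \
    hipGetErrorString(err_) << ", at line " << __LINE__ \
    << std::endl; hipDeviceSynchronize(); } while (0)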

int main() {
    int devCount;

    HIP_WARN(hipGetDeviceCount(&devCount));

    std::cout << "Number of devices: " << devCount << "\n";

    int block_per_sm;
    int thread_per_sm;
    HIP_WARN(hipDeviceGetAttribute(&block_per_sm, hipDeviceAttributeMaxBlocksPerMultiProcessor, 0));
    HIP_WARN(hipDeviceGetAttribute(&thread_per_sm, hipDeviceAttributeMaxThreadsPerMultiProcessor, 0));
    
    std::cout << "Max blocks per CU: " << block_per_sm << "\n";
    std::cout << "Max threads per CU: " << thread_per_sm  << "\n";
}

hipDeviceAttributeMaxBlocksPerMultiProcessor returns 2. However, when I estimate the maximum number of active workgroups from inside a kernel (see https://gist.github.com/Snektron/1fb62a39ee0d7b572c3441f0a53d310c), it seems clear that for workgroup sizes smaller than 1024 (e.g. 64, 128, 256, 512), more than 2 workgroups can be scheduled per CU.

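The computation deviceProps.maxBlocksPerMultiProcessor = int(info.maxThreadsPerCU_ / info.maxWorkGroupSize_); in https://github.com/ROCm/clr/blob/b8ba4ccf9c53f6558a5e369e3c1c05de97a0c28f/hipamd/src/hip_device.cpp#L496C77-L496C94 seems wrong: it only counts how many workgroups of the maximum size fit in a CU, not the maximum number of workgroups that can be resident on a CU.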
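
To make the arithmetic concrete, here is a minimal sketch in plain C++ (no HIP required). The helper names and the constants 2048 and 1024 are my own assumptions, chosen only because they reproduce the reported value of 2; they are not read from the device. The second helper is a thread-limit-only upper bound, not a real occupancy calculation.

```cpp
// Mirrors the expression quoted from hip_device.cpp: the number of
// workgroups of the MAXIMUM size that fit within the per-CU thread limit.
constexpr int reportedMaxBlocksPerCU(int maxThreadsPerCU, int maxWorkGroupSize) {
    return maxThreadsPerCU / maxWorkGroupSize;
}

// Upper bound on resident workgroups for a given workgroup size, considering
// ONLY the per-CU thread limit. Real occupancy may be lower because of
// register, LDS, or other hardware limits, which are not modeled here.
constexpr int blocksByThreadLimit(int maxThreadsPerCU, int workGroupSize) {
    return maxThreadsPerCU / workGroupSize;
}

// Assumed values (hypothetical, not taken from the issue).
constexpr int kAssumedMaxThreadsPerCU = 2048;
constexpr int kAssumedMaxWorkGroupSize = 1024;

// The quoted formula yields 2 under these assumptions...
static_assert(reportedMaxBlocksPerCU(kAssumedMaxThreadsPerCU,
                                     kAssumedMaxWorkGroupSize) == 2,
              "matches the value reported by the attribute");

// ...while the thread limit alone would allow more workgroups of size 256.
static_assert(blocksByThreadLimit(kAssumedMaxThreadsPerCU, 256) > 2,
              "smaller workgroups are not capped at 2 by the thread limit");
```

Under these assumptions, blocksByThreadLimit(2048, 256) evaluates to 8, so the thread limit by itself does not explain a cap of 2 for smaller workgroups.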

What do you think?

Operating System

Ubuntu 24.04 LTS (Noble Numbat)

CPU

AMD EPYC 73F3 16-Core Processor

GPU

AMD Instinct MI210

ROCm Version

ROCm 6.2.4

ROCm Component

HIP

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
