ROCM-26483 - fix cooperative_groups::reduce() would not work with 2D or 3D tiles#7588

Closed
g-h-c wants to merge 3 commits into
develop from
users/g-h-c/ROCM-26483_cooperative_groups_reduce_mask_fix

@g-h-c g-h-c commented Jun 22, 2026

Motivation

When cooperative_groups::reduce(block, value, operator) is called from a thread block whose y- or z-dimension is greater than 1, the operation produces incorrect results or crashes.

Technical Details

mask <<= (((threadIdx.x % warpSize) / group.num_threads()) * group.num_threads());

should have been

mask <<= (((internal::workgroup::thread_rank() % warpSize) / group.num_threads()) * group.num_threads());

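(Note that the corrected expression cannot be simplified to mask <<= internal::workgroup::thread_rank() % warpSize, because the division in it is an integer division, not a floating-point division: dividing by group.num_threads() and then multiplying by it again rounds the lane index down to a multiple of the tile size.)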
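To make the rounding concrete, here is a minimal host-side sketch of the shift arithmetic. The helper names `tile_base_shift` and `tile_mask` are hypothetical and not part of HIP, and the initial mask (the low `tile_size` bits set) is an assumption made for illustration; only the shift expression itself is taken from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a HIP API): the shift applied to the mask.
// rank      : block-linear thread rank
// warp_size : lanes per wavefront (32 or 64, depending on the target)
// tile_size : threads per tile, i.e. group.num_threads()
inline unsigned tile_base_shift(unsigned rank, unsigned warp_size, unsigned tile_size) {
  assert(warp_size > 0 && tile_size > 0 && tile_size <= warp_size);
  // Integer division truncates, so this rounds the lane index down to the
  // first lane of the tile that contains it.
  return ((rank % warp_size) / tile_size) * tile_size;
}

// Hypothetical helper: a mask covering every lane of the caller's tile.
// Assumption: the mask starts as the low tile_size bits before shifting.
inline std::uint64_t tile_mask(unsigned rank, unsigned warp_size, unsigned tile_size) {
  std::uint64_t mask = (tile_size >= 64) ? ~std::uint64_t{0}
                                         : ((std::uint64_t{1} << tile_size) - 1);
  mask <<= tile_base_shift(rank, warp_size, tile_size);
  return mask;
}
```

For example, with a rank of 37, a wavefront of 64 lanes and tiles of 16 threads, the lane index is 37 but the shift is (37 / 16) * 16 = 32, so the mask covers lanes 32 to 47. Shifting by 37 instead would move the mask off the tile boundary.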

JIRA ID

ROCM-26483

Test Plan

Added test Unit_Thread_Block_Tile_Multi_Dimensional_Reduce

Test Result

The test fails on current develop but passes with the fix.

Submission Checklist

@g-h-c g-h-c requested a review from yxsamliu June 22, 2026 16:08
@g-h-c g-h-c marked this pull request as ready for review June 22, 2026 16:10
@g-h-c g-h-c requested a review from a team as a code owner June 22, 2026 16:10
Copilot AI review requested due to automatic review settings June 22, 2026 16:10
@g-h-c g-h-c requested review from a team as code owners June 22, 2026 16:10

Copilot AI left a comment

Pull request overview

Fixes cooperative_groups::reduce() for thread_block_tile when the parent thread block is 2D/3D by computing the lane/tile mask offset using the block-linear thread rank rather than threadIdx.x.

Changes:

  • Update mask calculation in amd_hip_cooperative_groups_reduce.h to use internal::workgroup::thread_rank() for correct 2D/3D block behavior.
  • Add a new HIP unit test that launches a 3D block (16×4×2) and validates per-tile reductions.
  • Register the new test in the cooperative groups unit-test YAML config (with the same Windows disablement pattern as related tests).
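The effect of the change on a 16×4×2 block can be sketched on the host by enumerating every thread and comparing the shift derived from threadIdx.x with the one derived from the block-linear rank. This is only an illustration under stated assumptions: `Dim3`, `linear_rank`, `shift_for` and `count_mismatched_threads` are hypothetical names, the rank is assumed to follow the conventional x-fastest linearisation, and the wavefront and tile sizes are parameters rather than values taken from the actual test.

```cpp
#include <cassert>

// Hypothetical stand-in for dim3 / threadIdx / blockDim on the host.
struct Dim3 { unsigned x, y, z; };

// Assumed x-fastest linearisation of a thread index within its block.
inline unsigned linear_rank(Dim3 idx, Dim3 dim) {
  return idx.x + idx.y * dim.x + idx.z * dim.x * dim.y;
}

// The shift expression from the PR, applied to an arbitrary lane source
// (either threadIdx.x or the block-linear rank).
inline unsigned shift_for(unsigned lane_source, unsigned warp_size, unsigned tile_size) {
  assert(warp_size > 0 && tile_size > 0);
  return ((lane_source % warp_size) / tile_size) * tile_size;
}

// Counts the threads for which the threadIdx.x-based shift differs from
// the rank-based shift, i.e. the threads whose mask would land on the wrong lanes.
inline unsigned count_mismatched_threads(Dim3 dim, unsigned warp_size, unsigned tile_size) {
  unsigned mismatches = 0;
  for (unsigned z = 0; z < dim.z; ++z)
    for (unsigned y = 0; y < dim.y; ++y)
      for (unsigned x = 0; x < dim.x; ++x) {
        const unsigned old_shift = shift_for(x, warp_size, tile_size);
        const unsigned new_shift =
            shift_for(linear_rank(Dim3{x, y, z}, dim), warp_size, tile_size);
        if (old_shift != new_shift) ++mismatches;
      }
  return mismatches;
}
```

Under these assumptions, with 64-lane wavefronts and tiles of 16 threads, every thread with a non-zero threadIdx.y gets a different shift, which is 96 of the 128 threads in the block. A one-dimensional block of 64 threads shows no mismatch, which is consistent with the problem only appearing for 2D or 3D blocks.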

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| projects/hip-tests/catch/unit/cooperativeGrps/thread_block_tile.cc | Adds a multi-dimensional block tile reduction test and kernel. |
| projects/hip-tests/catch/config/configs/unit/cooperativeGrps.yaml | Enables/configures the new unit test in the cooperativeGrps suite. |
| projects/clr/hipamd/include/hip/amd_detail/amd_hip_cooperative_groups_reduce.h | Fixes mask shifting logic to use block-linear thread rank for multi-dim blocks. |

Comment thread projects/hip-tests/catch/unit/cooperativeGrps/thread_block_tile.cc

@yxsamliu yxsamliu left a comment

LGTM

g-h-c added 2 commits June 23, 2026 09:12
…ed only on threadIdx.x when it should have used internal::workgroup::thread_rank() instead
@g-h-c g-h-c force-pushed the users/g-h-c/ROCM-26483_cooperative_groups_reduce_mask_fix branch from e7d6c1b to 421a29b Compare June 23, 2026 08:12
@g-h-c g-h-c closed this Jun 29, 2026

g-h-c commented Jun 29, 2026

Fix already included in #5914
