@SanityRemnants
Contributor

Ensures num_items is recalculated after group_width is updated when num_items is less than max_sg_sz. This prevents later accesses to the wrong memory and fixes prod reduction for small kernels. Fixes some UTs from #1818.

Ensures num_items is recalculated after updating group_width when num_items is less than max_sg_sz, preventing incorrect parallelism configuration.
Copilot AI review requested due to automatic review settings December 8, 2025 10:09
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a bug in the reduction kernel configuration logic where num_items was not being updated after adjusting group_width for small kernels. The fix ensures correct memory access patterns and resolves product reduction failures in unit tests.

Key Changes:

  • Recalculate num_items after updating group_width when the initial num_items is less than max_sg_sz
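
The recalculation described above can be sketched as follows. This is a minimal illustration and not the code from the PR: the struct `ReduceConfigSketch`, the function `make_config`, the relation `num_items = group_width * group_height`, and the choice to widen `group_width` to `max_sg_sz` are all assumptions made for the example. Only the names `num_items`, `group_width`, and `max_sg_sz` come from the PR text.

```cpp
#include <cassert>

// Hypothetical stand-in for the reduction kernel configuration.
// Field names follow the PR description; the layout is assumed.
struct ReduceConfigSketch {
  int group_width;
  int group_height;
  int num_items;
};

// Assumed rule: num_items is the product of the two group dimensions.
inline ReduceConfigSketch make_config(int group_width, int group_height, int max_sg_sz) {
  ReduceConfigSketch cfg{group_width, group_height, group_width * group_height};

  if (cfg.num_items < max_sg_sz) {
    // Small kernel: widen the group to a full sub-group (assumed adjustment).
    cfg.group_width = max_sg_sz;
    // The step the PR adds: without it, num_items would still hold the
    // value computed from the old group_width.
    cfg.num_items = cfg.group_width * cfg.group_height;
  }
  return cfg;
}
```

Under these assumptions, a 4 x 2 group with `max_sg_sz = 32` is widened to 32 x 2, and `num_items` becomes 64 instead of the stale value 8. Any later indexing derived from `num_items` then matches the group shape that is actually launched.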
