@chengjunlu

@chengjunlu chengjunlu commented Nov 18, 2025

  1. Unify the 2D block IO lowering code for both regular pointers and block pointers.
  2. Remove duplicated code.
  3. Support common features with the same code: non-NaN padding, rank > 2 tensor loads, and transposing loads for both block pointers and regular pointers.

Copilot AI left a comment

Pull Request Overview

This PR refactors the 2D block I/O load lowering code to consolidate duplicate logic between regular pointer and block pointer handling. The refactoring creates shared utilities for unpacking block pointer structures and computing memory access parameters, enabling both pointer types to use the same code path.

Key Changes:

  • Introduces helper functions (unpackLLBlockPointer, getBases, getPitch, getBaseOffsets) to extract and process pointer metadata uniformly
  • Removes the specialized rewriteTensorPointerLoad function (~800 lines) in favor of the unified lowering path
  • Extends support to more layout types and transposed matrix loads for block pointers
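The idea of a single code path can be pictured with a small host-side sketch. This is not the PR's code: `BlockPointer2D`, `BlockIOParams`, `fromBlockPointer`, `fromRegularPointer`, and `elementAddress` are hypothetical names, and the field layout is an assumption chosen for illustration. It only shows how two pointer descriptions might be normalized into one set of parameters (base, width, height, pitch, offsets) that a shared lowering could then consume.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a rank-2 block pointer.
// The real lowering works on LLVM values; plain integers are used here.
struct BlockPointer2D {
  std::uintptr_t base;            // address of element (0, 0) of the tensor
  std::array<int64_t, 2> shape;   // {rows, cols}, in elements
  std::array<int64_t, 2> strides; // {row stride, col stride}, in elements
  std::array<int64_t, 2> offsets; // {row, col} of the tile origin, in elements
};

// Parameters a 2D block access needs, independent of the pointer kind.
struct BlockIOParams {
  std::uintptr_t base;
  int64_t widthBytes; // addressable row width
  int64_t height;     // addressable number of rows
  int64_t pitchBytes; // distance between consecutive rows
  int64_t offsetX;    // tile origin column, in elements
  int64_t offsetY;    // tile origin row
};

// Block pointer: keep the tensor base and carry the tile origin as offsets.
BlockIOParams fromBlockPointer(const BlockPointer2D &bp, int64_t elemBytes) {
  return {bp.base,           bp.shape[1] * elemBytes, bp.shape[0],
          bp.strides[0] * elemBytes, bp.offsets[1],   bp.offsets[0]};
}

// Regular pointer: the address already points at the tile origin,
// so the offsets are zero and the extent is the tile itself.
BlockIOParams fromRegularPointer(std::uintptr_t tileAddr, int64_t tileWidth,
                                 int64_t tileHeight, int64_t rowStride,
                                 int64_t elemBytes) {
  return {tileAddr, tileWidth * elemBytes, tileHeight,
          rowStride * elemBytes, 0, 0};
}

// Shared consumer: byte address of tile element (row, col).
std::uintptr_t elementAddress(const BlockIOParams &p, int64_t row,
                              int64_t col, int64_t elemBytes) {
  int64_t byteOffset =
      (p.offsetY + row) * p.pitchBytes + (p.offsetX + col) * elemBytes;
  return p.base + static_cast<std::uintptr_t>(byteOffset);
}
```

In this sketch, describing the same tile either way yields the same element addresses, which is the property a unified lowering path relies on.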

@chengjunlu chengjunlu marked this pull request as draft November 18, 2025 08:12
@chengjunlu chengjunlu force-pushed the chengjun/unify_2d_block_load_lowering branch from f0d6179 to 6bc5f30 Compare November 28, 2025 02:52
@chengjunlu chengjunlu changed the title from "[LoadStoreOpToLLVM] Reuse 2D block IO load lowering for both regular pointer and block pointer to clean up duplicate code." to "[LoadStoreOpToLLVM] Unify the 2D block IO lowering code for both regular pointer and block pointer" Nov 28, 2025
@chengjunlu chengjunlu force-pushed the chengjun/unify_2d_block_load_lowering branch from 6bc5f30 to 70554e9 Compare December 1, 2025 05:31
@chengjunlu chengjunlu requested a review from Copilot December 1, 2025 05:32

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


@chengjunlu chengjunlu force-pushed the chengjun/unify_2d_block_load_lowering branch from 70554e9 to 17cb97c Compare December 1, 2025 05:55
@chengjunlu chengjunlu requested a review from Copilot December 1, 2025 05:55

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


// throughout the tile.
if (auto baseHeightInt =
        mlir::triton::intel::getFoldedConstantValue(baseHeight)) {
  if (baseHeightInt < tileHeight && baseHeightInt == 1) {
Copilot AI Dec 1, 2025

The condition baseHeightInt < tileHeight && baseHeightInt == 1 has a logical redundancy. If baseHeightInt == 1, the first condition baseHeightInt < tileHeight is only meaningful when tileHeight > 1. Consider simplifying to if (baseHeightInt == 1 && tileHeight > 1) for clarity.

Suggested change
- if (baseHeightInt < tileHeight && baseHeightInt == 1) {
+ if (baseHeightInt == 1 && tileHeight > 1) {
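A quick way to confirm that the suggested form does not change behavior is to compare both guards over a range of inputs. The sketch below is a standalone check, not code from the PR: `originalGuard`, `suggestedGuard`, and `guardsAgree` are hypothetical names, and modeling the folded constant as `std::optional<int64_t>` is an assumption about what `getFoldedConstantValue` returns.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Guard as written in the diff hunk. The outer `if (auto ...)` is modeled
// by checking that the optional holds a value (assumed return type).
bool originalGuard(std::optional<int64_t> baseHeight, int64_t tileHeight) {
  return baseHeight.has_value() && *baseHeight < tileHeight &&
         *baseHeight == 1;
}

// Guard as proposed in the review suggestion.
bool suggestedGuard(std::optional<int64_t> baseHeight, int64_t tileHeight) {
  return baseHeight.has_value() && *baseHeight == 1 && tileHeight > 1;
}

// Compare both guards for every (baseHeight, tileHeight) pair in
// [-limit, limit], plus the case where no constant could be folded.
bool guardsAgree(int64_t limit) {
  for (int64_t t = -limit; t <= limit; ++t) {
    if (originalGuard(std::nullopt, t) != suggestedGuard(std::nullopt, t))
      return false;
    for (int64_t h = -limit; h <= limit; ++h) {
      if (originalGuard(h, t) != suggestedGuard(h, t))
        return false;
    }
  }
  return true;
}
```

Calling `assert(guardsAgree(64));` from a test passes, since once `baseHeight == 1` holds, `baseHeight < tileHeight` reduces to `tileHeight > 1`. The suggestion is therefore a readability change only.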

…lar pointer and block pointer. And clean up duplicate code.

Signed-off-by: Lu,Chengjun <[email protected]>

Development

Successfully merging this pull request may close these issues.

[BLOCK IO] Clean up the code to unify the block IO lowering for both tensor descriptor (block ptr) and tensor of pointers.

2 participants