Split IndextoOffset() into offline and online versions #2136

yucai-intel · 2025-10-07T10:14:11Z

Split IndexToOffset() into two versions, offline (dimension count fixed at compile time) and online (dimension count read at runtime), to reduce runtime overhead as much as possible.

Copilot

Pull Request Overview

Split IndexToOffset into compile-time (static Dims) and runtime (-1) variants to reduce nvcc compilation time and refactor all kernel call sites to use the new template form. Key changes remove the previous contiguous fast path flag and introduce dimension template parameters across many XPU SYCL kernels.

Introduced IndexToOffset<T, IndexType, Dims> and dynamic specialization with Dims = -1; removed strict/non-strict contiguous branching.
Updated all kernel usages to pass an explicit Dims (positive, -1, or new sentinel -2) and added indexing_kind template parameters in RNN kernels.
Replaced standard SYCL subgroup size attribute with Intel-specific attribute in one kernel.
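
The split described above can be sketched in plain C++ as follows. This is a simplified illustration, not the code in src/comm/TensorInfo.h: the names `TensorInfoSketch`, `IndexToOffsetSketch`, and `kMaxDims` are hypothetical, and the SYCL device plumbing is omitted. It assumes the usual scheme in which a linear element index is peeled apart dimension by dimension, from the innermost dimension outward.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cap on tensor rank, chosen for this sketch only.
constexpr int kMaxDims = 8;

// Hypothetical, simplified stand-in for the tensor descriptor.
template <typename T, typename IndexType>
struct TensorInfoSketch {
  T* data;
  IndexType sizes[kMaxDims];
  IndexType strides[kMaxDims];
  int dims;
};

// "Offline" variant: Dims is a template parameter, so the loop bound is a
// compile-time constant and the compiler is free to unroll the loop.
template <typename T, typename IndexType, int Dims>
struct IndexToOffsetSketch {
  static IndexType get(IndexType linearId,
                       const TensorInfoSketch<T, IndexType>& info) {
    IndexType offset = 0;
    for (int i = Dims - 1; i > 0; --i) {
      IndexType curDimIndex = linearId % info.sizes[i];
      offset += curDimIndex * info.strides[i];
      linearId /= info.sizes[i];
    }
    return offset + linearId * info.strides[0];
  }
};

// "Online" variant: Dims == -1 means the rank is read from info.dims at runtime.
template <typename T, typename IndexType>
struct IndexToOffsetSketch<T, IndexType, -1> {
  static IndexType get(IndexType linearId,
                       const TensorInfoSketch<T, IndexType>& info) {
    assert(info.dims >= 1 && info.dims <= kMaxDims);
    IndexType offset = 0;
    for (int i = info.dims - 1; i > 0; --i) {
      IndexType curDimIndex = linearId % info.sizes[i];
      offset += curDimIndex * info.strides[i];
      linearId /= info.sizes[i];
    }
    return offset + linearId * info.strides[0];
  }
};
```

For a tensor whose actual rank matches the template argument, both variants should return the same offset; the difference is only whether the loop bound is known to the compiler.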

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

File	Description
src/comm/TensorInfo.h	Replaced old IndexToOffset implementation with compile-time and runtime (-1) variants, removed contiguous fast path.
src/ATen/native/xpu/sycl/WeightNormKernels.cpp	Updated all IndexToOffset calls to new API (runtime -1).
src/ATen/native/xpu/sycl/TensorModeKernel.cpp	Switched subgroup size attribute to Intel-specific and updated IndexToOffset usage.
src/ATen/native/xpu/sycl/TensorApplyUtils.h	Updated ApplyOp2 to use runtime (-1) IndexToOffset.
src/ATen/native/xpu/sycl/SummaryOpsKernels.cpp	Added ADims/BDims template params and updated IndexToOffset calls.
src/ATen/native/xpu/sycl/Sorting.cpp	Pass compile-time Dim to IndexToOffset.
src/ATen/native/xpu/sycl/ScanUtils.h	Migrated to runtime (-1) IndexToOffset calls.
src/ATen/native/xpu/sycl/RNNKernels.cpp	Added indexing_kind template parameter and adjusted macros to new IndexToOffset signature.
src/ATen/native/xpu/sycl/Indexing.h	Updated offset calculations to new runtime form.
src/ATen/native/xpu/sycl/Indexing.cpp	Added DstDim/SrcDim/IdxDim template params, macros now emit various Dims (including -2) for IndexToOffset.
src/ATen/native/xpu/sycl/Dropout.cpp	Added ADims / BDims template-based offset computation.

src/comm/TensorInfo.h

src/ATen/native/xpu/sycl/TensorModeKernel.cpp

Copilot · 2025-10-16T08:47:54Z

src/ATen/native/xpu/sycl/Indexing.cpp

+                  if (selfInfo.dims == 1 && sourceInfo.dims == 1 && indContig) {
+                    auto caller = SMALL_INDEX(
+                        scalar_t, index_t, unsigned int, 1, 1, -2, func_t);


The sentinel value -2 for IdxDim is undocumented and differs from the established -1 dynamic case; it implicitly relies on the primary template's loop skipping logic and produces offset = linearId * stride[0], which would be incorrect if indices_ is not 1D with stride[0]==1. Replace -2 with an explicit dimension (e.g., 1) or unify on -1 with a clear fast path, and document the intent.
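
The concern in this comment can be illustrated with a small sketch. It assumes the primary template loops from `Dims - 1` down to 1, as the comment implies; `offset_static` is a hypothetical helper written for this example, not a function from the PR. With `Dims = -2` the loop body never runs, so the result collapses to `linearId * strides[0]`, which matches the full computation for a 1-D tensor but not for a tensor of higher rank.

```cpp
#include <cstdint>

// Hypothetical stand-in for the primary template's offset computation.
// For Dims <= 1 (including the -2 sentinel) the loop is skipped entirely.
template <int Dims>
std::int64_t offset_static(std::int64_t linearId,
                           const std::int64_t* sizes,
                           const std::int64_t* strides) {
  std::int64_t offset = 0;
  for (int i = Dims - 1; i > 0; --i) {
    offset += (linearId % sizes[i]) * strides[i];
    linearId /= sizes[i];
  }
  return offset + linearId * strides[0];
}
```

For a contiguous 1-D tensor, `offset_static<-2>` and `offset_static<1>` agree. For a contiguous 2-D tensor of shape 2x3 (strides 3 and 1), `offset_static<2>` maps linear index 4 to offset 4, while `offset_static<-2>` yields 12, so the sentinel is only safe when the call site guarantees a 1-D index tensor.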

yucai-intel added 15 commits October 7, 2025 18:06

Update TensorInfo.h

c58232f

Update Dropout.cpp

e504f42

Update Indexing.cpp

bd2afb2

Update Indexing.h

1dd16c0

Update RNNKernels.cpp

12c63f1

Update Sorting.cpp

3396375

Update SummaryOpsKernels.cpp

c8227fb

Update TensorApplyUtils.h

061e5cb

Update TensorModeKernel.cpp

b757a68

Update WeightNormKernels.cpp

adbbf14

Update ScanUtils.h

2499f21

Update ScanUtils.h

31578e3

Merge branch 'main' into yucai/i2o

75185f8

Merge branch 'main' into yucai/i2o

984e54d

Merge branch 'main' into yucai/i2o

9a29f58

jianyizh approved these changes Oct 15, 2025

Merge branch 'main' into yucai/i2o

409fd26

CuiYifeng requested a review from Copilot October 16, 2025 08:45

Copilot AI reviewed Oct 16, 2025

yucai-intel added 3 commits October 17, 2025 11:24

Update TensorInfo.h

995a054

Update TensorModeKernel.cpp

8a98735

Merge branch 'main' into yucai/i2o

7be1dc6

CuiYifeng requested a review from guangyey October 17, 2025 05:30

guangyey approved these changes Oct 17, 2025
