[Draft] Pre-allocate node map in graph's duplicateNodes() #392

adamfidel · 2025-10-27T16:23:28Z

No description provided.

…lang versions Fixed in llvm/llvm-project#156033

…compatible compiler version Requires a compiler with the changes in llvm/llvm-project#122265

In the standard, constraint satisfaction checking is done on the normalized form of a constraint. Clang instead substitutes on the non-normalized form, which causes us to report substitution failures in template arguments or concept ids. This is non-conforming but unavoidable without a parameter mapping. This patch normalizes before satisfaction checking. However, we preserve concept-id nodes in the normalized form, solely for diagnostics purposes. This addresses #61811 and related concepts conformance bugs, ideally making the remaining implementation of concept template parameters easier. Fixes #135190 Fixes #61811 Co-authored-by: Younan Zhang <[email protected]>

These will be used in upcoming RPC support patches where the outer Expected value captures any RPC-infrastructure errors, and the inner Error is returned from the remote call (i.e. the remote handler's return type is Error).

…compiler versions Skip tests that require `-gstructor-decl-linkage-names` on Clang versions that don't support it. Don't pass `-gno-structor-decl-linkage-names` on Clang versions where the flag didn't exist but it was the default behaviour of the compiler anyway.

… (#161623) Add `UnitTests` as an explicit dependency for `check-llvm` and `llvm-test-depends`. In llvm/llvm-project#161442, the intent was to remove `UnitTests` as a dependency for the individual per-directory `check-llvm-*` test suites, but not to drop it from `check-llvm` or `llvm-test-depends`. Without this dependency, LLVM unit tests are not rebuilt, resulting in either `warning: test suite 'LLVM-Unit' contained no tests` or stale versions of the unit tests being run by `check-llvm`.

…older compiler versions Skip tests that require `-gstructor-decl-linkage-names` on Clang versions that don't support it. Don't pass `-gno-structor-decl-linkage-names` on Clang versions where the flag didn't exist but it was the default behaviour of the compiler anyway. Drive-by: We used to run `self.expect("Bar()")`, which would always fail, so `error=True` held even if we didn't pass `-gno-structor-linkage-names`, and the test wasn't checking the behaviour properly. This patch changes these to `self.expect("expr Bar()")`.

Change top-level and LLVM/MLIR/Clang `.clang-format` files to enforce Unix line ending.

…540) Private only does 'init' when a constructor needs to be called, so this patch adds that. The logic of what to init is caused by Sema, but the tests show that types that are pointers or non-class-types or class types without a constructor aren't actually initialized.

Improve the automatic naming of variables defined by the `omp.canonical_loop` operation:

1. The iteration variable gets a name consistent with the cli variable
2. Instead of appending `_s0` for each nesting level, shorten it to `_d<num>` for a perfectly nested loop at depth `<num>`
3. Do not add any suffix to the top-level loop if it is the only top-level loop

After llvm/llvm-project#154336 this fold no longer triggers, as the freeze will be pushed through to the icmp operands, and generic handling will take care of it.

Instead of checking if the recoloring candidate is a virtual register, avoid adding it to the candidates in the first place.

This splits out "ScalablePredicateVector" from the "ScalableVector" StackID. This is primarily to allow easy differentiation between vectors and predicates (without inspecting instructions). This new stack ID is not used in many places yet, but will be used in a later patch to mark stack slots that are known to contain predicates. Co-authored-by: Kerry McLaughlin <[email protected]>

…n" (#161669) Reverts llvm/llvm-project#141776 CI failures https://lab.llvm.org/buildbot/#/builders/202/builds/3591 https://lab.llvm.org/buildbot/#/builders/55/builds/18066 https://lab.llvm.org/buildbot/#/builders/85/builds/14103

Treat them as namespaces: if they are at the beginning of the line, they are likely a good recovery point. For instance, in

```cpp
1.3.0
extern "C" {
extern int foo();

extern "C++" {
namespace bar {
void baz();
};
}
}

namespace {}
```

Everything until `namespace`... is gone from the AST. Headers (like libc's C++ `math.h`) can be included from an `extern "C"` context, and they do an `extern "C++"` back again before including C++ headers (like `__type_traits`). However, a malformed declaration just before the include (as the orphan `1.3.0` in the example) causes everything from these standard headers to go missing. This patch updates the heuristic to try to recover from the first `extern` keyword seen, pretty much as it is done for `namespace`. CPP-4478

This patch introduces some missing FP conversion instructions in the ROCDL dialect. Specifically: downscaling 8x packed F16, Bf16, Fp32 values to Fp8, Bf8, Fp4. Tests: added lit-tests to check MLIR -> LLVM lowering.

CONFLICT (modify/delete): .github/workflows/llvm-project-tests.yml deleted in HEAD and modified in c1ee2e7. Version c1ee2e7 of .github/workflows/llvm-project-tests.yml left in tree.

Previously this took hints from subregister extract of physreg, like `%vreg.sub = COPY $physreg`. This now also handles the rarer case: `$physreg_sub = COPY %vreg`. Also make an accidental bug here before explicit; this was only using the superregister as a hint if it was already in the copy, and not if using the existing assignment. There are a handful of regressions in that case, so leave that extension for a future change.

…61618) Check for a valid offset for unaligned vector store V6_vS32Ub_npred_ai. isValidOffset() is updated to evaluate offset of this instruction. Fixes #160647

… Fortran interop (#161613)

…(#161433) add correct names for `NB_TYPE_CASTER(..., name)` so users of `NanobindAdaptors.h` can generate the correct hints. Also fix a few straggler stubs.

…missing an argument (#161277) Crash report came in and it was pretty obvious the diagnostic line was just missing an argument. I supplied the argument and added a test. Fixes: llvm/llvm-project#161072

This patch attempts to refactor AArch64FrameLowering to allow the size of the ZPR and PPR areas to be calculated separately. This will be used by a subsequent patch to support allocating ZPRs and PPRs to separate areas. This patch should be an NFC and is split out to make later functional changes easier to spot. Co-authored-by: Kerry McLaughlin <[email protected]>

Follow the community change that renames the section for the offload entries from omp_offloading_entries to llvm_offload_entries.

…ion names (intel#20488) Currently, we have getKernelNamesUsingAssert to detect all SPIR kernels which use assert functions via BFS. This PR extends it in two ways: 1. Support passing a general function name to detect all SPIR kernels using it. 2. Support passing a group of special function names to search for SPIR kernels using them. We may need to check other special functions, and can then call it instead of adding a new getKernelNamesUsingXXX. --------- Signed-off-by: jinge90 <[email protected]> Co-authored-by: Marcos Maronas <[email protected]>

) Change tracing functions to avoid temporary `std::string_view` object creation. The issue was that if a string literal is passed as a `Msg` parameter, the temporary `std::string_view` object was created. When the constructor of the `std::string_view` is called, internally it calculates the length of the C-string. This PR avoids that.
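One way to avoid the length computation described above is sketched here. It is an assumption about the general technique, not the code from the PR: `trace`, `traceImpl`, and `LastTracedLength` are hypothetical names. An overload taking a reference to a character array lets the compiler supply the literal's length from its type, so no `std::string_view` temporary is constructed from a `const char *`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sink that records the length it was handed, for illustration.
inline std::size_t LastTracedLength = 0;

void traceImpl(const char *Data, std::size_t Len) {
  (void)Data; // A real tracer would forward Data and Len to its backend.
  LastTracedLength = Len;
}

// Chosen for string literals: N is part of the array type, so the length is
// known at compile time. Assumes the array is a null-terminated literal
// without embedded nulls, hence N - 1.
template <std::size_t N> void trace(const char (&Msg)[N]) {
  traceImpl(Msg, N - 1);
}

// Chosen for callers that already hold a std::string_view.
void trace(std::string_view Msg) { traceImpl(Msg.data(), Msg.size()); }
```

A call such as `trace("hello")` binds to the array overload as an exact match, while `trace(std::string_view("ab"))` uses the second overload.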

Also include a few NFC changes.

…ns (intel#20422) **Problems** Problem 1: When a library consisting of free function kernels is registered with SYCL RT, we store pointers (as `string_view`) to free function names in `m_FreeFunctionKernelGlobalInfo` but we do not remove them from `m_FreeFunctionKernelGlobalInfo` when the library is unloaded. Thus, we end up holding dangling pointers and any further operation on `m_FreeFunctionKernelGlobalInfo` might segfault. Problem 2: Consider the case when you have multiple TUs with free functions and they are compiled separately but linked together into a single shared lib. In that case, we will have multiple definitions of `static GlobalMapUpdater updater` in the shared lib, which violates the ODR. **Solution** Discard pointers to free function names when the library is unloaded and have `GlobalMapUpdater` defined in an anonymous namespace, instead of `sycl::v1::detail` --------- Co-authored-by: premanandrao <[email protected]>

When an action is used as a composite, its steps don't display the actual 'name' defined for each step. Instead, the full command to execute is displayed. To work around this, we can put the name as a comment in the first line of the command.

intel#20450)

…ntel#20490) This commit makes the UR interfaces implementing urIPCGetMemHandleExp return the exact memory pointer from UMF instead of copying the handle data. This is done as the exact pointer is expected to be passed when calling umfPutIPCHandle in the urIPCPutMemHandleExp interface. --------- Signed-off-by: Larsen, Steffen <[email protected]>

…ntel#20492) This commit removes a set of test cases for invalid uses of the add_ir_attributes_* attributes in the merging test checking the generated AST. --------- Signed-off-by: Larsen, Steffen <[email protected]>

…#20399) SPIR-V OpenCL.ExtendedInstructionSet.100 requires that all operands of mix have the same type.

Fix a regression from 36578fe which deleted scalar implementation. Additional changes: * Don't scalarize __spirv_ocl_ldexp since __clc_ldexp could be vectorized. * fix build warning in clc/integer/clc_bitfield_insert.h.

This commit makes the following changes to the behavior of asynchronous exception handling:

1. The death of a queue should not consume asynchronous exceptions.
2. Calling wait_and_throw on an event after the associated queue has died should still consume exceptions that were originally associated with the queue. This should respect the async_handler priority to the best of its ability.
3. Calling wait_and_throw or throw_asynchronous on a queue without an async_handler should fall back to using the async_handler of the associated context, then the default async_handler if none were attached to the context.
4. Throwing asynchronous exceptions from anywhere will now consume all unconsumed asynchronous exceptions previously reported, no matter the event/queue/context/device.

Additionally, this lays the groundwork for intel#20266 by moving the tracking of unconsumed asynchronous exceptions to the devices. --------- Signed-off-by: Larsen, Steffen <[email protected]>

Bump Compute Benchmarks commit to latest: - fix build failures of Compute Benchmarks caused by changed definition of urProgramBuildExp() - add queue synchronization to SubmitKernel warmup - fix submit_graph_l0 standard, non-emulated path

…intel#20333) This PR: - moves benchmarking CI data to `benchmark-ci-tests` branch instead of intel/llvm-ci-perf-results - removes the need for an additional bot user when pushing data (as well as the need to periodically update its tokens) Test run: https://github.com/intel/llvm/actions/runs/18386952310/job/52388325821 (job failed due to "regression"; issue should be addressed with the merge of intel#20277)

This change makes the commands submitted to the scheduler unconditionally associated with an event (for both event and event-less APIs), for kernel submission. For other commands, the event can be skipped if the scheduler bypass condition is true (the scheduler bypass itself is not supported for commands other than the kernel submission), if the queue supports discarding the events and the event was not requested. The reason for this change is that some commands might already be scheduled and waiting for submission, so all kernel submission commands subsequently submitted to the scheduler must return an event, which is then used to order the commands by the in-order queue and to avoid the scheduler-bypass flow in such a case. On the other hand, if the scheduler bypass condition is true for a command other than the kernel submission, the event dependencies are safe for scheduler bypass, so the event is not needed. --------- Co-authored-by: Sergey Semenov <[email protected]>

intel#20524)

…l. (intel#20482) Initially, the sycl-linker-wrapper-win.cpp test was introduced to check .exe extensions. This change adapts the sycl-linker-wrapper.cpp test to any extensions. Also, sycl-post-link-options-win.cpp test is removed because corresponding checks in the test are repeated in sycl-post-link-options.cpp test. Fixes: intel#17754 --------- Co-authored-by: Alexey Bader <[email protected]>

This reverts commit ebcf025.

use v2025.07.22 OpenCL headers/loader see intel#20511 for details

LLVM: llvm/llvm-project@2d67cb1 SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@54525b6

…there are no subgraphs

Michael137 and others added 30 commits October 2, 2025 13:30

[lldb][test] Un-XFAIL TestDataFormatterStdUnorderedMap.py for older C…

86ba198

…lang versions Fixed in llvm/llvm-project#156033

[lldb][test] TestStructuredBinding.py: adjust assertion to check for …

7e6d277

…compatible compiler version Requires a compiler with the changes in llvm/llvm-project#122265

[GVN] Add additional tests for inverted condition propagation (NFC)

db39ef9

[orc-rt] Fix typo in comment. NFC.

0cb9d40

AMDGPU: Switch test to generated checks (#161658)

e7839ee

Enforce Unix line endings for Clang/LLVM/MLIR projects (#161460)

2ece31b

Change top-level and LLVM/MLIR/Clang `.clang-format` files to enforce Unix line ending.

[InstCombine] Remove foldSelectWithFrozenICmp() fold (#161659)

e48fe76

After llvm/llvm-project#154336 this fold no longer triggers, as the freeze will be pushed through to the icmp operands, and generic handling will take care of it.

Greedy: Move physreg check when trying to recolor vregs (NFC) (#160484)

c4e1bca

Instead of checking if the recoloring candidate is a virtual register, avoid adding it to the candidates in the first place.

Greedy: Merge VirtRegMap queries into one use (NFC) (#160485)

706b790

[ROCDL] Added rocdl.cvt.scale.pk8 ops (#161411)

b92ff6b

This patch introduces some missing FP conversion instructions in the ROCDL dialect. Specifically: downscaling 8x packed F16, Bf16, Fp32 values to Fp8, Bf8, Fp4. Tests: added lit-tests to check MLIR -> LLVM lowering.

Merge from 'sycl' to 'sycl-web' (3 commits)

f8987f4

CONFLICT (modify/delete): .github/workflows/llvm-project-tests.yml deleted in HEAD and modified in c1ee2e7. Version c1ee2e7 of .github/workflows/llvm-project-tests.yml left in tree.

Greedy: Use initializer list for recoloring candidates (NFC) (#160486)

f98735f

[Hexagon] Add opcode V6_vS32Ub_npred_ai for offset validity check (#1…

daa4e57

…61618) Check for a valid offset for unaligned vector store V6_vS32Ub_npred_ai. isValidOffset() is updated to evaluate offset of this instruction. Fixes #160647

[flang][cuda][openacc] Create new symbol in host_data region for CUDA…

c242aff

… Fortran interop (#161613)

Merge from 'sycl' to 'sycl-web' (7 commits)

b79ee5a

[X86] Create special case for (a-b) - (a<b) -> sbb a, b (#161388)

197e77b

[MLIR][Python] fixup Context and Location stubs and NanobindAdaptors …

a3594cd

…(#161433) add correct names for `NB_TYPE_CASTER(..., name)` so users of `NanobindAdaptors.h` can generate the correct hints. Also fix a few straggler stubs.

[Clang][Sema] Fix crash in CheckUsingDeclQualifier due to diagnostic …

32d03f3

…missing an argument (#161277) Crash report came in and it was pretty obvious the diagnostic line was just missing an argument. I supplied the argument and added a test. Fixes: llvm/llvm-project#161072

hansangbae and others added 27 commits October 30, 2025 16:04

[Offload] Change section names for offload entries (intel#20509)

bab58c1

Follow the community change that renames the section for the offload entries from omp_offloading_entries to llvm_offload_entries.

[libspirv] Add __spirv_BitReverse implementation (intel#20449)

faadbce

Also include a few NFC changes.

[libspirv] Add __spirv_BitFieldUExtract/SExtract/Insert implementation (

a3f4959

intel#20450)

[libspirv] delete mix with vector type x and scalar type a (intel…

7035592

…#20399) SPIR-V OpenCL.ExtendedInstructionSet.100 requires that all operands of mix have the same type.

[SYCL][Doc] Refresh info about releases (intel#20517)

3dfb538

Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown

5f72df7

Disable bindless images tests with spirv backend.

2e275cc

[SYCL] Extend no-handler submission path to support kernel properties. (

d5d8b19

intel#20524)

Revert "[SPIR-V] Implement SPV_KHR_float_controls2 (#146941)"

8b0d54a

This reverts commit ebcf025.

[SYCL] Workaround older MSVC's bug (intel#20539)

92a006c

[UR][OpenCL] use v2025.07.22 OpenCL headers/loader (intel#20533)

12a0dd4

use v2025.07.22 OpenCL headers/loader see intel#20511 for details

LLVM and SPIRV-LLVM-Translator pulldown (WW42 2025)

6fbd7f0

LLVM: llvm/llvm-project@2d67cb1 SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@54525b6

Pre-allocate node map in graph's duplicateNodes()

bb8f6ce

Replace temp node deque with a vector + skip second subgraph pass if …

e3b13d9

…there are no subgraphs

reserve NewNodes, remove commented code

87af057
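The pre-allocation named in the three commits above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SYCL runtime's code: the `Node` struct and this `duplicateNodes` signature are hypothetical stand-ins, and subgraph handling is left out. The point is that when the node count is known up front, reserving both the output vector and the old-to-new map avoids repeated reallocation and rehashing during the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a graph node; not the runtime's node type.
struct Node {
  int Id = 0;
  std::vector<Node *> Successors;
};

// Copies every node, then rewires successor edges to point at the copies.
// Assumes every successor is itself owned by Nodes.
std::vector<std::unique_ptr<Node>>
duplicateNodes(const std::vector<std::unique_ptr<Node>> &Nodes) {
  const std::size_t N = Nodes.size();

  std::vector<std::unique_ptr<Node>> NewNodes;
  NewNodes.reserve(N); // One allocation instead of geometric regrowth.

  std::unordered_map<const Node *, Node *> NodeMap;
  NodeMap.reserve(N); // Size the bucket array once to avoid rehashing.

  // First pass: create the copies and record the old-to-new mapping.
  for (const auto &Old : Nodes) {
    auto Copy = std::make_unique<Node>();
    Copy->Id = Old->Id;
    NodeMap.emplace(Old.get(), Copy.get());
    NewNodes.push_back(std::move(Copy));
  }

  // Second pass: translate each edge through the map.
  for (std::size_t I = 0; I < N; ++I) {
    NewNodes[I]->Successors.reserve(Nodes[I]->Successors.size());
    for (const Node *Succ : Nodes[I]->Successors)
      NewNodes[I]->Successors.push_back(NodeMap.at(Succ));
  }
  return NewNodes;
}
```

Using a contiguous `std::vector` for the new nodes, as the second commit title suggests in place of a deque, also makes a single `reserve` call sufficient.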

adamfidel force-pushed the adam/duplicate-nodes-unordered-map branch from a371146 to 87af057 Compare November 4, 2025 15:19

remove TODO

f28ca12

adamfidel closed this Nov 4, 2025
