[Draft] Pre-allocate node map in graph's duplicateNodes() #392
Closed
…lang versions Fixed in llvm/llvm-project#156033
…compatible compiler version Requires a compiler with the changes in llvm/llvm-project#122265
In the standard, constraint satisfaction checking is done on the normalized form of a constraint. Clang instead substitutes on the non-normalized form, which causes us to report substitution failures in template arguments or concept ids. This is non-conforming but unavoidable without a parameter mapping.
This patch normalizes before satisfaction checking. However, we preserve concept-id nodes in the normalized form, solely for diagnostics purposes.
This addresses #61811 and related concepts conformance bugs, ideally making the remaining implementation of concept template parameters easier.
Fixes #135190
Fixes #61811
Co-authored-by: Younan Zhang <[email protected]>
These will be used in upcoming RPC support patches where the outer Expected value captures any RPC-infrastructure errors, and the inner Error is returned from the remote call (i.e. the remote handler's return type is Error).
…compiler versions Skip tests that require `-gstructor-decl-linkage-names` on Clang versions that don't support it. Don't pass `-gno-structor-decl-linkage-names` on Clang versions where the flag didn't exist but was the default behaviour of the compiler anyway.
… (#161623) Add `UnitTests` as an explicit dependency for `check-llvm` and `llvm-test-depends`. In llvm/llvm-project#161442, the intent was to remove `UnitTests` as a dependency for the individual per-directory `check-llvm-*` test suites, but not to drop it from `check-llvm` or `llvm-test-depends`. Without this dependency the LLVM unit tests are not rebuilt, so running `check-llvm` results in either `warning: test suite 'LLVM-Unit' contained no tests` or stale versions of the unit tests being run.
…older compiler versions
Skip tests that require `-gstructor-decl-linkage-names` on Clang versions that don't support it.
Don't pass `-gno-structor-decl-linkage-names` on Clang versions where the flag didn't exist but was the default behaviour of the compiler anyway.
Drive-by:
- We used to run `self.expect("Bar()")`, which would always fail. The `error=True` expectation therefore held even if we didn't pass `-gno-structor-decl-linkage-names`, so the test wasn't checking the behaviour properly. This patch changes these to `self.expect("expr Bar()")`.
Change top-level and LLVM/MLIR/Clang `.clang-format` files to enforce Unix line endings.
…540) Private only does 'init' when a constructor needs to be called, so this patch adds that. The logic of what to init is caused by Sema, but the tests show that types that are pointers or non-class-types or class types without a constructor aren't actually initialized.
Improve the automatic naming of variables defined by the `omp.canonical_loop` operation:
1. The iteration variable gets a name consistent with the cli variable.
2. Instead of appending `_s0` for each nesting level, shorten it to `_d<num>` for a perfectly nested loop at depth `<num>`.
3. Do not add any suffix to the top-level loop if it is the only top-level loop.
After llvm/llvm-project#154336 this fold no longer triggers, as the freeze will be pushed through to the icmp operands, and generic handling will take care of it.
Instead of checking if the recoloring candidate is a virtual register, avoid adding it to the candidates in the first place.
This splits out "ScalablePredicateVector" from the "ScalableVector" StackID. This is primarily to allow easy differentiation between vectors and predicates (without inspecting instructions). The new stack ID is not used in many places yet, but will be used in a later patch to mark stack slots that are known to contain predicates. Co-authored-by: Kerry McLaughlin <[email protected]>
Treat them as namespaces: if they are at the beginning of the line, they
are likely a good recovery point.
For instance, in
```cpp
1.3.0
extern "C" {
extern int foo();
extern "C++" {
namespace bar {
void baz();
};
}
}
namespace {}
```
Everything until `namespace`... is gone from the AST. Headers (like
libc's C++ `math.h`) can be included from an `extern "C"` context, and
they do an `extern "C++"` back again before including C++ headers (like
`__type_traits`).
However, a malformed declaration just before the include (as the orphan
`1.3.0` in the example) causes everything from these standard headers to
go missing. This patch updates the heuristic to try to recover from the
first `extern` keyword seen, pretty much as it is done for `namespace`.
CPP-4478
This patch introduces some missing FP conversion instructions in the ROCDL dialect.
Specifically:
- Downscaling 8x packed F16, Bf16, Fp32 values to Fp8, Bf8, Fp4
Tests:
- Added lit-tests to check MLIR -> LLVM lowering
Previously this took hints from a subregister extract of a physreg, like `%vreg.sub = COPY $physreg`. This now also handles the rarer case `$physreg_sub = COPY %vreg`. Also make a previously accidental bug explicit: this was only using the superregister as a hint if it was already in the copy, and not if using the existing assignment. There are a handful of regressions in that case, so leave that extension for a future change.
…61618) Check for a valid offset for unaligned vector store V6_vS32Ub_npred_ai. isValidOffset() is updated to evaluate offset of this instruction. Fixes #160647
… Fortran interop (#161613)
…(#161433) add correct names for `NB_TYPE_CASTER(..., name)` so users of `NanobindAdaptors.h` can generate the correct hints. Also fix a few straggler stubs.
…missing an argument (#161277) Crash report came in and it was pretty obvious the diagnostic line was just missing an argument. I supplied the argument and added a test. Fixes: llvm/llvm-project#161072
This patch attempts to refactor AArch64FrameLowering to allow the size of the ZPR and PPR areas to be calculated separately. This will be used by a subsequent patch to support allocating ZPRs and PPRs to separate areas. This patch should be an NFC and is split out to make later functional changes easier to spot. Co-authored-by: Kerry McLaughlin <[email protected]>
Follow the community change that renames the section for the offload entries from omp_offloading_entries to llvm_offload_entries.
…ion names (intel#20488) Currently, getKernelNamesUsingAssert detects all SPIR kernels that use assert functions via BFS. This PR extends it in two ways:
1. Support passing a general function name, to detect all SPIR kernels that use it.
2. Support passing a group of special function names, to search for SPIR kernels that use them.
We may need to check other special functions and can easily call it instead of adding a new getKernelNamesUsingXXX.
---------
Signed-off-by: jinge90 <[email protected]> Co-authored-by: Marcos Maronas <[email protected]>
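The generalized search described above can be sketched as a breadth-first walk from a set of target functions up through their callers. This is only an illustration under stated assumptions: the real pass works on LLVM IR, whereas `CallGraph`, `kernelsUsing`, and the string-based graph here are hypothetical names invented for the sketch.

```cpp
#include <queue>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical reverse call graph: maps a function name to its direct callers.
using CallGraph = std::unordered_map<std::string, std::vector<std::string>>;

// Returns every kernel that reaches any function in Targets, found by a BFS
// over the callers of each target. One entry in Targets covers the "single
// function name" case; several entries cover the "group of names" case.
inline std::set<std::string>
kernelsUsing(const CallGraph &Callers, const std::set<std::string> &Kernels,
             const std::vector<std::string> &Targets) {
  std::set<std::string> Visited(Targets.begin(), Targets.end());
  std::queue<std::string> Work;
  for (const std::string &T : Visited)
    Work.push(T);

  std::set<std::string> Result;
  while (!Work.empty()) {
    std::string F = Work.front();
    Work.pop();
    if (Kernels.count(F))
      Result.insert(F);
    auto It = Callers.find(F);
    if (It == Callers.end())
      continue; // F has no known callers
    for (const std::string &C : It->second)
      if (Visited.insert(C).second) // enqueue each caller only once
        Work.push(C);
  }
  return Result;
}
```

Passing the assert function's name as the only target reproduces the original behaviour in this model, so no separate per-function helper is needed.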
) Change tracing functions to avoid creating temporary `std::string_view` objects. The issue was that if a string literal was passed as the `Msg` parameter, a temporary `std::string_view` object was created. The `std::string_view` constructor internally calculates the length of the C-string when it is called. This PR avoids that.
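The commit text does not say which technique was used. One common way to avoid the length computation for string literals is an overload that deduces the array size at compile time. The sketch below assumes that approach; `trace`, `emitTrace`, and `g_lastLen` are hypothetical names, not the functions touched by the PR.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical trace sink; it only records the length so the effect is visible.
inline std::size_t g_lastLen = 0;
inline void emitTrace(const char *Msg, std::size_t Len) {
  (void)Msg;
  g_lastLen = Len;
}

// Overload chosen for string literals: N is deduced from the array type, so
// the length is known without scanning the string at run time.
template <std::size_t N> void trace(const char (&Msg)[N]) {
  emitTrace(Msg, N - 1); // N includes the terminating '\0'
}

// Overload for callers that already hold a std::string_view.
inline void trace(std::string_view Msg) { emitTrace(Msg.data(), Msg.size()); }
```

A call such as `trace("hello")` binds to the array overload because it is an exact match, while the `std::string_view` overload would need a user-defined conversion.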
Also include a few NFC changes.
…ns (intel#20422)
**Problems**
Problem 1: When a library consisting of free function kernels is registered with SYCL RT, we store pointers (as `string_view`) to free function names in `m_FreeFunctionKernelGlobalInfo`, but we do not remove them from `m_FreeFunctionKernelGlobalInfo` when the library is unloaded. Thus, we end up holding dangling pointers, and any further operation on `m_FreeFunctionKernelGlobalInfo` might segfault.
Problem 2: Consider the case where multiple TUs with free functions are compiled separately but linked together into a single shared lib. In that case, we will have multiple definitions of `static GlobalMapUpdater updater` in the shared lib, violating the ODR.
**Solution**
Discard pointers to free function names when the library is unloaded, and define `GlobalMapUpdater` in an anonymous namespace instead of `sycl::v1::detail`.
---------
Co-authored-by: premanandrao <[email protected]>
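The fix for Problem 1 can be illustrated with a small model of a map keyed by `std::string_view`. This is a sketch under assumptions, not the runtime's code: `KernelInfoMap`, `registerNames`, and `unregisterNames` are hypothetical, and the vector of strings stands in for name storage owned by the loaded library.

```cpp
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for m_FreeFunctionKernelGlobalInfo: the keys are
// non-owning views into name storage that belongs to the library.
using KernelInfoMap = std::unordered_map<std::string_view, unsigned>;

// Called when the library is loaded: record a view of each kernel name.
inline void registerNames(KernelInfoMap &Map,
                          const std::vector<std::string> &LibNames) {
  for (const std::string &Name : LibNames)
    Map.emplace(std::string_view(Name), 0u);
}

// Called when the library is unloaded, while its name storage is still valid:
// erase every entry so no key outlives the memory it points into.
inline void unregisterNames(KernelInfoMap &Map,
                            const std::vector<std::string> &LibNames) {
  for (const std::string &Name : LibNames)
    Map.erase(std::string_view(Name));
}
```

The erase has to run before the library's storage is released, since looking up a key hashes the characters the view points to.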
When an action is used as a composite, its steps don't display the actual 'name' defined for each step. Instead, the full command to execute is displayed. To work around this, we can put the name as a comment in the first line of the command.
…ntel#20490) This commit makes the UR interfaces implementing urIPCGetMemHandleExp return the exact memory pointer from UMF instead of copying the handle data. This is done as the exact pointer is expected to be passed when calling umfPutIPCHandle in the urIPCPutMemHandleExp interface. --------- Signed-off-by: Larsen, Steffen <[email protected]>
…ntel#20492) This commit removes a set of test cases for invalid uses of the add_ir_attributes_* attributes in the merging test checking the generated AST. --------- Signed-off-by: Larsen, Steffen <[email protected]>
…#20399) SPIR-V OpenCL.ExtendedInstructionSet.100 requires that all operands of mix have the same type.
Fix a regression from 36578fe which deleted the scalar implementation.
Additional changes:
* Don't scalarize __spirv_ocl_ldexp since __clc_ldexp could be vectorized.
* Fix build warning in clc/integer/clc_bitfield_insert.h.
This commit makes the following changes to the behavior of asynchronous exception handling:
1. The death of a queue should not consume asynchronous exceptions.
2. Calling wait_and_throw on an event after the associated queue has died should still consume exceptions that were originally associated with the queue. This should respect the async_handler priority to the best of its ability.
3. Calling wait_and_throw or throw_asynchronous on a queue without an async_handler should fall back to using the async_handler of the associated context, then the default async_handler if none were attached to the context.
4. Throwing asynchronous exceptions from anywhere will now consume all unconsumed asynchronous exceptions previously reported, no matter the event/queue/context/device.
Additionally, this lays the groundwork for intel#20266 by moving the tracking of unconsumed asynchronous exceptions to the devices.
---------
Signed-off-by: Larsen, Steffen <[email protected]>
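The fallback order in point 3 can be sketched as a simple lookup chain. The types below are hypothetical stand-ins rather than the SYCL runtime's classes: `Handler` models an async_handler as a callable that returns a label, only so the chosen handler can be observed.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an async_handler; returns a label for illustration.
using Handler = std::function<std::string()>;

struct Context {
  Handler AsyncHandler; // empty if no handler was attached to the context
};

struct Queue {
  Handler AsyncHandler; // empty if no handler was attached to the queue
  const Context *Ctx;   // the associated context
};

inline std::string defaultHandler() { return "default"; }

// Queue handler first, then the context's handler, then the default handler.
inline Handler pickHandler(const Queue &Q) {
  if (Q.AsyncHandler)
    return Q.AsyncHandler;
  if (Q.Ctx && Q.Ctx->AsyncHandler)
    return Q.Ctx->AsyncHandler;
  return defaultHandler;
}
```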
Bump Compute Benchmarks commit to latest:
- fix build failures of Compute Benchmarks caused by changed definition of urProgramBuildExp()
- add queue synchronization to SubmitKernel warmup
- fix submit_graph_l0 standard, non-emulated path
…intel#20333) This PR:
- moves benchmarking CI data to the `benchmark-ci-tests` branch instead of intel/llvm-ci-perf-results
- removes the need for an additional bot user when pushing data (as well as the need to periodically update its tokens)
Test run: https://github.com/intel/llvm/actions/runs/18386952310/job/52388325821 (job failed due to "regression"; issue should be addressed with the merge of intel#20277)
This change makes the commands submitted to the scheduler unconditionally associated with an event (for both event and event-less APIs) for kernel submission. For other commands, the event can be skipped if the scheduler bypass condition is true (the scheduler bypass itself is not supported for commands other than the kernel submission), the queue supports discarding events, and the event was not requested. The reason for this change is that some commands might already be scheduled and waiting for submission, so all kernel submission commands subsequently submitted to the scheduler must return an event, which is then used by the in-order queue to order the commands and to avoid the scheduler-bypass flow in such a case. On the other hand, if the scheduler bypass condition is true for a command other than the kernel submission, the event dependencies are safe for scheduler bypass, so the event is not needed.
---------
Co-authored-by: Sergey Semenov <[email protected]>
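The rule described in this commit message can be written down as a small predicate. This is a sketch of the stated conditions only: the `Submission` struct and `needsEvent` are hypothetical names, not the scheduler's actual interface.

```cpp
// Hypothetical description of one command submitted to the scheduler.
struct Submission {
  bool IsKernel;             // kernel submission vs. any other command
  bool EventRequested;       // the caller used an API that returns an event
  bool BypassConditionHolds; // the scheduler bypass condition is true
  bool QueueSupportsDiscard; // the queue supports discarding events
};

// Kernel submissions always get an event. Other commands may skip it only
// when all three conditions from the description hold.
inline bool needsEvent(const Submission &S) {
  if (S.IsKernel)
    return true;
  const bool CanSkip =
      S.BypassConditionHolds && S.QueueSupportsDiscard && !S.EventRequested;
  return !CanSkip;
}
```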
…l. (intel#20482) Initially, the sycl-linker-wrapper-win.cpp test was introduced to check .exe extensions. This change adapts the sycl-linker-wrapper.cpp test to handle any extension. Also, the sycl-post-link-options-win.cpp test is removed because its checks are repeated in the sycl-post-link-options.cpp test. Fixes: intel#17754
---------
Co-authored-by: Alexey Bader <[email protected]>
This reverts commit ebcf025.
use v2025.07.22 OpenCL headers/loader see intel#20511 for details
LLVM: llvm/llvm-project@2d67cb1 SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@54525b6
…there are no subgraphs
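The PR has no description, so the following is only a guess at what "pre-allocate node map" in the title means: reserving capacity in the old-to-new node map before copying, so the map does not rehash while nodes are inserted. `Node` and this `duplicateNodes` are hypothetical and much simpler than the graph implementation in the repository.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical graph node with raw-pointer edges to its successors.
struct Node {
  int Id = 0;
  std::vector<Node *> Successors;
};

// Copies every node and remaps edges to point at the copies. Assumes every
// successor of a node in Src is itself in Src.
inline std::vector<std::shared_ptr<Node>>
duplicateNodes(const std::vector<std::shared_ptr<Node>> &Src) {
  std::unordered_map<const Node *, std::shared_ptr<Node>> NodeMap;
  NodeMap.reserve(Src.size()); // pre-allocate: one bucket array, no rehashing

  std::vector<std::shared_ptr<Node>> Out;
  Out.reserve(Src.size());
  for (const std::shared_ptr<Node> &N : Src) {
    std::shared_ptr<Node> Copy = std::make_shared<Node>(*N);
    NodeMap.emplace(N.get(), Copy);
    Out.push_back(std::move(Copy));
  }

  // Second pass: replace each edge to an original node with its copy.
  for (const std::shared_ptr<Node> &Copy : Out)
    for (Node *&Succ : Copy->Successors)
      Succ = NodeMap.at(Succ).get();
  return Out;
}
```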
a371146 to 87af057
No description provided.