Open
Description
(This example comes from looking at Rust's discriminant code from https://rust.godbolt.org/z/1138aGqdb)
Take this input IR:
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr {
start:
%0 = sub i8 %x, 2
%1 = zext i8 %0 to i64
%2 = icmp ule i8 %0, 2
%3 = add i64 %1, 1
%_2 = select i1 %2, i64 %3, i64 0
%_0 = icmp eq i64 %_2, 0
ret i1 %_0
}
Today, LLVM does simplify it a bunch, getting it down to https://llvm.godbolt.org/z/G7as7Y6o5
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
%0 = add nsw i8 %x, -5
%1 = icmp ult i8 %0, -3
ret i1 %1
}
It could do better, though. The range information was used to determine the nsw, but that doesn't really help the unsigned icmp.
Specifically, the range restriction is enough that it'd be allowed to be just https://alive2.llvm.org/ce/z/fkVEwL
define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
%1 = icmp slt i8 %x, 2
ret i1 %1
}
Eliminating the need for the add altogether.
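As a quick sanity check outside of Alive2, the three forms above can be modeled in plain Python and compared over every value allowed by range(i8 -1, 5). This is only a sketch: the helper names (to_u8, original, current, desired, forms_agree) are made up for illustration and are not part of LLVM.

```python
def to_u8(v):
    """Reinterpret an integer as an unsigned 8-bit value (wrapping)."""
    return v & 0xFF

def original(x):
    # %0 = sub i8 %x, 2 ; %1 = zext i8 %0 to i64
    t = to_u8(x - 2)
    # %2 = icmp ule i8 %0, 2 ; %3 = add i64 %1, 1 ; select
    selected = t + 1 if t <= 2 else 0
    # %_0 = icmp eq i64 %_2, 0
    return selected == 0

def current(x):
    # %0 = add nsw i8 %x, -5 ; %1 = icmp ult i8 %0, -3
    return to_u8(x - 5) < to_u8(-3)

def desired(x):
    # %1 = icmp slt i8 %x, 2
    return x < 2

def forms_agree(lo=-1, hi=5):
    """True if all three forms agree for every x in [lo, hi)."""
    return all(original(x) == current(x) == desired(x) for x in range(lo, hi))
```

Under these assumptions, forms_agree() holds on the annotated range, while forms_agree(-128, 128) does not (for example at x = 5), which is why the range attribute is what makes the shorter form legal.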
Activity
scottmcm commented on Apr 2, 2025
Come to think of it, another way to get to the right place would be to look at the
And use the range information to notice %0 is strictly negative, and thus change it to https://alive2.llvm.org/ce/z/tWujha
Which would then InstCombine down to the desired icmp slt i8 %x, 2: https://llvm.godbolt.org/z/ET7Kx4nr5
dtcxzyw commented on Apr 2, 2025
We can split this into two PRs:
1. Infer samesign using range info. However, I don't recommend doing this in InstCombine, as it has been handled by CVP.
2. Fold icmp samesign ult (x +nsw -5), -3 -> icmp slt x, 2 in InstCombinerImpl::foldICmpAddConstant.
llvmbot commented on Apr 2, 2025
Hi!
This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
Run git clang-format HEAD~1 to format your changes.
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.
llvmbot commented on Apr 2, 2025
@llvm/issue-subscribers-good-first-issue
Author: None (scottmcm)
RAJAGOPALAN-GANGADHARAN commented on Apr 2, 2025
@scottmcm @dtcxzyw This looks like an interesting issue, can I give it a shot? Thanks!
dtcxzyw commented on Apr 2, 2025
@RAJAGOPALAN-GANGADHARAN Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting a patch. Good luck :)
RAJAGOPALAN-GANGADHARAN commented on Apr 2, 2025
@dtcxzyw Thanks for assigning the issue. I spent a couple of hours trying to understand what's going on, and I have some questions regarding the first part:
I can see that samesign is added only at O2 or higher, while the issue used O1. Is the expectation to make this work at O1? From what I understand, that is not the right expectation, because all the cmp optimizations are done at the O2 level.
scottmcm commented on Apr 2, 2025
Ah, the -O1 here is my mistake. I'd been using that to avoid vectorization in #134024, but I'm fine if this is only fixed at O2.
That said, if I take the repro and change it to O3, I do see that it gets the samesign https://llvm.godbolt.org/z/Pj1W6vfM8 but it still doesn't simplify down to just an icmp -- the add is still there.
~~So maybe it's an awkward phase ordering problem? :/~~
Hmm, there is an InstCombine after CVP https://llvm.godbolt.org/z/xdjMGj81c. But somehow it is not already doing this, even though in https://llvm.godbolt.org/z/ET7Kx4nr5 it's InstCombine that removes the add.
I have no idea what is going on here.
Oh, I failed to read my own example. Right, InstCombine does fix it when it's slt, but it's not made into slt yet.
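For readers following along, the two-step route discussed in this thread (first treat the unsigned compare as a signed one because both operands have the same sign, then drop the add) can be spot-checked with a small Python model. This is a sketch with hypothetical helper names; it brute-forces the values in range(i8 -1, 5) and does not reflect how CVP or InstCombine are actually implemented.

```python
def to_u8(v):
    """Reinterpret an integer as an unsigned 8-bit value (wrapping)."""
    return v & 0xFF

def same_sign_operands(lo=-1, hi=5):
    # Step 0: for x in [lo, hi), x - 5 stays inside i8 and is negative,
    # just like the constant -3, so both icmp operands share a sign.
    return all(-128 <= x - 5 < 0 for x in range(lo, hi))

def ult_matches_slt(lo=-1, hi=5):
    # Step 1: with same-sign operands, icmp ult and icmp slt agree.
    return all((to_u8(x - 5) < to_u8(-3)) == (x - 5 < -3)
               for x in range(lo, hi))

def slt_folds_to_x_lt_2(lo=-1, hi=5):
    # Step 2: since the add cannot overflow (nsw), x - 5 <s -3 is x <s 2.
    return all((x - 5 < -3) == (x < 2) for x in range(lo, hi))
```

All three checks hold for the annotated range. Widening the range to include x = 5 breaks the first two, since x - 5 is then 0 and no longer shares a sign with -3.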