Use range parameter attributes to fold sub+icmp u* into icmp s* #134028

@scottmcm

Description

(This example comes from looking at Rust's discriminant code from https://rust.godbolt.org/z/1138aGqdb)

Take this input IR:

define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr {
start:
  %0 = sub i8 %x, 2
  %1 = zext i8 %0 to i64
  %2 = icmp ule i8 %0, 2
  %3 = add i64 %1, 1
  %_2 = select i1 %2, i64 %3, i64 0
  %_0 = icmp eq i64 %_2, 0
  ret i1 %_0
}

Today, LLVM does simplify it a bunch, getting it down to https://llvm.godbolt.org/z/G7as7Y6o5

define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
  %0 = add nsw i8 %x, -5
  %1 = icmp ult i8 %0, -3
  ret i1 %1
}

It could do better, though. The range information was used to determine the nsw, but that doesn't really help the unsigned icmp.

Specifically, the range restriction is enough that it'd be allowed to be just https://alive2.llvm.org/ce/z/fkVEwL

define noundef zeroext i1 @is_foo(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %1 = icmp slt i8 %x, 2
  ret i1 %1
}

Eliminating the need for the add altogether.
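For anyone who wants to check the claim without opening the Alive2 link, here is a rough brute-force sketch in Python. It models the three versions of `is_foo` on 8-bit values; the helper names (`to_u8`, `is_foo_original`, `is_foo_current`, `is_foo_desired`, `PARAM_RANGE`) are made up for this illustration and are not part of LLVM.

```python
def to_u8(v: int) -> int:
    """Reinterpret an integer as an unsigned 8-bit value (wraps like i8)."""
    return v & 0xFF


def is_foo_original(x: int) -> bool:
    """Mirror of the input IR; x is treated as a signed i8 value."""
    t0 = to_u8(x - 2)          # %0 = sub i8 %x, 2
    t1 = t0                    # %1 = zext i8 %0 to i64
    t2 = t0 <= 2               # %2 = icmp ule i8 %0, 2
    t3 = t1 + 1                # %3 = add i64 %1, 1
    sel = t3 if t2 else 0      # %_2 = select i1 %2, i64 %3, i64 0
    return sel == 0            # %_0 = icmp eq i64 %_2, 0


def is_foo_current(x: int) -> bool:
    """Mirror of what LLVM produces today."""
    t0 = to_u8(x - 5)          # %0 = add nsw i8 %x, -5
    return t0 < to_u8(-3)      # %1 = icmp ult i8 %0, -3


def is_foo_desired(x: int) -> bool:
    """Mirror of the proposed output."""
    return x < 2               # %1 = icmp slt i8 %x, 2


# The parameter attribute range(i8 -1, 5) is half-open: -1 <= x < 5.
PARAM_RANGE = range(-1, 5)

# All three agree on every value the range attribute allows.
agree_in_range = all(
    is_foo_original(x) == is_foo_current(x) == is_foo_desired(x)
    for x in PARAM_RANGE
)

# Over the full i8 domain, the signed compare differs from the original,
# so the range attribute really is what justifies the rewrite.
mismatches = [
    x for x in range(-128, 128) if is_foo_original(x) != is_foo_desired(x)
]
```

In this model the three functions agree on the whole annotated range, and the only disagreements between the original and `icmp slt` are for inputs the `range` attribute rules out.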

Activity

scottmcm commented on Apr 2, 2025

@scottmcm
Author

Come to think of it, another way to get to the right place would be to look at the current output:

define noundef zeroext i1 @src(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %0 = add nsw i8 %x, -5
  %1 = icmp ult i8 %0, -3
  ret i1 %1
}

And use the range information to notice %0 is strictly negative, and thus change it to https://alive2.llvm.org/ce/z/tWujha

define noundef zeroext i1 @tgt(i8 noundef range(i8 -1, 5) %x) unnamed_addr #0 {
start:
  %0 = add nsw i8 %x, -5
  %1 = icmp samesign slt i8 %0, -3
  ret i1 %1
}

Which would then InstCombine down to the desired icmp slt i8 %x, 2: https://llvm.godbolt.org/z/ET7Kx4nr5
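To spell out the arithmetic behind that step, here is a small Python sketch (the helper names `to_u8` and `samesign_step_table` are hypothetical, just for illustration). For every x in the annotated range it records x - 5, the unsigned comparison against -3, the signed comparison against -3, and x < 2.

```python
def to_u8(v: int) -> int:
    """Reinterpret an integer as an unsigned 8-bit value."""
    return v & 0xFF


def samesign_step_table(lo: int = -1, hi: int = 5):
    """Rows of (x, x - 5, ult result, slt result, x < 2) for lo <= x < hi."""
    rows = []
    for x in range(lo, hi):
        t0 = x - 5                           # add nsw i8 %x, -5 (no wrap here)
        as_unsigned = to_u8(t0) < to_u8(-3)  # icmp ult i8 %0, -3
        as_signed = t0 < -3                  # icmp slt i8 %0, -3
        rows.append((x, t0, as_unsigned, as_signed, x < 2))
    return rows


table = samesign_step_table()
```

Since x is in [-1, 5), x - 5 lands in [-6, 0), so both operands of the compare are negative. Two values with the same sign order the same way under signed and unsigned comparison, which is why the predicate can be switched, and adding 5 to both sides of the signed compare gives x < 2.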

dtcxzyw commented on Apr 2, 2025

@dtcxzyw
Member

We can split this into two PRs:

  1. Infer samesign using range info. However, I don't recommend doing this in InstCombine, as it has been handled by CVP.
  2. Handle icmp samesign ult (x +nsw -5), -3 -> icmp slt x, 2 in InstCombinerImpl::foldICmpAddConstant.
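
As a sanity check on the second item, here is a rough brute-force sketch in Python of a generalized form of the fold: icmp samesign ult (x +nsw C1), C2 -> icmp slt x, (C2 - C1). The generalization, the precondition that C2 - C1 fits in i8, and the names `fold_is_sound` and `assume_samesign` are assumptions made for this sketch; they do not describe what foldICmpAddConstant actually implements.

```python
I8_MIN, I8_MAX = -128, 127


def to_u8(v: int) -> int:
    """Reinterpret an integer as an unsigned 8-bit value."""
    return v & 0xFF


def fold_is_sound(c1: int, c2: int, assume_samesign: bool = True) -> bool:
    """Exhaustively compare both sides of the fold for every i8 x.

    Inputs where the nsw add would wrap, or where samesign would be
    violated, make the original compare poison, so they are skipped.
    """
    new_c = c2 - c1
    if not I8_MIN <= new_c <= I8_MAX:
        return False  # this sketch declines when the new constant overflows
    for x in range(I8_MIN, I8_MAX + 1):
        s = x + c1
        if not I8_MIN <= s <= I8_MAX:
            continue  # nsw violated: poison, any result is acceptable
        if assume_samesign and (s < 0) != (c2 < 0):
            continue  # samesign violated: poison, any result is acceptable
        before = to_u8(s) < to_u8(c2)  # icmp ult (x + c1), c2
        after = x < new_c              # icmp slt x, c2 - c1
        if before != after:
            return False
    return True
```

With the constants from this issue, `fold_is_sound(-5, -3)` holds, while `fold_is_sound(-5, -3, assume_samesign=False)` does not (x = 5 is a counterexample), which is why the samesign inference in the first item has to happen before this fold can apply.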

llvmbot commented on Apr 2, 2025

@llvmbot
Member

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

RAJAGOPALAN-GANGADHARAN commented on Apr 2, 2025

@RAJAGOPALAN-GANGADHARAN

@scottmcm @dtcxzyw This looks like an interesting issue, can I give it a shot? Thanks!

dtcxzyw commented on Apr 2, 2025

@dtcxzyw
Member

@RAJAGOPALAN-GANGADHARAN Please read https://llvm.org/docs/InstCombineContributorGuide.html before submitting a patch. Good luck :)

RAJAGOPALAN-GANGADHARAN commented on Apr 2, 2025

@RAJAGOPALAN-GANGADHARAN

@dtcxzyw Thanks for assigning the issue. I spent a couple of hours trying to understand what's going on, and I have some questions about the first part:

> Infer samesign using range info. However, I don't recommend doing this in InstCombine, as it has been handled by CVP.

I can see that samesign is added only at O2 or higher, while the issue used O1. Is the expectation to make this work at O1? From what I understand, that is not the right expectation, because all the cmp optimizations are done at the O2 level.

scottmcm commented on Apr 2, 2025

@scottmcm
Author

Ah, the -O1 here is my mistake. I'd been using that to avoid vectorization in #134024, but I'm fine if this is only fixed at O2.

That said, if I take the repro and change it to O3, I do see that it gets the samesign https://llvm.godbolt.org/z/Pj1W6vfM8 but it still doesn't simplify down to just an icmp -- the add is still there.

~~So maybe it's an awkward phase ordering problem? :/~~

Hmm, there is an InstCombine after CVP https://llvm.godbolt.org/z/xdjMGj81c. But somehow it is not already doing this, even though in https://llvm.godbolt.org/z/ET7Kx4nr5 it's InstCombine that removes the add.

I have no idea what is going on here.

Oh, I failed to read my own example. Right, InstCombine does fix it when it's slt, but it's not made into slt yet.
