Closed
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-mir-opt: Area: MIR optimizations
E-needs-test: Call for participation: An issue has been fixed and does not reproduce, but no test has been added.
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
Description
I'm trying to figure out why certain iterator-based algorithms get const-optimized while others don't, and I stumbled upon this case.
example (updated 2021.03.05)
Compiling fn1 results in the expected machine code, but fn2 has extra stack reserve/free instructions that are not necessary, since the function's contents are optimized away completely. The reserved stack size equals the size of the array.
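The linked example is not reproduced on this page. As a rough illustration only, the sketch below shows two hypothetical functions (`fn1_sketch` and `fn2_sketch` are made-up names, not the original code) that both iterate over a local array and return a value that is known at compile time. Whether a given compiler version folds either one to a constant, or still reserves stack space for the array, is exactly what the issue is about and is not asserted here.

```rust
/// Hypothetical stand-in for a function that iterates over a local array
/// with a standard adapter. The result (100) is fixed at compile time.
pub fn fn1_sketch() -> u32 {
    let array = [10u32, 20, 30, 40];
    array.iter().sum()
}

/// Hypothetical stand-in for a hand-written loop over the same array.
/// The iterator is advanced twice per loop iteration, so the trip count
/// is less obvious to the optimizer, although the result is still fixed.
pub fn fn2_sketch() -> u32 {
    let array = [10u32, 20, 30, 40];
    let mut iter = array.iter();
    let mut sum = 0;
    // Skip one element in the loop condition, add the next one in the body.
    while let Some(_) = iter.next() {
        sum += iter.next().map_or(0, |&x| x);
    }
    sum
}

fn main() {
    println!("{} {}", fn1_sketch(), fn2_sketch());
}
```

To compare the generated machine code, each function can be compiled with optimizations enabled and inspected in a tool such as Compiler Explorer for the target of interest.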
tesuji commented on Aug 28, 2020
Two observations:
- `-O` in this case optimizes the code better than `-C opt-level=3`.
- Adding `&` to the array in fn2 also makes it optimizable.
nikic commented on Sep 20, 2020
This would very likely be fixed by https://reviews.llvm.org/D87972.
bugadani commented on Mar 5, 2021
Fixed by #81451
mati865 commented on Mar 5, 2021
There should be a codegen test for this issue.
bugadani commented on Mar 5, 2021
Yeah, right. I can add those for the issues I closed.
bugadani commented on Mar 5, 2021
Actually, I'm reopening this one, as the issue still stands on ARM: https://rust.godbolt.org/z/6fEdoq
nikic commented on Mar 5, 2021
Looks like a phase ordering problem. The remaining IR would be eliminated by GVN + InstCombine. Currently it only gets eliminated by DAGCombine, thus you still see the prologue/epilogue on ARM.
nikic commented on Mar 7, 2021
The problem here is that this loop only gets unrolled during full loop unrolling, while we need it to be unrolled during simple unrolling to still have a chance to optimize. For the IR at that point, SCEV cannot determine the loop trip count:
nikic commented on Apr 3, 2023
Fixed by the LLVM 16 upgrade.
Add codegen tests for issues fixed by LLVM 16
Rollup merge of rust-lang#109895 - nikic:llvm-16-tests, r=cuviper
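For readers unfamiliar with codegen tests, the following is a minimal sketch of the general shape such a test can take: a small function compiled with optimizations, with FileCheck-style `CHECK` comments describing the expected LLVM IR. It is an illustration under stated assumptions and not the test that was merged. The function name `folded_sum`, the array contents, and the exact header directive spelling are all assumptions.

```rust
// Sketch of a codegen-style test. The directive below is written in the
// older comment form; its exact spelling depends on the compiletest
// version in use (assumption).
// compile-flags: -O

// In the real test suite the function would also carry `#[no_mangle]` so
// that CHECK-LABEL can match the unmangled symbol. It is left off here so
// the snippet compiles unchanged on any edition.
pub fn folded_sum() -> u32 {
    // The CHECK lines state the expectation that the whole body folds to
    // a single constant return, with no stack manipulation left behind.
    // CHECK-LABEL: @folded_sum(
    // CHECK: ret i32 13
    let s = [1u32, 2, 3, 4, 5, 6, 7];

    let mut iter = s.iter();
    let mut sum = 0;
    // One element is consumed by the loop condition and the next one is
    // added in the body; the final iteration falls back to 1.
    while let Some(_) = iter.next() {
        sum += iter.next().map_or(1, |&x| x);
    }

    sum
}

fn main() {
    println!("{}", folded_sum());
}
```

When run as an ordinary program the `CHECK` comments are inert; they only take effect when the file is processed by the compiler's test harness together with FileCheck.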