[Clang][Sema] Reject array prvalue operands #140702
@llvm/pr-subscribers-clang
Author: None (languagelawyer)
Changes
of unary + and *, and binary + and - operators
Fixes #54016
Full diff: https://github.com/llvm/llvm-project/pull/140702.diff
2 Files Affected:
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index d1889100c382e..c0fa9bc9e895e 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -11333,6 +11333,11 @@ QualType Sema::CheckAdditionOperands(ExprResult &LHS, ExprResult &RHS,
if (!IExp->getType()->isIntegerType())
return InvalidOperands(Loc, LHS, RHS);
+ if (OriginalOperand Orig(PExp); Orig.getType()->isArrayType() && Orig.Orig->isPRValue()) {
+ Diag(Loc, diag::err_typecheck_array_prvalue_operand) << PExp->getSourceRange();
+ return QualType();
+ }
+
// Adding to a null pointer results in undefined behavior.
if (PExp->IgnoreParenCasts()->isNullPointerConstant(
Context, Expr::NPC_ValueDependentIsNotNull)) {
@@ -11429,6 +11434,16 @@ QualType Sema::CheckSubtractionOperands(ExprResult &LHS, ExprResult &RHS,
return compType;
}
+ OriginalOperand OrigLHS(LHS.get()), OrigRHS(RHS.get());
+ bool LHSArrayPRV = OrigLHS.getType()->isArrayType() && OrigLHS.Orig->isPRValue();
+ bool RHSArrayPRV = OrigRHS.getType()->isArrayType() && OrigRHS.Orig->isPRValue();
+ if (LHSArrayPRV || RHSArrayPRV) {
+ auto&& diag = Diag(Loc, diag::err_typecheck_array_prvalue_operand);
+ if (LHSArrayPRV) diag << LHS.get()->getSourceRange();
+ if (RHSArrayPRV) diag << RHS.get()->getSourceRange();
+ return QualType();
+ }
+
// Either ptr - int or ptr - ptr.
if (LHS.get()->getType()->isAnyPointerType()) {
QualType lpointee = LHS.get()->getType()->getPointeeType();
@@ -15840,6 +15855,12 @@ ExprResult Sema::CreateBuiltinUnaryOp(SourceLocation OpLoc,
InputExpr->getType()->isSpecificBuiltinType(BuiltinType::Dependent)) {
resultType = Context.DependentTy;
} else {
+ if (Opc == UO_Deref || Opc == UO_Plus) {
+ if (auto *expr = Input.get(); expr->getType()->isArrayType() && expr->isPRValue()) {
+ Diag(OpLoc, diag::err_typecheck_array_prvalue_operand) << expr->getSourceRange();
+ return ExprError();
+ }
+ }
switch (Opc) {
case UO_PreInc:
case UO_PreDec:
diff --git a/clang/test/CXX/expr/p8.cpp b/clang/test/CXX/expr/p8.cpp
index 471d1c5a30206..f736b88b3db09 100644
--- a/clang/test/CXX/expr/p8.cpp
+++ b/clang/test/CXX/expr/p8.cpp
@@ -1,5 +1,4 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// expected-no-diagnostics
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++11
int a0;
const volatile int a1 = 2;
@@ -16,4 +15,13 @@ int main()
f0(a1);
f1(a2);
f2(a3);
+
+ using IA = int[];
+ void(+IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+ void(*IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+ void(IA{ 1, 2, 3 } + 0); // expected-error {{array prvalue}}
+ void(IA{ 1, 2, 3 } - 0); // expected-error {{array prvalue}}
+ void(0 + IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+ void(0 - IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+ void(IA{ 1, 2, 3 } - IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
}
✅ With the latest revision this PR passed the C/C++ code formatter.
This is related to CWG2548. CC @Endilll
Kinda, but not really. It has never been necessary to create that (non-)issue. Or does someone really believe that it is a C++ defect that
Is this going to break behavior in C? https://godbolt.org/z/PsPPs8ov1
Array prvalues are a C++11-and-later-only thing. Compound literals are lvalues in C: https://port70.net/~nsz/c/c11/n1570.html#6.5.2.5p4
$ build/bin/clang -fsyntax-only test.c
test.c:2:3: warning: expression result unused [-Wunused-value]
2 | *((int []){ 1, 2, 3});
| ^~~~~~~~~~~~~~~~~~~~~
test.c:3:24: warning: expression result unused [-Wunused-value]
3 | ((int []){ 1, 2, 3}) + 0;
| ~~~~~~~~~~~~~~~~~~~~ ^ ~
2 warnings generated.
However, it "breaks" the compound literal C++ extension:
but i.g. this is OK/expected
Ah, good point, I wasn't remembering that C and C++ are different in how they handle compound literals. And you can return a pointer to an array in C, but that's still an lvalue.
Thank you!
That probably should continue to work -- we accept it today and so does GCC. It's a bit of an oddity, to be sure. But I don't see why it should be rejected either.
GCC started accepting it from version 14, likely when fixing https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94264 (it is mentioned in #54016 (comment)), but it also started accepting
I think, if someone wants 100% C compatibility, then they should make compound literals lvalues. If compound literals are prvalues in the C++ extension, then they should be treated like other array prvalues
Ugh, I was backwards again. They're prvalues in C++ already, not lvalues as in C, so yes, it should be rejected in C++ with your changes. Sorry for the confusion!
So reading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53220; it was decided to treat
@pinskia the thing is, GCC started accepting ill-formed code right when it was reaffirmed it is ill-formed |
@@ -16,4 +15,13 @@ int main()
f0(a1);
f1(a2);
f2(a3);
+
+ using IA = int[];
+ void(+IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
Can you add some tests here to make sure that we don’t complain about array glvalues?
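A minimal sketch of what such glvalue coverage could exercise, written as ordinary runnable C++ rather than in the `-verify` style of the test file. The helper name `array_glvalues_still_decay` is hypothetical and is not part of the PR. It assumes that array lvalues and xvalues (unlike array prvalues) still undergo array-to-pointer conversion for unary + and *, and binary + and -.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not from the PR): returns true when every array
// *glvalue* operand below is accepted and decays to a pointer as usual.
inline bool array_glvalues_still_decay() {
    int arr[3] = {1, 2, 3};

    // lvalue operands
    int *first = +arr;                    // unary + on an array lvalue
    bool ok = first == &arr[0];
    ok = ok && *arr == 1;                 // unary * on an array lvalue
    ok = ok && *(arr + 1) == 2;           // array + int
    ok = ok && *(1 + arr) == 2;           // int + array
    ok = ok && (arr - 0) == first;        // array - int
    ok = ok && (arr + 2) - arr == 2;      // pointer - array

    // xvalue operands: std::move(arr) has type int(&&)[3]
    ok = ok && +std::move(arr) == first;
    ok = ok && *std::move(arr) == 1;
    ok = ok && *(std::move(arr) + 2) == 3;
    return ok;
}
```

In the actual test file the equivalent lines would presumably just be added without an `expected-error` annotation, so `-verify` fails if a diagnostic appears.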
@@ -7639,6 +7639,8 @@ def warn_param_mismatched_alignment : Warning<

def err_objc_object_assignment : Error<
"cannot assign to class object (%0 invalid)">;
+def err_typecheck_array_prvalue_operand : Error<
+ "array prvalue is not permitted">;
- "array prvalue is not permitted">;
+ "operand of '%0' cannot be an array prvalue">;
I think it’s a bit clearer if we phrase it like this and also print out the operator rather than just saying ‘X is not permitted’.
But the operator is pointed at by ^, like
test.cxx:13:23: error: array prvalue is not permitted
13 | ((int []){ 1, 2, 3}) + 0;
| ~~~~~~~~~~~~~~~~~~~~ ^
Hmm, in that case, maybe just ‘operand cannot be an array prvalue’. I would prefer at least including the word ‘operand’ so it’s clear that the problem is that you’re passing an array prvalue to this operator, because at the moment the diagnostic makes it sound like array prvalues aren’t permitted at all.
But the operator is pointed at by ^
This part can be removed by using the -fno-caret-diagnostics command-line option. In this case, the diagnostic is as follows:
test.cxx:13:23: error: array prvalue is not permitted
Users may get confused.
@zwuis is "operand cannot be an array prvalue" ok? Comparing to an existing error:
IA{ 1, 2, 3 } + 0.;
IA{ 1, 2, 3 } + 0;
test.cxx:5:16: error: invalid operands to binary expression ('int[3]' and 'double')
test.cxx:6:16: error: operand cannot be an array prvalue
if you're coming from a C background
What about Python/Java background?
They're not particularly relevant because you don't directly mix Java and C++ code in the same way you do with C and C++ as happens in header files.
IIRC there's no rvalue conversion on the statement expression result so I think that ends up being a prvalue of array type
This is not correct, statement expressions (their "result") undergo array and function pointer decay (independently of the context where they appear), so you're getting a "dangling pointer"
I was incorrectly remembering the behavior from returning a char and not having it promote to an int. I can confirm we decay the array: https://godbolt.org/z/zhW1fnsej but you can hit the same concern I had via a typedef or using, where the type information is actually slightly helpful in understanding the issue: using foo = int[10]; foo{} + 0; (keeping in mind that foo{} could also be behind a macro where it's not easy for the user to spot the {} and realize there's a temporary involved). So I still find a formulation that includes the type information a bit more user-friendly, but that could be included in a new diagnostic that talks about temporary arrays.
I agree with @AaronBallman and @Sirraide that using err_typecheck_invalid_operands is the better approach here.
The "prvalue" bit is only relevant as far as array decay is concerned (e.g., array-to-pointer does not apply).
But ultimately, int[] + int is what we should diagnose, not how we got here.
keeping in mind that foo{} could also be behind a macro
I doubt that presenting the exact array type vs. just saying "array" would help much in this case. There is a "note: expanded from macro" to point out that something comes from a macro expansion. (Not shown for err_typecheck_invalid_operands, BTW!)
err_typecheck_invalid_operands is the better approach here
int main()
{
    using IA = int[];
    IA ia = { 1, 2, 3 };
    ia + 0.; // error: invalid operands to binary expression ('int[3]' and 'double')
             // Where is the problem? in 'int[3]' or in 'double'?
    ia + 0; // no error means the issue was in 'double'?
    IA{ 1, 2, 3 } + 0; // error: invalid operands to binary expression ('int[3]' and 'int')
                       // Huh? What about now?
}
But ultimately, int[] + int is what we should diagnose
The types are not the issue, why give misleading diagnostics?
invalid operands to binary expression
BTW, I think it is either "in binary expression" or "to binary operator"
invalid operands to binary expression
BTW, I think it is either "in binary expression" or "to binary operator"
‘an operand to sth.’ is a common turn of phrase that’s been around for a long time (not just in Clang).
@@ -15840,6 +15859,11 @@ ExprResult Sema::CreateBuiltinUnaryOp(SourceLocation OpLoc,
InputExpr->getType()->isSpecificBuiltinType(BuiltinType::Dependent)) {
resultType = Context.DependentTy;
} else {
+ if (Opc == UO_Deref || Opc == UO_Plus) {
It feels a bit weird to use an if statement here when there’s a switch statement on the same variable right after. I’d probably make this a lambda and then call it in the cases for * and + below (the + case would then also need a [[fallthrough]] annotation).
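A rough, self-contained sketch of the shape this suggestion describes. It does not use Clang's real types: `UnaryOp`, `Operand`, and `passes_array_prvalue_check` are hypothetical stand-ins. It assumes, as the comment implies, that the + case shares its handling with the case after it. `[[fallthrough]]` requires C++17.

```cpp
#include <cassert>

// Hypothetical stand-ins for UnaryOperatorKind and the operand expression.
enum class UnaryOp { Plus, Minus, Deref, Not };
struct Operand {
    bool is_array;
    bool is_prvalue;
};

// Models only the array-prvalue check, not the rest of unary type checking.
inline bool passes_array_prvalue_check(UnaryOp op, const Operand &in) {
    // The check lives in one lambda and is invoked from the relevant cases,
    // instead of an `if (op == ...)` placed ahead of the switch.
    auto is_array_prvalue = [&in] { return in.is_array && in.is_prvalue; };

    switch (op) {
    case UnaryOp::Deref:
        if (is_array_prvalue())
            return false;  // the real code would emit a diagnostic here
        break;
    case UnaryOp::Plus:
        if (is_array_prvalue())
            return false;
        [[fallthrough]];   // + now has a body, so the fallthrough is explicit
    case UnaryOp::Minus:
        // shared handling for + and - would go here
        break;
    case UnaryOp::Not:
        break;
    }
    return true;
}
```

The design keeps the rejection logic next to the operators it applies to, at the cost of an explicit fallthrough annotation on the + case.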
I think it is CWG2548; that's what made it clear that array-to-pointer decay does not happen here despite that being inconsistent with the rest of the language (cplusplus/papers#1633)
CWG2548 is closed as NAD, with no changes to the wording. The fact that array-to-pointer decay does not happen has always been clear
I'd say CWG hallucinated an inconsistency; to me there is none
Correct.
Incorrect; the reason the core issue was filed in the first place was because it seemed like an oversight that the language was inconsistent. Clang was implementing the consistent behavior where array to pointer decay happened. That's why this is CWG2548.
So definitely inconsistent, and EWG agreed that it was, but liked the accidental behavior that came from it.
Indeed, it only seemed.
Clang is just cutting corners tryna parse C and C++ by the same code using as few lines as possible, so it applied incorrect conversion by batching/applying all of them at once. I think the only motivation of asking for issue instead of fixing the bug was to keep the implementation simpler.
Not okay under [expr.add] and [expr]/8
Imagine binary
const auto& r1 = IA{1, 2, 3}[0]; // extends lifetime of the temporary object
const auto& r2 = (IA{1, 2, 3} + 0); // does not extend
const auto& r3 = *(IA{1, 2, 3} + 0); // does not extend
const auto& r4 = *IA{1, 2, 3}; // does not extend
how this would be consistent? Quite contrary, this would become inconsistent.
of unary + and *, and binary + and - operators
Fixes #54016