[Clang][Sema] Reject array prvalue operands #140702

languagelawyer · 2025-05-20T10:00:09Z

of unary + and *, and binary + and - operators

github-actions · 2025-05-20T10:00:30Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-05-20T10:01:03Z

@llvm/pr-subscribers-clang

Author: None (languagelawyer)

Changes

of unary + and *, and binary + and - operators

Fixes #54016

Full diff: https://github.com/llvm/llvm-project/pull/140702.diff

2 Files Affected:

(modified) clang/lib/Sema/SemaExpr.cpp (+21)
(modified) clang/test/CXX/expr/p8.cpp (+10-2)

diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index d1889100c382e..c0fa9bc9e895e 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -11333,6 +11333,11 @@ QualType Sema::CheckAdditionOperands(ExprResult &LHS, ExprResult &RHS,
   if (!IExp->getType()->isIntegerType())
     return InvalidOperands(Loc, LHS, RHS);
 
+  if (OriginalOperand Orig(PExp); Orig.getType()->isArrayType() && Orig.Orig->isPRValue()) {
+    Diag(Loc, diag::err_typecheck_array_prvalue_operand) << PExp->getSourceRange();
+    return QualType();
+  }
+
   // Adding to a null pointer results in undefined behavior.
   if (PExp->IgnoreParenCasts()->isNullPointerConstant(
           Context, Expr::NPC_ValueDependentIsNotNull)) {
@@ -11429,6 +11434,16 @@ QualType Sema::CheckSubtractionOperands(ExprResult &LHS, ExprResult &RHS,
     return compType;
   }
 
+  OriginalOperand OrigLHS(LHS.get()), OrigRHS(RHS.get());
+  bool LHSArrayPRV = OrigLHS.getType()->isArrayType() && OrigLHS.Orig->isPRValue();
+  bool RHSArrayPRV = OrigRHS.getType()->isArrayType() && OrigRHS.Orig->isPRValue();
+  if (LHSArrayPRV || RHSArrayPRV) {
+    auto&& diag = Diag(Loc, diag::err_typecheck_array_prvalue_operand);
+    if (LHSArrayPRV) diag << LHS.get()->getSourceRange();
+    if (RHSArrayPRV) diag << RHS.get()->getSourceRange();
+    return QualType();
+  }
+
   // Either ptr - int   or   ptr - ptr.
   if (LHS.get()->getType()->isAnyPointerType()) {
     QualType lpointee = LHS.get()->getType()->getPointeeType();
@@ -15840,6 +15855,12 @@ ExprResult Sema::CreateBuiltinUnaryOp(SourceLocation OpLoc,
       InputExpr->getType()->isSpecificBuiltinType(BuiltinType::Dependent)) {
     resultType = Context.DependentTy;
   } else {
+    if (Opc == UO_Deref || Opc == UO_Plus) {
+      if (auto *expr = Input.get(); expr->getType()->isArrayType() && expr->isPRValue()) {
+        Diag(OpLoc, diag::err_typecheck_array_prvalue_operand) << expr->getSourceRange();
+        return ExprError();
+      }
+    }
     switch (Opc) {
     case UO_PreInc:
     case UO_PreDec:
diff --git a/clang/test/CXX/expr/p8.cpp b/clang/test/CXX/expr/p8.cpp
index 471d1c5a30206..f736b88b3db09 100644
--- a/clang/test/CXX/expr/p8.cpp
+++ b/clang/test/CXX/expr/p8.cpp
@@ -1,5 +1,4 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// expected-no-diagnostics
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++11
 
 int a0;
 const volatile int a1 = 2;
@@ -16,4 +15,13 @@ int main()
   f0(a1);
   f1(a2);
   f2(a3);
+
+  using IA = int[];
+  void(+IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+  void(*IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+  void(IA{ 1, 2, 3 } + 0); // expected-error {{array prvalue}}
+  void(IA{ 1, 2, 3 } - 0); // expected-error {{array prvalue}}
+  void(0 + IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+  void(0 - IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
+  void(IA{ 1, 2, 3 } - IA{ 1, 2, 3 }); // expected-error {{array prvalue}}
 }

github-actions · 2025-05-20T10:03:14Z

✅ With the latest revision this PR passed the C/C++ code formatter.

zwuis · 2025-05-20T10:49:31Z

This is related to CWG2548. CC @Endilll

languagelawyer · 2025-05-20T10:53:40Z

This is related to CWG2548

Kinda, but not really. It has never been necessary to create that (non-)issue. Or does someone really believe that it is a C++ defect that IA{ 1, 2, 3 } - IA{ 1, 2, 3 } is not allowed? So, it is just a bug since C++11

AaronBallman

Is this going to break behavior in C? https://godbolt.org/z/PsPPs8ov1

languagelawyer · 2025-05-20T13:47:36Z

Is this going to break behavior in C?

Array prvalues are a C++11+ exclusive thing. Compound literals are lvalues in C: https://port70.net/~nsz/c/c11/n1570.html#6.5.2.5p4
Anyway, here is a check:

$ build/bin/clang -fsyntax-only test.c
test.c:2:3: warning: expression result unused [-Wunused-value]
    2 |   *((int []){ 1, 2, 3});
      |   ^~~~~~~~~~~~~~~~~~~~~
test.c:3:24: warning: expression result unused [-Wunused-value]
    3 |   ((int []){ 1, 2, 3}) + 0;
      |   ~~~~~~~~~~~~~~~~~~~~ ^ ~
2 warnings generated.

However, it "breaks" compound literal C++ extension:

$ build/bin/clang++ -fsyntax-only test.cxx 
test.cxx:12:2: error: array prvalue is not permitted
   12 |         *((int []){ 1, 2, 3});
      |         ^~~~~~~~~~~~~~~~~~~~~
test.cxx:13:23: error: array prvalue is not permitted
   13 |         ((int []){ 1, 2, 3}) + 0;
      |         ~~~~~~~~~~~~~~~~~~~~ ^
2 errors generated.

but I guess this is OK/expected
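
For readers who want to see the value-category distinction this comment relies on, here is a minimal sketch that is not part of the PR. It assumes only standard C++11 and uses a fixed-bound alias `IA3` (a hypothetical name; the PR's test uses `IA = int[]`) so the array type does not have to be deduced. `decltype((e))` yields `T&` for an lvalue, `T&&` for an xvalue, and plain `T` for a prvalue.

```cpp
#include <type_traits>

using IA3 = int[3];   // hypothetical fixed-bound variant of the PR's `IA`

int ia[3] = {1, 2, 3};

// A named array is an lvalue: decltype((ia)) is int (&)[3].
constexpr bool named_array_is_lvalue =
    std::is_same<decltype((ia)), int (&)[3]>::value;

// A braced functional cast creates an array prvalue: the type has no reference.
constexpr bool array_temporary_is_prvalue =
    std::is_same<decltype((IA3{1, 2, 3})), int[3]>::value;

// Casting to an rvalue reference gives an array xvalue, which is still a glvalue.
constexpr bool cast_result_is_xvalue =
    std::is_same<decltype((static_cast<int (&&)[3]>(ia))), int (&&)[3]>::value;

static_assert(named_array_is_lvalue, "named array should be an lvalue");
static_assert(array_temporary_is_prvalue, "IA3{...} should be a prvalue");
static_assert(cast_result_is_xvalue, "static_cast<T&&> should give an xvalue");
```

Only the second case, the prvalue, is what the new check targets. The compound-literal spelling `(int[]){1, 2, 3}` is left out of the sketch because it is an extension in C++.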

AaronBallman · 2025-05-20T14:12:13Z

Is this going to break behavior in C?

Array prvalues are a C++11+ exclusive thing. Compound literals are lvalues in C

Ah, good point, I wasn't remembering that C and C++ are different in how they handle compound literals. And you can return a pointer to an array in C, but that's still an lvalue.

Anyway, here is a check:

$ build/bin/clang -fsyntax-only test.c
test.c:2:3: warning: expression result unused [-Wunused-value]
    2 |   *((int []){ 1, 2, 3});
      |   ^~~~~~~~~~~~~~~~~~~~~
test.c:3:24: warning: expression result unused [-Wunused-value]
    3 |   ((int []){ 1, 2, 3}) + 0;
      |   ~~~~~~~~~~~~~~~~~~~~ ^ ~
2 warnings generated.

Thank you!

AaronBallman · 2025-05-20T15:51:51Z

However, it "breaks" compound literal C++ extension:

$ build/bin/clang++ -fsyntax-only test.cxx 
test.cxx:12:2: error: array prvalue is not permitted
   12 |         *((int []){ 1, 2, 3});
      |         ^~~~~~~~~~~~~~~~~~~~~
test.cxx:13:23: error: array prvalue is not permitted
   13 |         ((int []){ 1, 2, 3}) + 0;
      |         ~~~~~~~~~~~~~~~~~~~~ ^
2 errors generated.

That probably should continue to work -- we accept it today and so does GCC. It's a bit of an oddity, to be sure. But I don't see why it should be rejected either.

languagelawyer · 2025-05-20T16:07:56Z

That probably should continue to work -- we accept it today and so does GCC

GCC started accepting it from version 14, likely when fixing https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94264 (it is mentioned in #54016 (comment)), but it also started accepting IA{ 1, 2, 3 } + 0, which is clearly wrong https://godbolt.org/z/r7Yc3o1qd @pinskia @jicama

I think, if someone wants 100% C compatibility, then they should make compound literals lvalues. If compound literals are prvalues in C++ extension, then they should be treated like other array prvalues

AaronBallman · 2025-05-20T16:16:50Z

That probably should continue to work -- we accept it today and so does GCC

GCC started accepting it from version 14, likely when fixing https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94264 (it is mentioned in #54016 (comment)), but it also started accepting IA{ 1, 2, 3 } + 0, which is clearly wrong https://godbolt.org/z/r7Yc3o1qd @pinskia @jicama

I think, if someone wants 100% C compatibility, then they should make compound literals lvalues. If compound literals are prvalues in C++ extension, then they should be treated like other array prvalues

Ugh, I was backwards again. They're prvalues in C++ already, not lvalues as in C, so yes, it should be rejected in C++ with your changes. Sorry for the confusion!

pinskia · 2025-05-20T17:27:08Z

Reading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53220, it was decided to treat compound literals exactly the same as prvalue arrays in GCC's C++ mode. Yes, that breaks compatibility with C for the most part.

languagelawyer · 2025-05-20T17:59:53Z

@pinskia the thing is, GCC started accepting ill-formed code right when it was reaffirmed it is ill-formed

Sirraide · 2025-05-21T02:57:01Z

clang/test/CXX/expr/p8.cpp

@@ -16,4 +15,13 @@ int main()
  f0(a1);
  f1(a2);
  f2(a3);
+
+  using IA = int[];
+  void(+IA{ 1, 2, 3 }); // expected-error {{array prvalue}}


Can you add some tests here to make sure that we don’t complain about array glvalues?
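
One possible shape for such coverage, sketched here as ordinary functions rather than as `-verify` test lines (the function names are hypothetical and not from the PR): every operand below is an array glvalue, either an lvalue or an xvalue, so array-to-pointer conversion applies and none of these expressions should trigger the new diagnostic.

```cpp
#include <cstddef>

// Unary * on an array lvalue: the array decays to int*, then is dereferenced.
int deref_lvalue() {
  int a[3] = {1, 2, 3};
  return *a;
}

// Unary + on an array lvalue yields the decayed pointer.
int unary_plus_lvalue() {
  int a[3] = {1, 2, 3};
  return *+a;
}

// Binary + with the array on either side.
int add_lvalue() {
  int a[3] = {1, 2, 3};
  return *(a + 1) + *(1 + a);
}

// Binary - with array lvalues on both sides (pointer difference).
std::ptrdiff_t subtract_lvalues() {
  int a[3] = {1, 2, 3};
  return (a + 2) - a;
}

// An array xvalue is still a glvalue, so it decays as well.
int add_xvalue() {
  int a[3] = {1, 2, 3};
  return *(static_cast<int (&&)[3]>(a) + 2);
}
```

In the actual test file these would more likely appear as `void(...)` expression statements without an `expected-error` comment, next to the prvalue cases.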

Sirraide · 2025-05-21T02:58:47Z

clang/include/clang/Basic/DiagnosticSemaKinds.td

@@ -7639,6 +7639,8 @@ def warn_param_mismatched_alignment : Warning<

 def err_objc_object_assignment : Error<
  "cannot assign to class object (%0 invalid)">;
+def err_typecheck_array_prvalue_operand : Error<
+  "array prvalue is not permitted">;


Suggested change

-  "array prvalue is not permitted">;
+  "operand of '%0' cannot be an array prvalue">;

I think it’s a bit clearer if we phrase it like this and also print out the operator rather than just saying ‘X is not permitted’.

But the operator is pointed at by ^, like

test.cxx:13:23: error: array prvalue is not permitted
   13 |         ((int []){ 1, 2, 3}) + 0;
      |         ~~~~~~~~~~~~~~~~~~~~ ^

Hmm, in that case, maybe just ‘operand cannot be an array prvalue’. I would prefer at least including the word ‘operand’ so it’s clear that the problem is that you’re passing an array prvalue to this operator, because at the moment the diagnostic makes it sound like array prvalues aren’t permitted at all.

But the operator is pointed at by ^

This part can be removed with the -fno-caret-diagnostics command-line option. In that case, the diagnostic reads as follows:

test.cxx:13:23: error: array prvalue is not permitted

Users may get confused.

@zwuis is "operand cannot be an array prvalue" ok? Comparing to an existing error:

IA{ 1, 2, 3 } + 0.;
IA{ 1, 2, 3 } + 0;

test.cxx:5:16: error: invalid operands to binary expression ('int[3]' and 'double')
test.cxx:6:16: error: operand cannot be an array prvalue

if you're coming from a C background
What about Python/Java background?

They're not particularly relevant because you don't directly mix Java and C++ code in the same way you do with C and C++ as happens in header files.

IIRC there's no rvalue conversion on the statement expression result so I think that ends up being a prvalue of array type
This is not correct, statement expressions (their "result") undergo array and function pointer decay (independently of the context where they appear), so you're getting a "dangling pointer"

I was incorrectly remembering the behavior from returning a char and not having it promote to an int. I can confirm we decay the array: https://godbolt.org/z/zhW1fnsej but you can hit the same concern I had via a typedef or using, where the type information is actually slightly helpful in understanding the issue: using foo = int[10]; foo{} + 0; (keeping in mind that foo{} could also be behind a macro where it's not easy for the user to spot the {} and realize there's a temporary involved). So I still find a formulation that includes the type information a bit more user-friendly, but that could be included in a new diagnostic that talks about temporary arrays.

I agree with @AaronBallman and @Sirraide that using err_typecheck_invalid_operands is the better approach here.
The "prvalue" bit is only relevant as far as array decay is concerned (i.e., array-to-pointer conversion does not apply).
But ultimately, int[] + int is what we should diagnose, not how we got here.

keeping in mind that foo{} could also be behind a macro

I doubt that presenting the exact array type vs. just saying "array" would help much in this case. There is "note: expanded from macro" to point out that something comes from a macro expansion. (Not shown for err_typecheck_invalid_operands, BTW!)

err_typecheck_invalid_operands is the better approach here

int main()
{
  using IA = int[];
  IA ia = { 1, 2, 3 };
  ia + 0.; // error: invalid operands to binary expression ('int[3]' and 'double')
  // Where is the problem? in 'int[3]' or in 'double'?
  ia + 0; // no error means the issue was in 'double'?
  IA{ 1, 2, 3 } + 0; // error: invalid operands to binary expression ('int[3]' and 'int')
  // Huh? What about now?
}

But ultimately, int[] + int is what we should diagnose

The types are not the issue, why give misleading diagnostics?

invalid operands to binary expression

BTW, I think it is either "in binary expression" or "to binary operator"

invalid operands to binary expression

BTW, I think it is either "in binary expression" or "to binary operator"

‘an operand to sth.’ is a common turn of phrase that’s been around for a long time (not just in Clang).

Sirraide · 2025-05-21T03:02:05Z

clang/lib/Sema/SemaExpr.cpp

@@ -15840,6 +15859,11 @@ ExprResult Sema::CreateBuiltinUnaryOp(SourceLocation OpLoc,
      InputExpr->getType()->isSpecificBuiltinType(BuiltinType::Dependent)) {
    resultType = Context.DependentTy;
  } else {
+    if (Opc == UO_Deref || Opc == UO_Plus) {


It feels a bit weird to use an if statement here when there’s a switch statement on the same variable right after. I’d probably make this a lambda and then call it in the cases for * and + below (the + case would then also need a [[fallthrough]] annotation).
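
The suggested restructuring can be sketched outside of Clang as follows. Everything here is a hypothetical stand-in (`UnaryOp`, `Operand`, `Result`, and `check_unary_operand` are not Clang's types or functions); the point is only the control-flow pattern of a lambda invoked from the relevant `case` labels, with `[[fallthrough]]` (C++17) where unary + continues into handling shared with another operator.

```cpp
// Hypothetical stand-ins; these are NOT Clang's real types.
enum class UnaryOp { Plus, Minus, Deref };

struct Operand {
  bool is_array;
  bool is_prvalue;
};

enum class Result { Ok, ArrayPrvalueError };

Result check_unary_operand(UnaryOp op, const Operand &input) {
  // The shared check lives in one lambda instead of an `if` ahead of the switch.
  auto is_array_prvalue = [&input] {
    return input.is_array && input.is_prvalue;
  };

  switch (op) {
  case UnaryOp::Plus:
    if (is_array_prvalue())
      return Result::ArrayPrvalueError;
    [[fallthrough]]; // unary + then shares the remaining handling with unary -
  case UnaryOp::Minus:
    // Placeholder for the checks common to unary + and unary -.
    return Result::Ok;
  case UnaryOp::Deref:
    if (is_array_prvalue())
      return Result::ArrayPrvalueError;
    // Placeholder for the remaining indirection checks.
    return Result::Ok;
  }
  return Result::Ok; // not reached for valid enumerators
}
```

Compared with a separate `if` on the same variable, this keeps each operator's checks under its own `case`, at the cost of calling the lambda in two places.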

AaronBallman · 2025-05-21T11:01:45Z

This is related to CWG2548

Kinda, but not really. It has never been necessary to create that (non-)issue. Or does someone really believe that it is a C++ defect that IA{ 1, 2, 3 } - IA{ 1, 2, 3 } is not allowed? So, it is just a bug since C++11

I think it is CWG2548; that's what made it clear that array-to-pointer decay does not happen here despite that being inconsistent with the rest of the language (cplusplus/papers#1633)

languagelawyer · 2025-05-21T11:11:08Z

it is CWG2548; that's what made it clear that array-to-pointer decay does not happen here

CWG2548 is closed as NAD, with no changes to the wording. The fact that array-to-pointer decay does not happen has always been clear

despite that being inconsistent with the rest of the language

I'd say CWG hallucinated inconsistence, to me there is none

AaronBallman · 2025-05-21T11:23:46Z

it is CWG2548; that's what made it clear that array-to-pointer decay does not happen here

CWG2548 is closed as NAD, with no changes to the wording.

Correct.

The fact that array-to-pointer decay does not happen has always been clear

Incorrect; the reason the core issue was filed in the first place was because it seemed like an oversight that the language was inconsistent. Clang was implementing the consistent behavior where array to pointer decay happened. That's why this is CWG2548.

despite that being inconsistent with the rest of the language

I'd say CWG hallucinated inconsistence, to me there is none

int x[10];

(void)(x + 0); // perfectly fine, array to pointer decay on x
(void)((int[10]){} + 0); // Not okay under CWG2548 being closed NAD, no array to pointer decay on the temporary

So definitely inconsistent, and EWG agreed that it was, but liked the accidental behavior that came from it.

languagelawyer · 2025-05-21T12:28:19Z

it seemed like an oversight that the language was inconsistent

Indeed, it only seemed.

Clang was implementing the consistent behavior

Clang is just cutting corners, trying to parse C and C++ with the same code in as few lines as possible, so it applied an incorrect conversion by batching all of them together. I think the only motivation for asking for an issue instead of fixing the bug was to keep the implementation simpler.

Not okay under CWG2548

Not okay under [expr.add] and [expr]/8

Imagine binary + and unary * performed temporary materialization:

const auto& r1 = IA{1, 2, 3}[0]; // extends lifetime of the temporary object
const auto& r2 = (IA{1, 2, 3} + 0); // does not extend
const auto& r3 = *(IA{1, 2, 3} + 0); // does not extend
const auto& r4 = *IA{1, 2, 3}; // does not extend

How would this be consistent? Quite the contrary, it would become inconsistent.

llvmbot added the clang (Clang issues not falling into any other category) and clang:frontend (Language frontend issues, e.g. anything involving "Sema") labels May 20, 2025

languagelawyer force-pushed the array_prvalues_as_operands branch 2 times, most recently from 3e262aa to f606724 Compare May 20, 2025 11:50

[Clang][Sema] Reject array prvalue operands

df91056

of unary + and *, and binary + and - operators

languagelawyer force-pushed the array_prvalues_as_operands branch from f606724 to df91056 Compare May 20, 2025 11:55

AaronBallman reviewed May 20, 2025

AaronBallman requested review from erichkeane, Endilll and Sirraide May 20, 2025 15:52

Sirraide reviewed May 21, 2025
