This repository is a collection of neat C & C++ trivia and oddities.
- `0` is technically tokenized as an octal literal.
- Array access is commutative: `arr[i]` and `i[arr]` are equivalent. This is because array access is defined as a direct translation to `*(arr + i)`.
- `sizeof(0)["abcd"]` is `1`.
- C and C++ grammar allows prototypes in declaration lists: `int a, foo(), * bar(), main();`.
- `https://www.google.com` is a valid line of C/C++ code, but you're limited to one occurrence of each protocol per function.
- Operator precedence and associativity are not the same as order of evaluation. The following are all undefined or unspecified behavior:
void foo(int i, int* arr) {
i = i++; // UB
i = i++ + ++i; // UB
arr[i] = i++; // UB
bar(puts("a"), puts("b")); // clang spits out a b, gcc spits out b a
}
https://en.cppreference.com/w/cpp/language/eval_order
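A minimal sketch of one way to sidestep the unspecified argument order: evaluate each argument in its own statement before the call. The names `trace`, `note`, `bar`, and `ordered_call` are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical helpers, not part of the original list.
std::string trace;                       // records the order in which note() runs

int note(char c) { trace += c; return c; }

int bar(int a, int b) { return a + b; }

// Each initializer is its own full-expression, so 'a' is always recorded
// before 'b', no matter how the compiler orders function arguments.
std::string ordered_call() {
    trace.clear();
    int first = note('a');
    int second = note('b');
    bar(first, second);
    return trace;
}
```

Calling `bar(note('a'), note('b'))` directly would leave the order of the two `note` calls up to the compiler, which is the gcc/clang difference described above.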
- Unknown attributes are ignored without causing an error (since C++17 and C23). This allows all sorts of attribute nonsense (And all of these can of course be applied to variables too):
[[std::vector]] void foo() {} // Yes, even in C
[[code::blocks]] void foo() {}
[[]] void foo() {}
[[,]] void foo() {}
[[]][[]][[]][[]][[]] void foo() {}
[[typedef ::long]] void foo() {}
[[
#include "/proc/cpuinfo"
]] void foo() {}
// C++ only:
[[foo...]] void foo() {}
[[using std:]] void foo() {}
- Attributes may appear almost anywhere in a declaration:
[[foo]] int [[bar]] baz [[biz]] () [[buz]];
[[foo]] constexpr [[bar]] int [[baz]] biz [[buz]] () [[boz]];
// ^ second one is gcc and msvc only, decl-specifier-seq technically prevents an attribute here
- The operand of the `sizeof` operator cannot be a C-style cast. `sizeof (int)*p` is parsed as `(sizeof(int)) * p` rather than `sizeof((int)*p)`.
- Precedence is ignored in the conditional operator between `?` and `:`: `c ? a = 1, y = 2 : foo();` is parsed as `c ? (a = 1, y = 2) : foo();`.
- `llU` is a valid (non-user-defined) integer suffix
- `(void)` cast
void foo(int x) {
(void)x; // useful for suppressing unused parameter warnings
// C++ only: (will be a warning with -Wpedantic)
return (void)"You can also return anything from a void function";
}
- You cannot augment a typedef (or `using` alias) with `unsigned`:
typedef long long ll;
void foo(unsigned ll) {} // unsigned implies unsigned int, ll here is a parameter name
- `typedef` is a storage class specifier and can appear before, after, or in the middle of a type in a declaration
unsigned typedef int u32;
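Both typedef claims can be checked at compile time. The sketch below assumes C++17 (for `std::is_same_v` and message-less `static_assert`); `u32_again` is a name made up for the example.

```cpp
#include <type_traits>

typedef long long ll;
void foo(unsigned ll);            // 'll' is a parameter name here; the type is unsigned int
static_assert(std::is_same_v<decltype(foo), void(unsigned int)>);

unsigned typedef int u32;         // typedef in the middle of the type
int typedef unsigned u32_again;   // typedef after part of the type (hypothetical name)
static_assert(std::is_same_v<u32, unsigned int>);
static_assert(std::is_same_v<u32_again, unsigned int>);
```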
- Preprocessor directives can be empty:
#include <stdio.h>
#
#
int main() {
#
// ...
}
- Switch statement bodies are allowed to be single statements as opposed to statement sequences (or compound statements), like other control flow structures:
switch(x) case 1: case 2: puts("foo");
- Case labels do not need to be in the top-level statement sequence
int x = 2;
int i = 0;
switch(x) {
default:
if(foo()) {
while(i++ < 5) {
case 2:
puts("lol");
}
}
}
- `"a" + 1 == ""` can technically evaluate to `true`. As can `"a" == "a\0\0"`.
- C and C++ support a set of digraph and trigraph tokens to accommodate certain archaic keyboards. Trigraphs were removed from C++ in C++17.
- ISO C forbids conversion between function and object pointers:
void (*func_ptr)() = dlsym(mylib, "func"); // gcc yields a warning with standard C17 in pedantic mode
However, taking the address of the function pointer first, then casting it to `void**`, and finally dereferencing that pointer again makes it work without warnings: `void (*func_ptr)(); *(void**)&func_ptr = dlsym(mylib, "func");`
- It's possible to declare multiple functions at once and use typedefs / using declarations for signatures:
// declares void foo(int); void* bar(float);
void foo(int), * bar(float);
// declares void foo(); void bar();
typedef void fn(); // or using fn = void();
fn foo, bar;
- "`-->` operator", really just a combination of two operators
int x = 10;
while (x --> 0) { // x goes to 0
printf("%d ", x);
}
Syntax | Meaning | Mnemonic |
---|---|---|
-~y | y + 1 | Tadpole swimming toward a value makes it bigger |
~-y | y - 1 | Tadpole swimming away from a value makes it smaller |
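The tadpole identities in the table follow from two's complement, where `~y == -y - 1`. A small compile-time check, assuming C++17 and values away from the `int` limits (the function names are illustrative):

```cpp
// Two's complement: ~y == -y - 1, so -~y == y + 1 and ~-y == y - 1.
constexpr int tadpole_inc(int y) { return -~y; }
constexpr int tadpole_dec(int y) { return ~-y; }

static_assert(tadpole_inc(41) == 42);
static_assert(tadpole_dec(43) == 42);
static_assert(tadpole_inc(-1) == 0);
static_assert(tadpole_dec(0) == -1);
```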
- "Unset operator": `x &~ mask` unsets `mask` bits in `x`
- Boolean identity: `!-!b`
- `0XE+2` should evaluate to `16`, however, both gcc and clang give an error: `invalid suffix "+2" on integer constant`. Both bugs are known: gcc, clang. MSVC handles it correctly. This comes from the definition of `pp-number`s, where a sign following `e` or `E` is lexed as part of the number, and is mentioned in the standard https://eel.is/c++draft/lex.pptoken#example-2.
- Clang / LLVM can start doing non-multiple-of-8 arithmetic in its internal representation (even without the use of `_ExtInt` or `_BitInt`). For example, this code results in 33-bit arithmetic as a result of the optimizer identifying the loop induction.
- The size of an empty struct is `1`. This is because the C++ memory model guarantees disjoint storage (and thus distinct addresses) for all distinct objects. https://eel.is/c++draft/basic.memobj#intro.object-9.sentence-2
- All types must be deduced the same in an `auto` declarator list. I.e. `auto x = 1, y = 1.5;` is not allowed.
- What would be idiomatic uses of `malloc` in C are UB in C++ prior to C++20, more details here
struct S { int x; };
S* s = (S*)malloc(sizeof(S)); // unlike C, C++ needs the explicit cast from void*
s->x = 1; // an object S hasn't been created and its lifetime hasn't started, placement new is required to make this well-defined
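A minimal sketch of the placement-new fix mentioned in the comment above, which is well-defined in every C++ standard. `make_and_read` is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <new>

struct S { int x; };

// Allocates raw storage with malloc, starts the lifetime of an S in it with
// placement new, uses the object, then tears everything down again.
int make_and_read() {
    void* raw = std::malloc(sizeof(S));
    if (raw == nullptr) return -1;   // allocation failed
    S* s = new (raw) S{};            // placement new: an S object now exists
    s->x = 1;
    int result = s->x;
    s->~S();                         // end the lifetime explicitly
    std::free(raw);
    return result;
}
```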
- C++ supports a set of alternative tokens such as `and`, `or`, `bitand`, `compl`, etc. which are equivalent to their primary counterparts. Truly, equivalent:
struct S {
S() = default;
S(const S bitand) = delete;
S(S and) = delete;
compl S() = default;
};
void foo() {
char b[sizeof(S)];
new (&b) S();
((S*)b)->compl S();
}
- Vexing parse:
// Vexing parse: This isn't a variable, it's a function declaration
T foo();
// Most vexing parse: This is still a function declaration (taking a T(*)())
T foo(T());
// "More vexing parse":
T foo(T((()))); // This is also a function declaration taking a T(*)()
T foo(T (((a)))); // this is a function declaration taking a T
// This is a variable definition
T foo((T()));
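The declarations above can be told apart at compile time with `decltype`. This is a sketch assuming C++17; the struct `T` and the names `bar`, `baz`, and `qux` are placeholders chosen for the example.

```cpp
#include <type_traits>

struct T { int value = 7; };

T foo();          // function declaration: T()
T bar(T());       // function declaration: T(T(*)())
T baz((T()));     // variable: the extra parentheses force an expression
T qux{};          // variable: braces can never declare a function

static_assert(std::is_function_v<decltype(foo)>);
static_assert(std::is_same_v<decltype(bar), T(T (*)())>);
static_assert(std::is_same_v<decltype(baz), T>);
static_assert(std::is_same_v<decltype(qux), T>);
```

Brace initialization is the usual way to avoid the ambiguity entirely.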
- C++ structs can have stray semicolons:
struct S { ;;;;; };
- The following code is probably, technically, well-formed in the current working draft of the standard (and may have been before too):
template<typename T> void main(T) {}
int main() {}
This is related to changes in P1787. Sadly, no compiler supports this.
- Function try-blocks are a convenient way to wrap an entire function body with exception handlers and the only way to catch exceptions in member initializer lists:
template<typename T> struct S {
T t;
S(const T& t) try : t(t) {
...
} catch(...) {
...
}
};
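A runnable sketch of a constructor function try-block, assuming C++17 for the `static inline` member. `Member`, `Holder`, `last_error`, and `construction_fails` are hypothetical names used only to observe the handler.

```cpp
#include <stdexcept>
#include <string>

struct Member {
    explicit Member(int v) {
        if (v < 0) throw std::invalid_argument("negative");
    }
};

struct Holder {
    static inline std::string last_error;   // only here to observe the handler
    Member m;

    explicit Holder(int v) try : m(v) {
        // constructor body
    } catch (const std::exception& e) {
        last_error = e.what();
        // The exception is rethrown automatically when a constructor's handler ends.
    }
};

// Returns true if constructing a Holder from v throws.
bool construction_fails(int v) {
    try {
        Holder h(v);
        (void)h;
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

The handler can log or translate the exception, but it cannot swallow it: the object was never fully constructed, so the exception always propagates to the caller.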
- `noexcept` is both a specifier and operator
void foo() noexcept(noexcept(noexcept(true))) {}
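To make the two roles concrete, here is a small sketch (C++17 assumed) where the operator is queried with `static_assert`. `may_throw`, `never_throws`, and `wrapper` are illustrative names.

```cpp
void may_throw() {}
void never_throws() noexcept {}

// The outer noexcept is the specifier; the inner one is the operator, an
// unevaluated compile-time query on its operand.
void wrapper() noexcept(noexcept(never_throws())) { never_throws(); }
void foo() noexcept(noexcept(noexcept(true))) {}

static_assert(!noexcept(may_throw()));
static_assert(noexcept(never_throws()));
static_assert(noexcept(wrapper()));
static_assert(noexcept(foo()));
```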
- `throw()` is the same as `noexcept` since C++17.
- You can write `extern "C++"` as well as `extern "C"`, these are the only two standard linkage languages, but others can be defined by the implementation. Give us `extern "Python"` and `extern "Java"`!
- A declaration can have arbitrarily many linkage language specifiers:
extern "C" extern "C++" extern "C" extern "C++" void foo(int) {}
The innermost specification is used. https://eel.is/c++draft/dcl.link#5.sentence-2
- The language grammar allows `for`-style `init-statement`s in `switch` and `if` statements, since C++17:
switch(int x = foo(); t[x]) { ... }
if(auto [a, b, c] = foo(); c) { ... }
// ranged for allows an init-statement too (just no iteration-expression)
for(auto [vec, map] = foo.bar(); const auto& item : vec) { ... }
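A self-contained sketch of the `if` form in practice, using a structured binding in the `init-statement` (C++17). `add_or_accumulate` is a hypothetical helper name.

```cpp
#include <map>
#include <string>

// Inserts key with amount, or adds amount to the existing entry.
// Returns true if a new entry was created.
bool add_or_accumulate(std::map<std::string, int>& totals,
                       const std::string& key, int amount) {
    if (auto [it, inserted] = totals.emplace(key, amount); inserted) {
        return true;
    } else {
        it->second += amount;   // names from the init-statement are visible in else too
        return false;
    }
}
```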
- `while` loops do not support `init-statement`s because that would make them just another for loop.
- A `condition` may be a declaration. This allows up to two declarations per `switch`, `if`, or `for` statement:
if(int x = foo()) { ... } // intended use
if(auto [a, b, c] = foo(); auto x = bar(a, b, c)) { ... }
for(auto [a, b, c] = foo(); int x = baz(); c++) { ... }
- While an `init-statement` may make array or structured binding declarations, `condition` declarations may not. I.e. these are not valid:
if(auto [a, b, c] = foo()) { }
if(int arr[] = {1, 2, 3, 4}) { }
- The following are valid C++ statements:
if(; true) { ... } // empty init-statement
if(false; true) { ... }
if(auto main() -> int; true) { ... }
if(class foobar; true) { ... }
if(typedef int i32; true) { ... }
if(using A = B; true) { ... } // Since C++23
for(struct { int a = 0, b = 100; } s; s.a < s.b; s.a++, s.b--) { ... }
- We cannot, however, do any of the following:
if(static_assert(true); true) { ... }
if(using namespace std; true) { ... }
if(extern "C" int puts(const char*); true) { puts("hello world"); }
if(friend void operator<<(); true) { ... } // syntactically valid, not semantically valid
- `goto` is disallowed in `constexpr` functions until C++23
- `static` storage local variables are not permitted in constexpr functions until C++23
- Structured bindings can't be used in constexpr declarations
- The following is a valid "hello world" implementation
auto& hello_world = std::cout<<"Hello World"<<std::endl;
int main() {}
- A lambda's parameter list can be omitted: `[]{ return 42; }`.
- `(*****+***+**+*+[]{})();` is valid C++. Global operators `T& operator*(T*)` and `T* operator+(T*)` can be used on lambdas with no captures (which decay to function pointers).
- The following is valid (since C++23)
[] [[deprecated]] [[deprecated]] {}; // self-deprecating lambda
- `[[likely]]` can be applied outside of control flow structures:
[[likely]] ;
[[likely]] {};
[[likely]] 1 + 2;
- `final`, `override`, `import`, and `module` aren't keywords but have special meaning in certain contexts. Thus, this is valid C++:
struct final final {
virtual final override() final { return {}; }
};
void final() {
struct final typedef override;
struct final final = override().override();
}
https://eel.is/c++draft/lex.name#2
- There are special rules for lexing `<:` digraphs so that `std::vector<::std::string>` is lexed correctly and not as `std::vector[:std::string>`:
Otherwise, if the next three characters are <:: and the subsequent character is neither : nor >, the < is treated as a preprocessing token by itself and not as the first character of the alternative token <:
https://eel.is/c++draft/lex.pptoken#3.2
- `std::numeric_limits<T>::max` and related functions are functions because there was originally concern that some values may not be available at compile time. E.g. `std::numeric_limits<float>::min`, which was dependent on rounding mode. These functions are `constexpr` since C++11 but at that point it was too late to change them from functions.
- The original proposed syntax for lambdas looked like `<>(int x) : [y] (x + y)` (what's now `[y](int x) { return x + y; }`). `<&>(x) ( x * y )` or `<&>(x) -> int { return x * y; }` would have been the syntax for `[&](auto x) { return x * y; }`. Also, in the original proposal there was no mutable keyword for lambdas. Instead the call operator was always const and captures were always marked mutable. Initial proposal papers: N1958, N1968, N2329 (N1968 rev 1).
- `std::string('0', '0')` is a string of 48 `'0'`s, `std::string{'0', '0'}` is the string `"00"`
- James Bond was added to the C++ standard in C++17
- The C++ standard contains a small poem:
  When writing a specialization, be careful about its location; or to make it compile will be such a trial as to kindle its self-immolation.
- CV qualifiers don't apply to objects until their construction is complete, and relatedly there are no cv-qualified constructors
- Array elements, and objects in general, are always destroyed in reverse order of construction. Standard quote for arrays
- A lambda's `operator()` is automatically `constexpr` if it meets the requirements for a constexpr function https://eel.is/c++draft/expr.prim.lambda.closure#5.sentence-6
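The implicit `constexpr` on a lambda's call operator can be seen with a short sketch (C++17 assumed). `square` and `runtime_square` are illustrative names.

```cpp
// No constexpr keyword on the lambda itself: since C++17 its call operator is
// implicitly constexpr because the body meets the constexpr requirements.
constexpr auto square = [](int x) { return x * x; };

static_assert(square(4) == 16);   // usable in a constant expression

// The same closure still works with run-time arguments.
int runtime_square(int x) { return square(x); }
```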
- `decltype(std)` is an `int` in gcc. Bug reports: #1, #2.
- Prior to gcc 10, `decltype(decltype(decltype))` could be used to generate exponential error messages.
- `typedef int i = 0;` segfaults msvc
- This compiles and links in gcc
namespace foobar {
extern "C" int main() {
puts("Hello world!");
}
}
- Compilers can't decide which is correct, both are rejected by gcc:
extern extern "C" extern "C++" int x; // accepted by clang (with warning)
extern "C" extern "C++" extern int x; // accepted by clang (with warning) and msvc (no warning)
GCC is correct. The second is more correct due to `linkage-specification`s, but it's disallowed to specify a storage class in a `linkage-specification` https://eel.is/c++draft/dcl.link#8.sentence-2.
- Double `[[gnu::constructor]]`'s are ignored but they are still allowed on `main` so hello world prints twice here.
[[gnu::constructor]] [[gnu::constructor]] int main() {
puts("Hello, World!");
}
- Source code of the very first C compiler.
- An empty struct is UB in C. Standard quote: 6.7.2.1.8 (C11-C23).
- A significant subset of possible identifiers are reserved in C. These include identifiers which begin with `is` or `to`, `str`, or `mem` followed by a lowercase letter in the global scope. It's undefined to declare/define one of these reserved identifiers in the global scope. So, the following program may 1) print 1, 2) wipe your hard drive, 3) summon cthulhu, 4) other. All behaviors are equally correct.
#include <stdio.h>
int iseven(int n) {
return n % 2 == 0;
}
int main() {
printf("%d", iseven(2));
}
- Expressions in parameter declarations are evaluated by gcc/clang. Due to sequencing this prints the numbers from 10 down to 1:
#include <stdio.h>
int first = 0;
int main();
int main(int a, char *b[(first++ > 8) ? 1 : main()]) {
printf("%d\n", first--);
}
- Similarly this is a valid "hello world" program in C
int main(int, char*[puts("Hello World")]) {}
- `auto` is a keyword in C. Not to be confused with C++ `auto`, C `auto` does absolutely nothing.
- `extern const void x;` is a valid declaration in C for the same reason `extern struct S s;` is valid - `void` is an incomplete type
  - This is not valid in C++ because incomplete types in general are not allowed in extern declarations; only incomplete class types are explicitly permitted in [dcl.stc]/7
- The following is valid C:
signed _Noreturn const long volatile long static _Atomic inline f(void);
- gcc allows completely empty case labels (C only):
switch(x) { case 1: }
- gcc allows labels to be applied to declarations
switch(x) { default: int y; }
switch(x) { default:; int y; } // must be this in clang
- This compiles without error in TCC
static inline int foo(void) {
[[[[[[[[{{(}));
}
int main(void) {
return _Generic(1, int:0, float:((}}]]]);
}
Some talks about C++ oddities: