-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[DenseMap] Do not align pointer sentinel values (NFC) #146595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-llvm-adt Author: Nikita Popov (nikic) ChangesDenseMapInfo for pointers currently uses empty/tombstone values that are aligned (by assuming a very conservative alignment). However, this means that we have to work with larger immediates. This patch proposes to use the values -1 and -2 instead, without caring about pointer alignment. This puts us into unspecified territory from the perspective of the C standard, but general implementer consensus (including Clang) is that raw pointers do not carry alignment guarantees, only memory accesses do. We already have lots of places that rely on this using variations on This is a small improvement for both compile-time and clang binary size: https://llvm-compile-time-tracker.com/compare.php?from=b8b74945514358f1ebeac0874172f37fc7bfa38e&to=3c83bf05608d720d6ff88adfaca7191331eeb30f&stat=instructions:u Full diff: https://github.com/llvm/llvm-project/pull/146595.diff 2 Files Affected:
diff --git a/llvm/include/llvm/ADT/DenseMapInfo.h b/llvm/include/llvm/ADT/DenseMapInfo.h
index 07c37e353a40b..b371611e7d948 100644
--- a/llvm/include/llvm/ADT/DenseMapInfo.h
+++ b/llvm/include/llvm/ADT/DenseMapInfo.h
@@ -56,30 +56,13 @@ struct DenseMapInfo {
//static bool isEqual(const T &LHS, const T &RHS);
};
-// Provide DenseMapInfo for all pointers. Come up with sentinel pointer values
-// that are aligned to alignof(T) bytes, but try to avoid requiring T to be
-// complete. This allows clients to instantiate DenseMap<T*, ...> with forward
-// declared key types. Assume that no pointer key type requires more than 4096
-// bytes of alignment.
-template<typename T>
-struct DenseMapInfo<T*> {
- // The following should hold, but it would require T to be complete:
- // static_assert(alignof(T) <= (1 << Log2MaxAlign),
- // "DenseMap does not support pointer keys requiring more than "
- // "Log2MaxAlign bits of alignment");
- static constexpr uintptr_t Log2MaxAlign = 12;
-
+template <typename T> struct DenseMapInfo<T *> {
static inline T* getEmptyKey() {
- uintptr_t Val = static_cast<uintptr_t>(-1);
- Val <<= Log2MaxAlign;
- return reinterpret_cast<T*>(Val);
+ // We assume that raw pointers do not carry alignment requirements.
+ return reinterpret_cast<T *>(-1);
}
- static inline T* getTombstoneKey() {
- uintptr_t Val = static_cast<uintptr_t>(-2);
- Val <<= Log2MaxAlign;
- return reinterpret_cast<T*>(Val);
- }
+ static inline T *getTombstoneKey() { return reinterpret_cast<T *>(-2); }
static unsigned getHashValue(const T *PtrVal) {
return (unsigned((uintptr_t)PtrVal) >> 4) ^
diff --git a/llvm/include/llvm/ADT/PointerUnion.h b/llvm/include/llvm/ADT/PointerUnion.h
index cdbd76d7f505b..86248a2cf836f 100644
--- a/llvm/include/llvm/ADT/PointerUnion.h
+++ b/llvm/include/llvm/ADT/PointerUnion.h
@@ -286,13 +286,19 @@ struct PointerLikeTypeTraits<PointerUnion<PTs...>> {
// Teach DenseMap how to use PointerUnions as keys.
template <typename ...PTs> struct DenseMapInfo<PointerUnion<PTs...>> {
using Union = PointerUnion<PTs...>;
- using FirstInfo =
- DenseMapInfo<typename pointer_union_detail::GetFirstType<PTs...>::type>;
+ using FirstTypeTraits = PointerLikeTypeTraits<
+ typename pointer_union_detail::GetFirstType<PTs...>::type>;
- static inline Union getEmptyKey() { return Union(FirstInfo::getEmptyKey()); }
+ static inline Union getEmptyKey() {
+ uintptr_t Val = static_cast<uintptr_t>(-1);
+ Val <<= FirstTypeTraits::NumLowBitsAvailable;
+ return FirstTypeTraits::getFromVoidPointer(reinterpret_cast<void *>(Val));
+ }
static inline Union getTombstoneKey() {
- return Union(FirstInfo::getTombstoneKey());
+ uintptr_t Val = static_cast<uintptr_t>(-2);
+ Val <<= FirstTypeTraits::NumLowBitsAvailable;
+ return FirstTypeTraits::getFromVoidPointer(reinterpret_cast<void *>(Val));
}
static unsigned getHashValue(const Union &UnionVal) {
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if this runs into issues with undefined behavior? I don't know the exact rules, but remember reading something along the lines that some OOB pointers cannot even be compared without running into UB. Having pointers that don't respect type alignment sounds like walking on thin ice to me...
I think the relevant parts are https://eel.is/c++draft/expr.eq#3 (which does not specify UB for equality comparisons) and https://eel.is/c++draft/expr.reinterpret.cast#5 (which makes all non-roundtrip integer to pointer reinterpret_casts implementation-defined.) I think there are some more complexities here, but I think the gist is that as soon as we do a reinterpret_cast from an integer to a pointer we're in implementation-defined territory regardless of whether the resulting pointer is aligned or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG
DenseMapInfo for pointers currently uses empty/tombstone values that are aligned (by assuming a very conservative alignment). However, this means that we have to work with large immediates. This patch proposed to use the values -1 and -2 instead, without caring about pointer alignment. This puts us into unspecified territory from the perspective of the C standard, but it is Clang's explicit position that raw pointers do not carry alignment guarantees (only accesses do). We already have lots of places that do variations on `reinterpret_cast<T*>(-1)` and assume that to work, so I'm not sure it's worthwhile to be strictly standards conforming in this single place.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/16366 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/12913 Here is the relevant piece of the build log for the reference
|
This reverts commit 7a6435b. This causes ubsan failures when the sentinel pointers are upcast using static_cast<>, which checks alignment.
Reverted this due to ubsan errors like:
The problem is basically code like this:
Where ubsan doesn't like the static_cast. While DenseMap itself doesn't need the aligned pointers, it seems like too many other places end up depending on this. |
Good catch, now we have an example of what I suspected might be an issue in #146595 (review) |
DenseMapInfo for pointers currently uses empty/tombstone values that are aligned (by assuming a very conservative alignment). However, this means that we have to work with larger immediates.
This patch proposes to use the values -1 and -2 instead, without caring about pointer alignment. (Non-roundtrip) integer to pointer casts are implementation-defined in C++, but the general implementer consensus (including Clang) is that raw pointers do not carry alignment requirements, only memory accesses do.
We already have lots of places that rely on this using variations on
reinterpret_cast<T*>(-1)
, so it seems odd to insist on properly aligned pointers in this one place.It is necessary to adjust a few other places after this change, which currently assume that
DenseMapInfo<void *>
returns a highly-aligned pointer.This is a small improvement for both compile-time and clang binary size: https://llvm-compile-time-tracker.com/compare.php?from=b8b74945514358f1ebeac0874172f37fc7bfa38e&to=3c83bf05608d720d6ff88adfaca7191331eeb30f&stat=instructions:u