-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[clang] Speedup getFileIDLocal with a separate offset table. #146604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang Author: Haojian Wu (hokein) ChangesThe To make the binary search much more cache-efficient, we use a separate offset table. Full diff: https://github.com/llvm/llvm-project/pull/146604.diff 2 Files Affected:
diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h
index eefd4885534c8..345c1fdd3e045 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -719,6 +719,8 @@ class SourceManager : public RefCountedBase<SourceManager> {
/// Positive FileIDs are indexes into this table. Entry 0 indicates an invalid
/// expansion.
SmallVector<SrcMgr::SLocEntry, 0> LocalSLocEntryTable;
+ /// An in-parallel offset table, merely used for speeding up FileID lookup.
+ SmallVector<SourceLocation::UIntTy> LocalLocOffsetTable;
/// The table of SLocEntries that are loaded from other modules.
///
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index 82096421f8579..308992c007183 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -328,6 +328,7 @@ SourceManager::~SourceManager() {
void SourceManager::clearIDTables() {
MainFileID = FileID();
LocalSLocEntryTable.clear();
+ LocalLocOffsetTable.clear();
LoadedSLocEntryTable.clear();
SLocEntryLoaded.clear();
SLocEntryOffsetLoaded.clear();
@@ -636,9 +637,11 @@ FileID SourceManager::createFileIDImpl(ContentCache &File, StringRef Filename,
noteSLocAddressSpaceUsage(Diag);
return FileID();
}
+ assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
LocalSLocEntryTable.push_back(
SLocEntry::get(NextLocalOffset,
FileInfo::get(IncludePos, File, FileCharacter, Filename)));
+ LocalLocOffsetTable.push_back(NextLocalOffset);
// We do a +1 here because we want a SourceLocation that means "the end of the
// file", e.g. for the "no newline at the end of the file" diagnostic.
NextLocalOffset += FileSize + 1;
@@ -690,7 +693,9 @@ SourceManager::createExpansionLocImpl(const ExpansionInfo &Info,
SLocEntryLoaded[Index] = SLocEntryOffsetLoaded[Index] = true;
return SourceLocation::getMacroLoc(LoadedOffset);
}
+ assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
LocalSLocEntryTable.push_back(SLocEntry::get(NextLocalOffset, Info));
+ LocalLocOffsetTable.push_back(NextLocalOffset);
if (NextLocalOffset + Length + 1 <= NextLocalOffset ||
NextLocalOffset + Length + 1 > CurrentLoadedOffset) {
Diag.Report(diag::err_sloc_space_too_large);
@@ -813,7 +818,7 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
assert(SLocOffset < NextLocalOffset && "Bad function choice");
assert(SLocOffset >= LocalSLocEntryTable[0].getOffset() &&
"Invalid SLocOffset");
-
+ assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
// After the first and second level caches, I see two common sorts of
// behavior: 1) a lot of searched FileID's are "near" the cached file
// location or are "near" the cached expansion location. 2) others are just
@@ -831,10 +836,10 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
// SLocOffset.
unsigned LessIndex = 0;
// upper bound of the search range.
- unsigned GreaterIndex = LocalSLocEntryTable.size();
+ unsigned GreaterIndex = LocalLocOffsetTable.size();
if (LastFileIDLookup.ID >= 0) {
// Use the LastFileIDLookup to prune the search space.
- if (LocalSLocEntryTable[LastFileIDLookup.ID].getOffset() < SLocOffset)
+ if (LocalLocOffsetTable[LastFileIDLookup.ID] < SLocOffset)
LessIndex = LastFileIDLookup.ID;
else
GreaterIndex = LastFileIDLookup.ID;
@@ -844,8 +849,8 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
unsigned NumProbes = 0;
while (true) {
--GreaterIndex;
- assert(GreaterIndex < LocalSLocEntryTable.size());
- if (LocalSLocEntryTable[GreaterIndex].getOffset() <= SLocOffset) {
+ assert(GreaterIndex < LocalLocOffsetTable.size());
+ if (LocalLocOffsetTable[GreaterIndex] <= SLocOffset) {
FileID Res = FileID::get(int(GreaterIndex));
// Remember it. We have good locality across FileID lookups.
LastFileIDLookup = Res;
@@ -860,11 +865,7 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
++NumBinaryProbes;
unsigned MiddleIndex = LessIndex + (GreaterIndex - LessIndex) / 2;
-
- SourceLocation::UIntTy MidOffset =
- LocalSLocEntryTable[MiddleIndex].getOffset();
-
- if (MidOffset <= SLocOffset)
+ if (LocalLocOffsetTable[MiddleIndex] <= SLocOffset)
LessIndex = MiddleIndex + 1;
else
GreaterIndex = MiddleIndex;
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 suggestion, else LGTM. Please make sure aaron gets a chance to look if he wants.
@@ -719,6 +719,8 @@ class SourceManager : public RefCountedBase<SourceManager> { | |||
/// Positive FileIDs are indexes into this table. Entry 0 indicates an invalid | |||
/// expansion. | |||
SmallVector<SrcMgr::SLocEntry, 0> LocalSLocEntryTable; | |||
/// An in-parallel offset table, merely used for speeding up FileID lookup. | |||
SmallVector<SourceLocation::UIntTy> LocalLocOffsetTable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the size/cache sensitivity of this, might it be better to not use the SmallVector and either use a normal std::vector (no small-vector opt, see the table above does this with the , 0
), or a very large small-vector-opt here? Do we have a good idea what size these usually do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
More cache friendly.
6af564c
to
7b06dab
Compare
The
SLocEntry
structure is 24 bytes, and the binary search only needs the offset. Loading an entry's offset might pull the entire SLocEntry object into the CPU cache.To make the binary search much more cache-efficient, we use a separate offset table.
See https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.