Skip to content

[clang] Speedup getFileIDLocal with a separate offset table. #146604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

hokein
Copy link
Collaborator

@hokein hokein commented Jul 1, 2025

The SLocEntry structure is 24 bytes, and the binary search only needs the offset. Loading an entry's offset might pull the entire SLocEntry object into the CPU cache.

To make the binary search much more cache-efficient, we use a separate offset table.

See https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.

@hokein hokein requested review from AaronBallman and erichkeane July 1, 2025 21:30
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jul 1, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 1, 2025

@llvm/pr-subscribers-clang

Author: Haojian Wu (hokein)

Changes

The SLocEntry structure is 24 bytes, and the binary search only needs the offset. Loading an entry's offset might pull the entire SLocEntry object into the CPU cache.

To make the binary search much more cache-efficient, we use a separate offset table.

See https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.


Full diff: https://github.com/llvm/llvm-project/pull/146604.diff

2 Files Affected:

  • (modified) clang/include/clang/Basic/SourceManager.h (+2)
  • (modified) clang/lib/Basic/SourceManager.cpp (+11-10)
diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h
index eefd4885534c8..345c1fdd3e045 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -719,6 +719,8 @@ class SourceManager : public RefCountedBase<SourceManager> {
   /// Positive FileIDs are indexes into this table. Entry 0 indicates an invalid
   /// expansion.
   SmallVector<SrcMgr::SLocEntry, 0> LocalSLocEntryTable;
+  /// An in-parallel offset table, merely used for speeding up FileID lookup.
+  SmallVector<SourceLocation::UIntTy> LocalLocOffsetTable;
 
   /// The table of SLocEntries that are loaded from other modules.
   ///
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index 82096421f8579..308992c007183 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -328,6 +328,7 @@ SourceManager::~SourceManager() {
 void SourceManager::clearIDTables() {
   MainFileID = FileID();
   LocalSLocEntryTable.clear();
+  LocalLocOffsetTable.clear();
   LoadedSLocEntryTable.clear();
   SLocEntryLoaded.clear();
   SLocEntryOffsetLoaded.clear();
@@ -636,9 +637,11 @@ FileID SourceManager::createFileIDImpl(ContentCache &File, StringRef Filename,
     noteSLocAddressSpaceUsage(Diag);
     return FileID();
   }
+  assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
   LocalSLocEntryTable.push_back(
       SLocEntry::get(NextLocalOffset,
                      FileInfo::get(IncludePos, File, FileCharacter, Filename)));
+  LocalLocOffsetTable.push_back(NextLocalOffset);
   // We do a +1 here because we want a SourceLocation that means "the end of the
   // file", e.g. for the "no newline at the end of the file" diagnostic.
   NextLocalOffset += FileSize + 1;
@@ -690,7 +693,9 @@ SourceManager::createExpansionLocImpl(const ExpansionInfo &Info,
     SLocEntryLoaded[Index] = SLocEntryOffsetLoaded[Index] = true;
     return SourceLocation::getMacroLoc(LoadedOffset);
   }
+  assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
   LocalSLocEntryTable.push_back(SLocEntry::get(NextLocalOffset, Info));
+  LocalLocOffsetTable.push_back(NextLocalOffset);
   if (NextLocalOffset + Length + 1 <= NextLocalOffset ||
       NextLocalOffset + Length + 1 > CurrentLoadedOffset) {
     Diag.Report(diag::err_sloc_space_too_large);
@@ -813,7 +818,7 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
   assert(SLocOffset < NextLocalOffset && "Bad function choice");
   assert(SLocOffset >= LocalSLocEntryTable[0].getOffset() &&
          "Invalid SLocOffset");
-
+  assert(LocalSLocEntryTable.size() == LocalLocOffsetTable.size());
   // After the first and second level caches, I see two common sorts of
   // behavior: 1) a lot of searched FileID's are "near" the cached file
   // location or are "near" the cached expansion location. 2) others are just
@@ -831,10 +836,10 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
   // SLocOffset.
   unsigned LessIndex = 0;
   // upper bound of the search range.
-  unsigned GreaterIndex = LocalSLocEntryTable.size();
+  unsigned GreaterIndex = LocalLocOffsetTable.size();
   if (LastFileIDLookup.ID >= 0) {
     // Use the LastFileIDLookup to prune the search space.
-    if (LocalSLocEntryTable[LastFileIDLookup.ID].getOffset() < SLocOffset)
+    if (LocalLocOffsetTable[LastFileIDLookup.ID] < SLocOffset)
       LessIndex = LastFileIDLookup.ID;
     else
       GreaterIndex = LastFileIDLookup.ID;
@@ -844,8 +849,8 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
   unsigned NumProbes = 0;
   while (true) {
     --GreaterIndex;
-    assert(GreaterIndex < LocalSLocEntryTable.size());
-    if (LocalSLocEntryTable[GreaterIndex].getOffset() <= SLocOffset) {
+    assert(GreaterIndex < LocalLocOffsetTable.size());
+    if (LocalLocOffsetTable[GreaterIndex] <= SLocOffset) {
       FileID Res = FileID::get(int(GreaterIndex));
       // Remember it.  We have good locality across FileID lookups.
       LastFileIDLookup = Res;
@@ -860,11 +865,7 @@ FileID SourceManager::getFileIDLocal(SourceLocation::UIntTy SLocOffset) const {
     ++NumBinaryProbes;
 
     unsigned MiddleIndex = LessIndex + (GreaterIndex - LessIndex) / 2;
-
-    SourceLocation::UIntTy MidOffset =
-        LocalSLocEntryTable[MiddleIndex].getOffset();
-
-    if (MidOffset <= SLocOffset)
+    if (LocalLocOffsetTable[MiddleIndex] <= SLocOffset)
       LessIndex = MiddleIndex + 1;
     else
       GreaterIndex = MiddleIndex;

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 suggestion, else LGTM. Please make sure aaron gets a chance to look if he wants.

@@ -719,6 +719,8 @@ class SourceManager : public RefCountedBase<SourceManager> {
/// Positive FileIDs are indexes into this table. Entry 0 indicates an invalid
/// expansion.
SmallVector<SrcMgr::SLocEntry, 0> LocalSLocEntryTable;
/// An in-parallel offset table, merely used for speeding up FileID lookup.
SmallVector<SourceLocation::UIntTy> LocalLocOffsetTable;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the size/cache sensitivity of this, might it be better to not use the SmallVector and either use a normal std::vector (no small-vector opt, see the table above does this with the , 0), or a very large small-vector-opt here? Do we have a good idea what size these usually do?

Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hokein hokein force-pushed the perf/speed-up-get-file-id-local branch from 6af564c to 7b06dab Compare July 4, 2025 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants