[BOLT] Drop perf2bolt cold samples diagnostic #139337

Open: aaupov wants to merge 1 commit into base: users/aaupov/spr/main.bolt-drop-perf2bolt-cold-samples-diagnostic

aaupov (Contributor) commented May 9, 2025

The cold samples diagnostic in perf2bolt is superseded by the
print-heatmap-stats option, which provides a superset of the stats and works
without a BAT section (BAT is not emitted by default).

Test Plan: NFC

Created using spr 1.3.4

llvmbot (Member) commented May 9, 2025

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/139337.diff

2 Files Affected:

  • (modified) bolt/include/bolt/Profile/DataAggregator.h (-6)
  • (modified) bolt/lib/Profile/DataAggregator.cpp (+3-26)
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index d66d198e37d61..3cec88437d164 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -212,11 +212,6 @@ class DataAggregator : public DataReader {
   uint64_t NumTraces{0};
   uint64_t NumInvalidTraces{0};
   uint64_t NumLongRangeTraces{0};
-  /// Specifies how many samples were recorded in cold areas if we are dealing
-  /// with profiling data collected in a bolted binary. For LBRs, incremented
-  /// for the source of the branch to avoid counting cold activity twice (one
-  /// for source and another for destination).
-  uint64_t NumColdSamples{0};
   uint64_t NumTotalSamples{0};
 
   /// Looks into system PATH for Linux Perf and set up the aggregator to use it
@@ -468,7 +463,6 @@ class DataAggregator : public DataReader {
   void dump(const PerfMemSample &Sample) const;
 
   /// Profile diagnostics print methods
-  void printColdSamplesDiagnostic() const;
   void printLongRangeTracesDiagnostic() const;
   void printBranchSamplesDiagnostics() const;
   void printBasicSamplesDiagnostics(uint64_t OutOfRangeSamples) const;
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index a47bba296c137..e9b9276407151 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -644,8 +644,6 @@ bool DataAggregator::doSample(BinaryFunction &OrigFunc, uint64_t Address,
 
   BinaryFunction *ParentFunc = getBATParentFunction(OrigFunc);
   BinaryFunction &Func = ParentFunc ? *ParentFunc : OrigFunc;
-  if (ParentFunc || (BAT && !BAT->isBATFunction(Func.getAddress())))
-    NumColdSamples += Count;
   // Attach executed bytes to parent function in case of cold fragment.
   Func.SampleCountInBytes += Count * BlockSize;
 
@@ -749,15 +747,10 @@ bool DataAggregator::doBranch(uint64_t From, uint64_t To, uint64_t Count,
     if (BAT)
       Addr = BAT->translate(Func->getAddress(), Addr, IsFrom);
 
-    BinaryFunction *ParentFunc = getBATParentFunction(*Func);
-    if (IsFrom &&
-        (ParentFunc || (BAT && !BAT->isBATFunction(Func->getAddress()))))
-      NumColdSamples += Count;
+    if (BinaryFunction *ParentFunc = getBATParentFunction(*Func))
+      Func = ParentFunc;
 
-    if (!ParentFunc)
-      return std::pair{Func, IsRet};
-
-    return std::pair{ParentFunc, IsRet};
+    return std::pair{Func, IsRet};
   };
 
   auto [FromFunc, IsReturn] = handleAddress(From, /*IsFrom*/ true);
@@ -1452,20 +1445,6 @@ void DataAggregator::parseLBRSample(const PerfBranchSample &Sample,
   }
 }
 
-void DataAggregator::printColdSamplesDiagnostic() const {
-  if (NumColdSamples > 0) {
-    const float ColdSamples = NumColdSamples * 100.0f / NumTotalSamples;
-    outs() << "PERF2BOLT: " << NumColdSamples
-           << format(" (%.1f%%)", ColdSamples)
-           << " samples recorded in cold regions of split functions.\n";
-    if (ColdSamples > 5.0f)
-      outs()
-          << "WARNING: The BOLT-processed binary where samples were collected "
-             "likely used bad data or your service observed a large shift in "
-             "profile. You may want to audit this\n";
-  }
-}
-
 void DataAggregator::printLongRangeTracesDiagnostic() const {
   outs() << "PERF2BOLT: out of range traces involving unknown regions: "
          << NumLongRangeTraces;
@@ -1506,7 +1485,6 @@ void DataAggregator::printBranchSamplesDiagnostics() const {
               "collection. The generated data may be ineffective for improving "
               "performance\n\n";
   printLongRangeTracesDiagnostic();
-  printColdSamplesDiagnostic();
 }
 
 void DataAggregator::printBasicSamplesDiagnostics(
@@ -1518,7 +1496,6 @@ void DataAggregator::printBasicSamplesDiagnostics(
               "binary is probably not the same binary used during profiling "
               "collection. The generated data may be ineffective for improving "
               "performance\n\n";
-  printColdSamplesDiagnostic();
 }
 
 void DataAggregator::printBranchStacksDiagnostics(
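
For readers comparing against print-heatmap-stats output, the diagnostic removed in this diff reported the share of samples attributed to cold regions and warned when that share exceeded 5%. The following is a minimal standalone sketch of that calculation in plain C++, not BOLT code: it uses `snprintf` instead of LLVM's `outs()` and `format`, and the helper names `coldSamplePercent`, `exceedsColdThreshold`, and `formatColdSamples`, as well as the zero-total guard, are assumptions made for illustration.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical helper: cold samples as a percentage of all samples.
// The zero-total guard is an addition for this sketch; the removed
// code divided by NumTotalSamples directly.
float coldSamplePercent(uint64_t NumColdSamples, uint64_t NumTotalSamples) {
  if (NumTotalSamples == 0)
    return 0.0f;
  return NumColdSamples * 100.0f / NumTotalSamples;
}

// Hypothetical helper: the removed diagnostic printed a warning when
// the cold share was strictly greater than 5%.
bool exceedsColdThreshold(uint64_t NumColdSamples, uint64_t NumTotalSamples) {
  return coldSamplePercent(NumColdSamples, NumTotalSamples) > 5.0f;
}

// Hypothetical helper: builds the summary line, or returns nothing
// when no cold samples were seen (the removed code printed nothing then).
std::optional<std::string> formatColdSamples(uint64_t NumColdSamples,
                                             uint64_t NumTotalSamples) {
  if (NumColdSamples == 0)
    return std::nullopt;
  char Buf[160];
  std::snprintf(Buf, sizeof(Buf),
                "PERF2BOLT: %llu (%.1f%%) samples recorded in cold regions "
                "of split functions.",
                static_cast<unsigned long long>(NumColdSamples),
                coldSamplePercent(NumColdSamples, NumTotalSamples));
  return std::string(Buf);
}
```

In the removed code, cold samples were only counted when a sample mapped to a function with a BAT parent or to a non-BAT function in a binary that has BAT, which is why the diagnostic depended on the BAT section being present.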

@fivemeyestore

This is a well-structured codebase. I appreciate the clean separation of concerns.