Skip to content

[hwasan] Add hwasan-all-globals option #149621

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shuffle2
Copy link

@shuffle2 shuffle2 commented Jul 18, 2025

hwasan-globals does not instrument globals with custom sections, because existing code may use __start_/__stop_ symbols to iterate over globals in such a way which will cause hwasan assertions.

Introduce new hwasan-all-globals option, which instruments all user-defined globals (but not those globals which are generated by the hwasan instrumentation itself), including those with custom sections.

fixes #142442

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (shuffle2)

Changes

hwasan-globals does not instrument globals with custom sections, because existing code may use __start_/__stop_ symbols to iterate over globals in such a way which will cause hwasan assertions.

Introduce new hwasan-all-globals option, which instruments all user-defined globals (but not those globals which are generated by the hwasan instrumentation itself), including those with custom sections.

fixes #142442
request review from: @pcc


Full diff: https://github.com/llvm/llvm-project/pull/149621.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp (+22-5)
diff --git a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
index 2c34bf2157cdd..33c2860152214 100644
--- a/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
@@ -160,6 +160,11 @@ static cl::opt<bool> ClGenerateTagsWithCalls(
 static cl::opt<bool> ClGlobals("hwasan-globals", cl::desc("Instrument globals"),
                                cl::Hidden, cl::init(false));
 
+static cl::opt<bool> ClAllGlobals(
+    "hwasan-all-globals",
+    cl::desc("Instrument globals, even those within user-defined sections"),
+    cl::Hidden, cl::init(false));
+
 static cl::opt<int> ClMatchAllTag(
     "hwasan-match-all-tag",
     cl::desc("don't report bad accesses via pointers with this tag"),
@@ -452,6 +457,7 @@ class HWAddressSanitizer {
   bool InstrumentWithCalls;
   bool InstrumentStack;
   bool InstrumentGlobals;
+  bool InstrumentAllGlobals;
   bool DetectUseAfterScope;
   bool UsePageAliases;
   bool UseMatchAllCallback;
@@ -679,6 +685,7 @@ void HWAddressSanitizer::initializeModule() {
 
   InstrumentGlobals =
       !CompileKernel && !UsePageAliases && optOr(ClGlobals, NewRuntime);
+  InstrumentAllGlobals = InstrumentGlobals && ClAllGlobals;
 
   if (!CompileKernel) {
     createHwasanCtorComdat();
@@ -1780,11 +1787,21 @@ void HWAddressSanitizer::instrumentGlobals() {
     if (GV.hasCommonLinkage())
       continue;
 
-    // Globals with custom sections may be used in __start_/__stop_ enumeration,
-    // which would be broken both by adding tags and potentially by the extra
-    // padding/alignment that we insert.
-    if (GV.hasSection())
-      continue;
+    if (InstrumentAllGlobals) {
+      // Avoid adding metadata emitted for the hwasan instrumentation itself.
+      // Code which makes assumptions about memory layout of globals between
+      // __start_<section>/__end_<section> linker-generated symbols may need
+      // manual adaptation.
+      auto section = GV.getSection();
+      if (section == "hwasan_globals" || section == ".note.hwasan.globals")
+        continue;
+    } else {
+      // Globals with custom sections may be used in __start_/__stop_
+      // enumeration, which would be broken both by adding tags and potentially
+      // by the extra padding/alignment that we insert.
+      if (GV.hasSection())
+        continue;
+    }
 
     Globals.push_back(&GV);
   }

@fmayer
Copy link
Contributor

fmayer commented Jul 19, 2025

What do you need this for? This seems like a very easy to misuse flag.

From AsmPrinter.cpp:

  // If the alignment is specified, we *must* obey it.  Overaligning a global
  // with a specified alignment is a prompt way to break globals emitted to
  // sections and expected to be contiguous (e.g. ObjC metadata).

@fmayer
Copy link
Contributor

fmayer commented Jul 19, 2025

Please remove @pcc from the commit message. Otherwise, in case this gets merged, he will be notified whenever this gets merged into some fork, see also: https://discourse.llvm.org/t/forbidding-username-in-commits/86997

@fmayer
Copy link
Contributor

fmayer commented Jul 19, 2025

If you only need this for specific sections, maybe we should add a flag to enumerate the specific allowed sections?

@fmayer fmayer requested a review from pcc July 19, 2025 00:19
@shuffle2
Copy link
Author

The intent of this change is to allow hwasan globals instrumentation to be effective on embedded targets which tend to extensively use custom sections to have fine grained control of data allocation, and mainly/exclusively use global storage instead of dynamically allocated memory.

I don't think listing all sections you want to instrument would be a good solution. A normal user would then need to duplicate all expected section names in some build scripts. Also, some large codebases which this is intended for auto-create section names in ways which would be annoying to manually upkeep in related build scripts (e.g. generated based on __FILE__ and/or __LINE__).

Copy link
Contributor

@fmayer fmayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a test. See llvm/test/Instrumentation/HWAddressSanitizer/globals.ll

I am adding a case for a section global #149625

@shuffle2
Copy link
Author

This needs a test. See llvm/test/Instrumentation/HWAddressSanitizer/globals.ll

I am adding a case for a section global #149625

sure, do you intend me to pick those changes into this PR?

@fmayer
Copy link
Contributor

fmayer commented Jul 19, 2025

This needs a test. See llvm/test/Instrumentation/HWAddressSanitizer/globals.ll
I am adding a case for a section global #149625

sure, do you intend me to pick those changes into this PR?

I'll submit it as soon as CI passes. You can base your test changes (an extra RUN line with your flag) on that.

@shuffle2
Copy link
Author

I'll submit it as soon as CI passes. You can base your test changes (an extra RUN line with your flag) on that.

Here's what I came up with based on those changes, does it look OK?

--- a/llvm/test/Instrumentation/HWAddressSanitizer/globals.ll
+++ b/llvm/test/Instrumentation/HWAddressSanitizer/globals.ll
@@ -1,10 +1,12 @@
 ; RUN: opt < %s -S -passes=hwasan -mtriple=aarch64--linux-android29 | FileCheck --check-prefixes=CHECK,CHECK29 %s
 ; RUN: opt < %s -S -passes=hwasan -mtriple=aarch64--linux-android30 | FileCheck --check-prefixes=CHECK,CHECK30 %s
+; RUN: opt < %s -S -passes=hwasan -mtriple=riscv64-unknown-elf -hwasan-globals=1 -hwasan-all-globals=1 | FileCheck --check-prefixes=CHECKALLGLOBALS %s

 ; CHECK29: @four = global

 ; CHECK: @specialcaselisted = global i16 2, no_sanitize_hwaddress
-
+; CHECK: @insection = global i16 2, section "custom"
+; CHECKALLGLOBALS: @insection.hwasan = private global { i16, [14 x i8] } { i16 2, [14 x i8] c"\00\00\00\00\00\00\00\00\00\00\00\00\00\AF" }, section "custom", align 16
 ; CHECK: @__start_hwasan_globals = external hidden constant [0 x i8]
 ; CHECK: @__stop_hwasan_globals = external hidden constant [0 x i8]

@@ -37,3 +39,4 @@ source_filename = "foo"
 @sixteen = global [16 x i8] zeroinitializer
 @huge = global [16777232 x i8] zeroinitializer
 @specialcaselisted = global i16 2, no_sanitize_hwaddress
+@insection = global i16 2, section "custom"

@fmayer
Copy link
Contributor

fmayer commented Jul 19, 2025

You also want to check @specialcaselisted for your flag

@fmayer
Copy link
Contributor

fmayer commented Jul 21, 2025

Thanks for your patch! I will wait to see if @pcc has any thoughts, and if not, go ahead and merge this later this week.

if (ClAllGlobals) {
// Avoid adding metadata emitted for the hwasan instrumentation itself.
// Code which makes assumptions about memory layout of globals between
// __start_<section>/__end_<section> linker-generated symbols may need
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/__end_/__stop_/

// __start_<section>/__end_<section> linker-generated symbols may need
// manual adaptation.
auto Section = GV.getSection();
if (Section == "hwasan_globals" || Section == ".note.hwasan.globals")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check even needed if you move the call to instrumentGlobals above createHwasanCtorComdat?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since HWAddressSanitizer::instrumentGlobal creates new globals in section hwasan_globals, it depends if HWAddressSanitizer::instrumentGlobals can have input which has already been instrumented. My (naive) understanding is that this doesn't occur - but maybe it's possible somehow?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the sanitizer passes are only run once (it's a bug otherwise). instrumentGlobals produces the list of globals to be instrumented before calling instrumentGlobal so I don't think the new globals should matter.

I was discussing this change with @fmayer and we think it needs to exclude globals in the llvm.metadata section, i.e. synthetic globals created for global constructors, destructors and globals with __attribute__((used)), among other things.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice. I was looking for a flag marking a section (or global) as synthetic/compiler-internal, but didn't see such.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the call to instrumentGlobals above createHwasanCtorComdat

required test response lines to be shuffled around, otherwise seems fine

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice some other code does slightly different stuff:

static bool GlobalWasGeneratedByCompiler(GlobalVariable *G) {

if (Section.contains("__llvm") || Section.contains("__LLVM"))

among some others...

should I also check for contains("__llvm") / contains("__LLVM")?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the code a bit more, hwasan transform already checks

if (GV.hasSanitizerMetadata() && GV.getSanitizerMetadata().NoHWAddress)
      continue;

which seems like it should have covered the same case I was trying to detect by checking the section name:

  struct SanitizerMetadata {
    SanitizerMetadata()
        : NoAddress(false), NoHWAddress(false),
          Memtag(false), IsDynInit(false) {}
    // For ASan and HWASan, this instrumentation is implicitly applied to all
    // global variables when built with -fsanitize=*. What we need is a way to
    // persist the information that a certain global variable should *not* have
    // sanitizers applied, which occurs if:
    //   1. The global variable is in the sanitizer ignore list, or
    //   2. The global variable is created by the sanitizers itself for internal
    //      usage, or
    //   3. The global variable has __attribute__((no_sanitize("..."))) or
    //      __attribute__((disable_sanitizer_instrumentation)).
    //
    // This is important, a some IR passes like GlobalMerge can delete global
    // variables and replace them with new ones. If the old variables were
    // marked to be unsanitized, then the new ones should also be.
    unsigned NoAddress : 1;
    unsigned NoHWAddress : 1;

(case 2)
it seems like by definition, any new GlobalVariable within the hwasan transform should have NoHWAddress=1 set on it.

furthermore, the hwasan transform pass also already skips the global if GV.getName().starts_with("llvm.").

if (GV.isDeclarationForLinker() || GV.getName().starts_with("llvm.") ||
        GV.isThreadLocal())
      continue;

which catches llvm.used, etc.

is that pre-existing check enough? Or should GV.getSection() == "llvm.metadata" still be checked as well?

https://llvm.org/docs/LangRef.html#intrinsic-global-variables

This section and all globals that start with “llvm.” are reserved for use by LLVM.

A little unclear if this means intrinsic globals can be prefixed with "llvm." and be in a section not named "llvm.metadata", IMO.

hwasan-globals does not instrument globals with custom sections,
because existing code may use __start_/__stop_ symbols to iterate
over globals in such a way which will cause hwasan assertions.

Introduce new hwasan-all-globals option, which instruments all
user-defined globals (but not those globals which are generated by
the hwasan instrumentation itself), including those with custom sections.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hwasan: globals which have been assigned a section cannot be instrumented
4 participants