itrofimow
Contributor

This patch tweaks the symtab processing logic in llvm-gsymutil conversion
so that symbols of unknown type can get into the final GSYM. Such symbols
are common in handwritten assembly in the wild, which often lacks a ".type function" directive.

@llvmbot
Member

llvmbot commented Dec 10, 2024

@llvm/pr-subscribers-debuginfo

Author: None (itrofimow)

Full diff: https://github.com/llvm/llvm-project/pull/119307.diff

1 Files Affected:

  • (modified) llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp (+7-1)
diff --git a/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp b/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
index 122de4deea5dfd..d5059ce34f5221 100644
--- a/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
+++ b/llvm/lib/DebugInfo/GSYM/ObjectFileTransformer.cpp
@@ -90,7 +90,10 @@ llvm::Error ObjectFileTransformer::convert(const object::ObjectFile &Obj,
       // TODO: Test this error.
       return AddrOrErr.takeError();
 
-    if (SymType.get() != SymbolRef::Type::ST_Function ||
+    if ((SymType.get() != SymbolRef::Type::ST_Function &&
+         // We allow unknown (yet with valid text address) symbols,
+         // since these are common with handwritten assembly in the wild.
+         SymType.get() != SymbolRef::Type::ST_Unknown) ||
         !Gsym.IsValidTextAddress(*AddrOrErr))
       continue;
     // Function size for MachO files will be 0
@@ -105,6 +108,9 @@ llvm::Error ObjectFileTransformer::convert(const object::ObjectFile &Obj,
         consumeError(Name.takeError());
       continue;
     }
+    // Could happen with ST_Unknown symbols.
+    if (Name->empty())
+      continue;
     // Remove the leading '_' character in any symbol names if there is one
     // for mach-o files.
     if (IsMachO)
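
As a reading aid, the filtering rule after this change can be restated as a standalone predicate. This is only a sketch, not the LLVM code: `SymKind`, `shouldConvert`, and the `ValidTextAddr` flag are hypothetical stand-ins for `SymbolRef::Type`, the `continue` checks in the loop, and the result of `Gsym.IsValidTextAddress`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for SymbolRef::Type; only the cases relevant here.
enum class SymKind { Unknown, Data, Function, Other };

// Returns true when a symtab entry would be kept under the patched rule:
// the type is Function or Unknown, the address lies in a text section,
// and the symbol has a non-empty name.
bool shouldConvert(SymKind Kind, bool ValidTextAddr, const std::string &Name) {
  const bool TypeAccepted =
      Kind == SymKind::Function || Kind == SymKind::Unknown;
  if (!TypeAccepted || !ValidTextAddr)
    return false;
  // Unknown-typed symbols may have empty names; those are skipped.
  return !Name.empty();
}
```

Under this model, an unknown-typed symbol is kept only when it also has a valid text address and a non-empty name, so data symbols and nameless entries are still skipped.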

@itrofimow
Contributor Author

itrofimow commented Dec 10, 2024

This aims to fix the following problem.
Given a function FZERO, which somehow ends up with a zero size in the resulting GSYM,
and a function FUNKNOWN, whose instructions immediately follow FZERO and whose symbol has type ST_Unknown in the symtab,
all lookups of FUNKNOWN addresses resolve to FZERO because:

  1. FUNKNOWN is not present in the resulting GSYM (which this patch aims to change)
  2. FZERO is found by binary search, and this logic then accepts it as a match:
    // Make sure the current function address ranges contains \a Addr.
    // Some symbols on Darwin don't have valid sizes, so if we run into a
    // symbol with zero size, then we have found a match for our address.
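
The misattribution can be reproduced with a toy model of the lookup. This is a simplified sketch under stated assumptions, not the GSYM reader's code: `FuncEntry` and `lookup` are hypothetical names, the table is assumed to be sorted by start address, and a size of 0 is treated as "matches any address at or after the start", mirroring the comment quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified function table entry.
struct FuncEntry {
  uint64_t Start;
  uint64_t Size; // 0 means the size is unknown
  std::string Name;
};

// Binary-search for the last entry whose Start <= Addr, then accept it if
// Addr falls inside its range or if its size is zero. Returns nullptr when
// no entry matches. Entries must be sorted by Start.
const FuncEntry *lookup(const std::vector<FuncEntry> &Sorted, uint64_t Addr) {
  auto It = std::upper_bound(
      Sorted.begin(), Sorted.end(), Addr,
      [](uint64_t A, const FuncEntry &E) { return A < E.Start; });
  if (It == Sorted.begin())
    return nullptr; // Addr is below every known function
  --It;
  if (It->Size == 0 || Addr < It->Start + It->Size)
    return &*It;
  return nullptr;
}
```

With only `{0x1000, 0, "FZERO"}` in the table, an address such as 0x1010 that belongs to FUNKNOWN resolves to FZERO. Once an entry `{0x1008, 0, "FUNKNOWN"}` is present as well, the same address resolves to FUNKNOWN, which is the effect the patch is after.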

@itrofimow itrofimow changed the title [GSYM] Allow executable symtab symbols with unknown type [GSYM] Allow converting executable symtab symbols with unknown type Dec 10, 2024
@itrofimow
Contributor Author

Hi @clayborg!
I see you have recently reviewed some other GSYM-related PRs, could you please have a look at this one?

@itrofimow
Contributor Author

Hi @alx32! I see you've merged some GSYM-related pull requests recently, could you please give this one a look?

@alx32
Contributor

alx32 commented Feb 7, 2025

@itrofimow any reason not to include a test for this?

@itrofimow
Contributor Author

Closing in favor of #147332

@itrofimow itrofimow closed this Jul 10, 2025