Userspace stack tracing from kernel programs & gnu_debugdata support #466

brenns10 · 2025-02-05T23:36:44Z

This has always been a bit of a far-off idea, but with the module API it's working, some of the time (definitely not all of the time). So I thought I'd share it to see if some of the tweaks necessary to make it happen would be reasonable.

Basically, there's a GDB script called pstack that attaches GDB and takes a stack trace of a program. You can also use /proc/$PID/stack to get the kernel stack (assuming it's running in kernel mode or blocked). I was hoping to come up with a way to replicate that behavior in drgn, in a way that would work against /proc/kcore or /proc/vmcore. Essentially, it would allow you to get userspace stack traces from the crashed kernel (but not necessarily the whole core dump, like contrib/gcore.py) in the kdump kernel just before or after dumping the vmcore. (Presumably, userspace pages are filtered for 99.9% of all kdump configurations, so you'd need to run this while /proc/kcore is still available).

The main part of this requires creating a custom program that has a memory reader, as well as specifying all the required Modules, their biases, and their address ranges. From there, you can get the userspace struct pt_regs from the kernel program, copy it to the user program, and then unwind the stack.
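A minimal sketch of the memory-reader half of that setup, not the code from contrib/pstack.py. `MappedRanges` and its in-memory byte ranges are hypothetical. In a real script the bytes would come from the kernel program, for example through drgn's `access_remote_vm` helper. The `read` method follows the `(address, count, offset, physical)` callback shape that `Program.add_memory_segment()` documents. The fallback `FaultError` class is only there so the sketch runs without drgn installed.

```python
import bisect

try:
    from drgn import FaultError  # the real exception, when drgn is installed
except ImportError:
    class FaultError(Exception):
        """Stand-in with the same (message, address) shape as drgn.FaultError."""

        def __init__(self, message, address):
            super().__init__(message)
            self.message = message
            self.address = address


class MappedRanges:
    """Hypothetical helper: serve reads from non-overlapping address ranges."""

    def __init__(self, ranges):
        # ranges: iterable of (start_address, bytes) pairs
        self._ranges = sorted(ranges, key=lambda r: r[0])
        self._starts = [start for start, _ in self._ranges]

    def read(self, address, count, offset=0, physical=False):
        out = bytearray()
        while count > 0:
            # Find the last range that starts at or before the address.
            i = bisect.bisect_right(self._starts, address) - 1
            if i < 0:
                raise FaultError("address is not mapped", address)
            start, data = self._ranges[i]
            if address >= start + len(data):
                raise FaultError("address is not mapped", address)
            # Copy what this range can provide, then continue in the next one.
            chunk = data[address - start : address - start + count]
            out += chunk
            address += len(chunk)
            count -= len(chunk)
        return bytes(out)
```

Raising a fault (rather than returning short or zero-filled data) for unmapped addresses matters here, because the unwinder relies on faults to notice that it has run off the end of the stack.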

To do this, I've needed to tweak drgn a bit:

- Python memory readers may raise FaultError, but the resulting drgn_error has the wrong error code. So I added a special case that turns a Python FaultError back into a drgn fault error. This could be made more general, but I don't actually think it would be good to do that generally.
- I made the loaded and debug file biases writable, so that I could update them.
- I included a crude patch to support .gnu_debugdata, which is helpful for my use case.
- I needed to rip out the compatibility checks for stack tracing, because otherwise drgn would fail, saying that stack tracing is not supported for this program.

At the end of the day, I'm quite confident that none of this is ready for merge, but I did wonder if any of the individual changes make sense to include?

For fun, here's the result of running the contrib/pstack.py script against the current bash process:

$ python -m drgn -q --kernel-dir ~/vmlinux_repo/$(uname -r) -k contrib/pstack.py $$
#0  context_switch (kernel/sched/core.c:5328:2)
#1  __schedule (kernel/sched/core.c:6693:8)
#2  __schedule_loop (kernel/sched/core.c:6770:3)
#3  schedule (kernel/sched/core.c:6785:2)
#4  do_wait (kernel/exit.c:1697:3)
#5  kernel_wait4 (kernel/exit.c:1851:8)
#6  __do_sys_wait4 (kernel/exit.c:1879:13)
#7  do_syscall_x64 (arch/x86/entry/common.c:52:14)
#8  do_syscall_64 (arch/x86/entry/common.c:89:7)
#9  entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
#10 0x7ff5ba4d8b7a
------ userspace ---------
#0  wait4+0x1a/0xab
#1  waitchld.constprop.0+0xbb/0xa5f
#2  wait_for+0x4ca/0xc0e
#3  execute_command_internal+0x2768/0x2ef6
#4  execute_command+0xc8/0x1b8
#5  reader_loop+0x289/0x3d9
#6  main+0x15be/0x198b
#7  __libc_start_call_main+0x80/0xac
#8  __libc_start_main@@GLIBC_2.34+0x80/0x148
#9  _start+0x25/0x26
#10 ???

brenns10 · 2025-03-12T23:59:58Z

Still a draft - the .gnu_debugdata support is still a big hack. Though, if desired, I can yank that out to a separate pull request.

I've dropped the hack to allow setting the file bias, and I've also dropped the change that allows loading non-debug, non-loadable files for "extra modules". Neither is necessary now.

I was able to make drgn get the correct file bias for the main module, but unfortunately I had to load it as an "extra module". Otherwise, drgn mis-computes the file bias (see comment in code). AFAICT, that doesn't impact functionality much.
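For context, a sketch of the arithmetic behind a file bias, assuming the usual ELF loading rules. This is not drgn's implementation, and `load_bias` is a hypothetical name. `vm_start` and `vm_pgoff` are the kernel's `vm_area_struct` fields; `p_offset` and `p_vaddr` come from the `PT_LOAD` program header that the mapping covers. The 4 KiB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def load_bias(vm_start, vm_pgoff, p_offset, p_vaddr, page_size=PAGE_SIZE):
    """Return (runtime address - file virtual address) for one file mapping."""
    # File offset of the first byte of the mapping.
    file_offset = vm_pgoff * page_size
    # Virtual address, in the file's own address space, of that same byte.
    file_vaddr = p_vaddr + (file_offset - p_offset)
    # The bias is whatever must be added to file addresses to get runtime ones.
    return vm_start - file_vaddr
```

Every mapping of the same file should produce the same bias, which makes for a cheap consistency check across the text and data segments.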

I still had to tweak the stack trace code, but I didn't rip out or comment out the checks, just loosened one.

I made a few fixes to the contrib script:

- It now compares the inode number of the vma's file against the file on disk. That way it can warn the user if a program or shared library has been updated and likely won't be usable. I had actually spent a lot of time trying to figure out why this script wasn't working on so many of my processes, and it turned out that this was the reason 🙃
- I fixed the detection of the dynamic address for shared libraries.
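The inode check can be sketched roughly like this. `file_still_matches` is a hypothetical name, and `expected_ino` stands for whatever inode number the script reads from the mapping's file in the kernel. The real script may compare more than the inode number.

```python
import os


def file_still_matches(path, expected_ino):
    """Return True if the file at `path` has the inode the mapping refers to."""
    try:
        st = os.stat(path)
    except OSError:
        # The file was removed, or is not visible from this mount namespace.
        return False
    return st.st_ino == expected_ino
```

Package managers usually replace a file by writing a new one and renaming it into place, so the path stays the same while the inode number changes. That is why a running process can keep using a library whose on-disk copy no longer matches.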

brenns10 · 2025-03-13T20:50:52Z

I've gotten .gnu_debugdata to the point where I'm happy with it. The biggest issue was refactoring the ELF symbol finder so it supports iterating through multiple symbol tables. I did this in a separate commit to make it easier to review the actual .gnu_debugdata changes in isolation from that refactoring. It still needs tests, so it's not really ready for review yet -- sorry for the noise.

The current support for ELF symbol tables assumes that there can only be one table per module. However, symbol tables from .gnu_debugdata are intended to supplement the dynamic symbol table found in the main executable. Support for .gnu_debugdata will require that a module may contain multiple tables that must each be searched.

Signed-off-by: Stephen Brennan <[email protected]>
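To illustrate what searching multiple tables means for an address lookup, here is a small model in Python. The actual change is in libdrgn's C symbol finder; `Sym`, `find_symbol` and `format_pc` are hypothetical names, and the linear scan is only for clarity.

```python
from typing import NamedTuple, Optional, Sequence


class Sym(NamedTuple):
    name: str
    address: int
    size: int


def find_symbol(tables: Sequence[Sequence[Sym]], address: int) -> Optional[Sym]:
    """Search each table in turn for a symbol whose range contains address."""
    for table in tables:
        for sym in table:
            if sym.address <= address < sym.address + sym.size:
                return sym
    return None


def format_pc(tables: Sequence[Sequence[Sym]], address: int) -> str:
    """Format an address as name+offset/size, or as plain hex if unknown."""
    sym = find_symbol(tables, address)
    if sym is None:
        return hex(address)
    return f"{sym.name}+{address - sym.address:#x}/{sym.size:#x}"
```

With one table holding the .dynsym symbols and a second holding the .gnu_debugdata symbols, a lookup that misses in the first still has a chance to resolve in the second.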

The .gnu_debugdata section, also known as "MiniDebuginfo"[1], is an ELF section which contains a second, compressed ELF file. This contained file typically has all of its data stripped, except for a symbol table. The symbols contained in this section are meant to be used in addition to symbols that might be found in the ".dynsym" section.

For distributions like Fedora, this is intended to be used in combination with .eh_frame information that is provided in the loadable ELF file, so that stack traces can be automatically created for userspace crashes, even without debuginfo. To that end, GDB has support for these sections. It makes sense for drgn to also include support for these symbol tables, as we move toward supporting use cases where full DWARF data may not be available.

Closes osandov#465.

[1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo

Signed-off-by: Stephen Brennan <[email protected]>
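As a rough illustration of the "compressed ELF file" part: the section contents are LZMA (xz) compressed, so once the raw section bytes are in hand, Python's standard `lzma` module can unpack them. `unpack_gnu_debugdata` is a hypothetical helper, and extracting the section from the outer ELF file is left out of this sketch.

```python
import lzma

ELF_MAGIC = b"\x7fELF"


def unpack_gnu_debugdata(section_data: bytes) -> bytes:
    """Decompress raw .gnu_debugdata contents and return the embedded ELF."""
    embedded = lzma.decompress(section_data)
    # Sanity check: the payload should itself be an ELF file.
    if not embedded.startswith(ELF_MAGIC):
        raise ValueError("decompressed .gnu_debugdata is not an ELF file")
    return embedded
```

The returned bytes can then be parsed like any other ELF image to get at the extra symbol table.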

A nice way to get full coverage is to repeat the tests for ELF symbols, but for the fake ELF files, we split the symbols between a .dynsym symbol table and the .gnu_debugdata.

Signed-off-by: Stephen Brennan <[email protected]>

brenns10 · 2025-03-13T22:50:49Z

Ok, it's been a bit noisy but I now think this is in a good state. I've added testing for .gnu_debugdata by repeating the same ELF symbol tests, but with the symbols split between the loaded file and its .gnu_debugdata.

osandov

I looked at "libdrgn: python: pass fault errors through" and "libdrgn: stack_trace: allow unwinding custom programs" to start with, because I'd love to merge those independently of the rest of this. I'll look at the rest later.

libdrgn/python/error.c

libdrgn/stack_trace.c

Use the _cleanup_pydecref_ helpers.

Signed-off-by: Stephen Brennan <[email protected]>
Co-authored-by: Omar Sandoval <[email protected]>

Some parts of libdrgn detect drgn error codes and handle them appropriately. For instance, the stack tracing code expects to get a fault error. A drgn error that has been translated into a Python exception, and back to a drgn error, no longer retains its code. This means that if the stack tracing code is used with Python memory readers, the fault errors will not be treated as fault errors.

In general, it may not be a good idea to translate every Python exception back to a drgn error. There may be a small performance cost to doing so, and what's more: it can be quite useful to know that a Python error was wrapped into a drgn error. So for now, we'll make this a special case for FaultErrors, so that custom memory readers will behave as expected.

Signed-off-by: Stephen Brennan <[email protected]>
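A toy model of that special case, written in Python purely for illustration. The real logic is C code in libdrgn/python/error.c, and every name below (`Code`, `NativeError`, `error_from_python`, `is_end_of_stack`, and the local `FaultError` stand-in) is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class FaultError(Exception):
    """Stand-in for drgn.FaultError, carrying a message and an address."""

    def __init__(self, message, address):
        super().__init__(message)
        self.message = message
        self.address = address


class Code(Enum):
    FAULT = auto()
    OTHER = auto()


@dataclass
class NativeError:
    """Model of a native error: a code plus details."""
    code: Code
    message: str
    address: Optional[int] = None


def error_from_python(exc: BaseException) -> NativeError:
    """Convert a Python exception raised by a callback into a native error."""
    if isinstance(exc, FaultError):
        # Special case: keep the fault code and address intact.
        return NativeError(Code.FAULT, exc.message, exc.address)
    # Everything else stays a generic "Python exception" wrapper.
    return NativeError(Code.OTHER, f"{type(exc).__name__}: {exc}")


def is_end_of_stack(err: NativeError) -> bool:
    """Model of a consumer that treats only faults as a normal stopping point."""
    return err.code is Code.FAULT
```

Without the `isinstance` branch, a fault raised by a Python memory reader would come back as `Code.OTHER`, and a consumer such as the unwinder would report a hard error instead of ending the trace.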

Right now, we're a bit conservative in the stack tracing code: we only allow unwinding userspace cores, or Linux kernel programs. These are the only two types of programs for which we can get initial registers. However, if the user provides a pt_regs object, then there's no reason we can't try to do a stack trace with that. Move the check for live programs into drgn_get_initial_registers(), after we've already handled pt_regs.

Signed-off-by: Stephen Brennan <[email protected]>

This script is definitely not perfect: it assumes that the task is running in the same filesystem namespace as this script. It also cannot do much to handle unwinding through JIT compiled code, or anything which doesn't have a file mapping for which we can find debuginfo (though usually, this sort of code uses frame pointers). It tries its best to avoid getting misled by files which have changed on-disk (e.g. if a shared library is updated by your package manager).

Signed-off-by: Stephen Brennan <[email protected]>

brenns10 · 2025-03-14T22:20:47Z

Feel free to cherry-pick what you want if you're happy with those commits and I can always drop them in the rebase.

osandov

Thanks, I cherry-picked the error handling patches. I have one question about "libdrgn: stack_trace: allow unwinding anything with a pt_regs" before I cherry-pick that one.

libdrgn/stack_trace.c

brenns10 force-pushed the module_tweaks branch from d367210 to fd635c6 Compare March 12, 2025 23:41

brenns10 marked this pull request as draft March 12, 2025 23:42

brenns10 force-pushed the module_tweaks branch 3 times, most recently from bf401ab to aab5ff7 Compare March 13, 2025 20:43

brenns10 changed the title ~~[not ready for merge] Userspace stack tracing from kernel programs~~ Userspace stack tracing from kernel programs & gnu_debugdata support Mar 13, 2025

brenns10 marked this pull request as ready for review March 13, 2025 20:44

brenns10 marked this pull request as draft March 13, 2025 20:48

brenns10 force-pushed the module_tweaks branch from aab5ff7 to 36efe90 Compare March 13, 2025 22:23

brenns10 added 3 commits March 13, 2025 15:24

tests: Add test for .gnu_debugdata symbols

b694a28

brenns10 force-pushed the module_tweaks branch 2 times, most recently from f0fd5fd to a83839a Compare March 13, 2025 22:44

brenns10 marked this pull request as ready for review March 13, 2025 23:18

osandov requested changes Mar 14, 2025

brenns10 and others added 4 commits March 14, 2025 15:09

libdrgn/python: modernize drgn_error_from_python()

56bc964

brenns10 force-pushed the module_tweaks branch from a83839a to 40f809c Compare March 14, 2025 22:10

osandov reviewed Mar 17, 2025
