Userspace stack tracing from kernel programs & gnu_debugdata support #466

Open
wants to merge 7 commits into
base: main

brenns10
Contributor

@brenns10 brenns10 commented Feb 5, 2025

This has always been a bit of a far-off idea, but with the module API it's working, some of the time (definitely not all of the time). So I thought I'd share it to see if some of the tweaks necessary to make it happen would be reasonable.

Basically, there's a GDB script called pstack that attaches GDB and takes a stack trace of a program. You can also use /proc/$PID/stack to get the kernel stack (assuming it's running in kernel mode or blocked). I was hoping to come up with a way to replicate that behavior in drgn, in a way that would work against /proc/kcore or /proc/vmcore. Essentially, it would allow you to get userspace stack traces from the crashed kernel (but not necessarily the whole core dump, like contrib/gcore.py) in the kdump kernel just before or after dumping the vmcore. (Presumably, userspace pages are filtered for 99.9% of all kdump configurations, so you'd need to run this while /proc/kcore is still available).

The main part of this requires creating a custom program that has a memory reader, as well as specifying all the required Modules, their biases, and their address ranges. From there, you can get the userspace struct pt_regs from the kernel program, copy it to the user program, and then unwind the stack.
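As a rough illustration of the "biases" mentioned above, the sketch below computes the load bias of one file-backed mapping from the VMA start address, the file offset the VMA maps, and a single PT_LOAD program header. The `LoadSegment`, `Mapping`, and `load_bias` names are hypothetical and nothing here calls drgn; it only shows the arithmetic a script would plausibly need before handing a bias to the module API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadSegment:
    """The two PT_LOAD program header fields used here (hypothetical helper)."""
    p_offset: int  # file offset where the segment starts
    p_vaddr: int   # virtual address the file was linked at


@dataclass(frozen=True)
class Mapping:
    """A file-backed VMA: its start address and the file offset it maps."""
    start: int
    file_offset: int


def load_bias(mapping: Mapping, segment: LoadSegment) -> int:
    # Runtime address of file offset X:   mapping.start + (X - mapping.file_offset)
    # Link-time address of file offset X: segment.p_vaddr + (X - segment.p_offset)
    # The bias is their difference, which does not depend on X.
    return (mapping.start - mapping.file_offset) - (segment.p_vaddr - segment.p_offset)
```

For a position-independent executable whose text segment has `p_offset == p_vaddr == 0x1000` and is mapped at `0x555555555000`, this yields a bias of `0x555555554000`; for a non-PIE binary mapped at its link address, it yields 0.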

To do this, I've needed to tweak drgn a bit:

  1. Python memory readers may raise FaultError, but the resulting drgn_error has the wrong error code. So I added a special case that passes fault errors through to the drgn error. This could be made more general, but I don't think translating every Python exception this way would be a good idea.
  2. I made loaded & debug file biases writable, so that I could update the biases.
  3. I included a crude patch to support .gnu_debugdata, which is helpful for my use case.
  4. I needed to rip out the compatibility checks for stack tracing, because otherwise drgn would fail saying that stack tracing is not supported for this program.
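To make item 1 concrete, here is a minimal sketch of why the error type matters to a consumer such as an unwinder. The `FaultError` class below is a local stand-in (so the sketch runs without drgn installed), and `SegmentReader` and `read_u64_or_none` are hypothetical names; drgn's actual callback signature and registration call are not shown.

```python
from typing import Dict, Optional


class FaultError(Exception):
    """Stand-in for drgn.FaultError: an invalid memory access at `address`."""

    def __init__(self, message: str, address: int) -> None:
        super().__init__(message)
        self.address = address


class SegmentReader:
    """Serves reads from a mapping of start address -> bytes (hypothetical)."""

    def __init__(self, segments: Dict[int, bytes]) -> None:
        self.segments = segments

    def read(self, address: int, count: int) -> bytes:
        for start, data in self.segments.items():
            # Only satisfy reads that fit entirely inside one segment.
            if start <= address and address + count <= start + len(data):
                offset = address - start
                return data[offset:offset + count]
        raise FaultError("address is not mapped", address)


def read_u64_or_none(reader: SegmentReader, address: int) -> Optional[int]:
    """Read a little-endian u64, treating a fault as 'stop here', not a crash."""
    try:
        return int.from_bytes(reader.read(address, 8), "little")
    except FaultError:
        return None
```

A caller that stops cleanly on `None` only works if the fault arrives as a fault. If the exception were flattened into a generic error on the way through, the `except FaultError` branch would never run, which is the behavior the pass-through change is meant to avoid.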

At the end of the day, I'm quite confident that none of this is ready for merge, but I did wonder if any of the individual changes make sense to include?

For fun, here's the result of running the contrib/pstack.py script against the current bash process:

$ python -m drgn -q --kernel-dir ~/vmlinux_repo/$(uname -r) -k contrib/pstack.py $$
#0  context_switch (kernel/sched/core.c:5328:2)
#1  __schedule (kernel/sched/core.c:6693:8)
#2  __schedule_loop (kernel/sched/core.c:6770:3)
#3  schedule (kernel/sched/core.c:6785:2)
#4  do_wait (kernel/exit.c:1697:3)
#5  kernel_wait4 (kernel/exit.c:1851:8)
#6  __do_sys_wait4 (kernel/exit.c:1879:13)
#7  do_syscall_x64 (arch/x86/entry/common.c:52:14)
#8  do_syscall_64 (arch/x86/entry/common.c:89:7)
#9  entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
#10 0x7ff5ba4d8b7a
------ userspace ---------
#0  wait4+0x1a/0xab
#1  waitchld.constprop.0+0xbb/0xa5f
#2  wait_for+0x4ca/0xc0e
#3  execute_command_internal+0x2768/0x2ef6
#4  execute_command+0xc8/0x1b8
#5  reader_loop+0x289/0x3d9
#6  main+0x15be/0x198b
#7  __libc_start_call_main+0x80/0xac
#8  __libc_start_main@@GLIBC_2.34+0x80/0x148
#9  _start+0x25/0x26
#10 ???

@brenns10 brenns10 marked this pull request as draft March 12, 2025 23:42
@brenns10
Contributor Author

Still a draft - the .gnu_debugdata support is still a big hack. Though, if desired, I can yank that out to a separate pull request.

I've dropped the hack to allow setting the file bias, and I've also dropped the change that allows loading non-debug, non-loadable files for "extra modules". Neither is necessary now.

I was able to make drgn get the correct file bias for the main module, but unfortunately I had to load it as an "extra module". Otherwise, drgn mis-computes the file bias (see comment in code). AFAICT, that doesn't impact functionality much.

I still had to tweak the stack trace code, but I didn't rip out or comment out the checks, just loosened one.

I made a few fixes to the contrib script:

  • It now checks the inode number of the vma and file. That way it can warn the user if a program or shared library has been updated and likely won't be usable. I had actually spent a lot of time trying to figure out why this script wasn't working on so many of my processes, and it turned out that this was the reason 🙃
  • I fixed the detection of the dynamic address for shared libraries.
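The inode check in the first bullet can be sketched roughly as follows. This is not the code from contrib/pstack.py: `mapping_matches_file` is a hypothetical helper, and it assumes the inode number recorded for the mapping has already been read from the kernel. It compares inode numbers only, so it could be fooled by a file on a different filesystem that happens to share the number.

```python
import os


def mapping_matches_file(path: str, mapped_inode: int) -> bool:
    """Return True if the file now at `path` still has the inode that was mapped."""
    try:
        return os.stat(path).st_ino == mapped_inode
    except OSError:
        # The file was deleted, or is not visible from this filesystem namespace.
        return False
```

When a package manager replaces a shared library, it usually writes a new file and renames it over the old path, so the path ends up with a different inode while the running process keeps the old one mapped. A `False` result is a hint that symbols loaded from the on-disk file probably won't match the process.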

@brenns10 brenns10 force-pushed the module_tweaks branch 3 times, most recently from bf401ab to aab5ff7 Compare March 13, 2025 20:43
@brenns10 brenns10 changed the title [not ready for merge] Userspace stack tracing from kernel programs Userspace stack tracing from kernel programs & gnu_debugdata support Mar 13, 2025
@brenns10 brenns10 marked this pull request as ready for review March 13, 2025 20:44
@brenns10 brenns10 marked this pull request as draft March 13, 2025 20:48
@brenns10
Contributor Author

I've gotten .gnu_debugdata to the point where I'm happy with it. The biggest issue was refactoring the ELF symbol finder so it supports iterating through multiple symbol tables. I did this in a separate commit to make it easier to review the actual .gnu_debugdata changes in isolation. It still needs tests, so it's not really ready for review yet -- sorry for the noise.
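The idea behind the refactoring can be shown with a toy model. The sketch below is plain Python rather than libdrgn's C implementation, and `Symbol` and `find_symbol_by_address` are hypothetical names: it just shows a lookup that walks several tables per module instead of assuming there is exactly one.

```python
from typing import Iterable, List, NamedTuple, Optional


class Symbol(NamedTuple):
    name: str
    address: int
    size: int


def find_symbol_by_address(
    tables: Iterable[List[Symbol]], address: int
) -> Optional[Symbol]:
    """Search every table; return the first symbol whose range contains address."""
    for table in tables:
        for sym in table:
            if sym.address <= address < sym.address + sym.size:
                return sym
    return None


# Example layout (addresses are made up): an exported function lives in .dynsym,
# while a local function is only present in the .gnu_debugdata symbol table.
dynsym = [Symbol("wait4", 0x1000, 0xAB)]
debugdata = [Symbol("waitchld.constprop.0", 0x2000, 0xA5F)]
```

With both tables passed in, `find_symbol_by_address([dynsym, debugdata], 0x20BB)` resolves to `waitchld.constprop.0`, whereas searching `dynsym` alone finds nothing for that address.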

The current support for ELF symbol tables assumes that there can only be
one table per module. However, symbol tables from .gnu_debugdata are
intended to supplement the dynamic symbol table found in the main
executable. Support for .gnu_debugdata will require that a module may
contain multiple tables that must each be searched.

Signed-off-by: Stephen Brennan <[email protected]>

The .gnu_debugdata section, also known as "MiniDebuginfo"[1], is an ELF
section which contains a second, compressed ELF file. This contained
file typically has all of its data stripped, except for a symbol table.
The symbols contained in this section are meant to be used in addition
to symbols that might be found in the ".dynsym" section.

For distributions like Fedora, this is intended to be used in
combination with .eh_frame information that is provided in the loadable
ELF file, so that stack traces can be automatically created for
userspace crashes, even without debuginfo. To that end, GDB has support
for these sections.

It makes sense for drgn to also include support for these symbol tables,
as we move toward supporting use cases where full DWARF data may not be
available.

Closes osandov#465.

[1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo

Signed-off-by: Stephen Brennan <[email protected]>

A nice way to get full coverage is to repeat the tests for ELF symbols,
but for the fake ELF files, we split the symbols between a .dynsym
symbol table and the .gnu_debugdata.

Signed-off-by: Stephen Brennan <[email protected]>
@brenns10 brenns10 force-pushed the module_tweaks branch 2 times, most recently from f0fd5fd to a83839a Compare March 13, 2025 22:44
@brenns10
Contributor Author

Ok, it's been a bit noisy but I now think this is in a good state. I've added testing for .gnu_debugdata by repeating the same ELF symbol tests, but with the symbols split between the loaded file and its .gnu_debugdata.
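For readers unfamiliar with the container format being tested here, a minimal sketch of unpacking it follows. It assumes the section contents have already been extracted as bytes and that they are xz/LZMA-compressed, as Fedora's MiniDebugInfo feature describes; `unpack_gnu_debugdata` is a hypothetical helper, not part of drgn.

```python
import lzma

ELF_MAGIC = b"\x7fELF"


def unpack_gnu_debugdata(section_bytes: bytes) -> bytes:
    """Decompress raw .gnu_debugdata contents and check that an ELF file came out."""
    embedded = lzma.decompress(section_bytes)
    if not embedded.startswith(ELF_MAGIC):
        raise ValueError(".gnu_debugdata did not decompress to an ELF file")
    return embedded
```

The returned bytes are a second, mostly stripped ELF file whose symbol table would then be parsed alongside the outer file's .dynsym.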

@brenns10 brenns10 marked this pull request as ready for review March 13, 2025 23:18
Owner

@osandov osandov left a comment

I looked at "libdrgn: python: pass fault errors through" and "libdrgn: stack_trace: allow unwinding custom programs" to start with, because I'd love to merge those independently of the rest of this. I'll look at the rest later.

brenns10 and others added 4 commits March 14, 2025 15:09
Use the _cleanup_pydecref_ helpers.

Signed-off-by: Stephen Brennan <[email protected]>
Co-authored-by: Omar Sandoval <[email protected]>

Some parts of libdrgn detect drgn error codes and handle them
appropriately. For instance, the stack tracing code expects to get a
fault error. A drgn error that has been translated into a Python
exception, and back to a drgn error, no longer retains its code. This
means that if the stack tracing code is used with Python memory readers,
the fault errors will not be treated as fault errors.

In general, it may not be a good idea to translate every Python
exception back to a drgn error. There may be a small performance cost to
doing so, and what's more: it can be quite useful to know that a Python
error was wrapped into a drgn error. So for now, we'll make this a
special case for FaultErrors, so that custom memory readers will behave
as expected.

Signed-off-by: Stephen Brennan <[email protected]>

Right now, we're a bit conservative in the stack tracing code: we
only allow unwinding userspace cores, or Linux kernel programs. These
are the only two types of programs for which we can get initial
registers. However, if the user provides a pt_regs object, then there's
no reason we can't try to do a stack trace with that.

Move the check for live programs into drgn_get_initial_registers(),
after we've already handled pt_regs.

Signed-off-by: Stephen Brennan <[email protected]>

This script is definitely not perfect: it assumes that the task is
running in the same filesystem namespace as this script. It also cannot
do much to handle unwinding through JIT compiled code, or anything which
doesn't have a file mapping for which we can find debuginfo (though
usually, this sort of code uses frame pointers). It tries its best to
avoid getting misled by files which have changed on-disk (e.g. if a
shared library is updated by your package manager).

Signed-off-by: Stephen Brennan <[email protected]>
@brenns10
Contributor Author

Feel free to cherry-pick what you want if you're happy with those commits, and I can always drop them in the rebase.

Owner

@osandov osandov left a comment

Thanks, I cherry-picked the error handling patches. I have one question about "libdrgn: stack_trace: allow unwinding anything with a pt_regs" before I cherry-pick that one.


2 participants