Userspace stack tracing from kernel programs & gnu_debugdata support #466
base: main
Conversation
Still a draft - I've dropped the hack to allow setting the file bias, and I've also dropped the change that allows loading non-debug, non-loadable files for "extra modules". Neither is necessary now. I was able to make drgn get the correct file bias for the main module, but unfortunately I had to load it as an "extra module". Otherwise, drgn mis-computes the file bias (see comment in code). AFAICT, that doesn't impact functionality much. I still had to tweak the stack trace code, but I didn't rip out/uncomment the checks, just loosened one. I made a few fixes to the contrib script:
Force-pushed from bf401ab to aab5ff7
I've gotten
The current support for ELF symbol tables assumes that there can only be one table per module. However, symbol tables from .gnu_debugdata are intended to supplement the dynamic symbol table found in the main executable. Support for .gnu_debugdata will require that a module may contain multiple tables that must each be searched. Signed-off-by: Stephen Brennan <[email protected]>
The .gnu_debugdata section, also known as "MiniDebuginfo"[1], is an ELF section which contains a second, compressed ELF file. This contained file typically has all of its data stripped, except for a symbol table. The symbols contained in this section are meant to be used in addition to symbols that might be found in the ".dynsym" section. For distributions like Fedora, this is intended to be used in combination with .eh_frame information that is provided in the loadable ELF file, so that stack traces can be automatically created for userspace crashes, even without debuginfo. To that end, GDB has support for these sections. It makes sense for drgn to also include support for these symbol tables, as we move toward supporting use cases where full DWARF data may not be available. Closes osandov#465. [1]: https://fedoraproject.org/wiki/Features/MiniDebugInfo Signed-off-by: Stephen Brennan <[email protected]>
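As a point of reference (not part of this commit), the embedded file can be inspected outside of drgn. This is a small sketch assuming pyelftools is installed, with /usr/bin/bash standing in for any binary stripped the Fedora way:

```python
import lzma

from elftools.elf.elffile import ELFFile  # pyelftools

with open("/usr/bin/bash", "rb") as f:  # example path; any MiniDebugInfo binary works
    section = ELFFile(f).get_section_by_name(".gnu_debugdata")
    if section is None:
        raise SystemExit("no MiniDebugInfo in this binary")
    # The section's contents are an xz-compressed ELF file whose .symtab
    # supplements the outer file's .dynsym.
    mini_elf = lzma.decompress(section.data())

with open("/tmp/mini-debuginfo.elf", "wb") as out:
    out.write(mini_elf)
# Inspect the supplementary symbol table with e.g.:
#   eu-readelf --symbols /tmp/mini-debuginfo.elf
```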
A nice way to get full coverage is to repeat the tests for ELF symbols, but for the fake ELF files, we split the symbols between a .dynsym symbol table and the .gnu_debugdata. Signed-off-by: Stephen Brennan <[email protected]>
Force-pushed from f0fd5fd to a83839a
Ok, it's been a bit noisy but I now think this is in a good state. I've added testing for
I looked at "libdrgn: python: pass fault errors through" and "libdrgn: stack_trace: allow unwinding custom programs" to start with, because I'd love to merge those independently of the rest of this. I'll look at the rest later.
Use the _cleanup_pydecref_ helpers. Signed-off-by: Stephen Brennan <[email protected]> Co-authored-by: Omar Sandoval <[email protected]>
Some parts of libdrgn detect drgn error codes and handle them appropriately. For instance, the stack tracing code expects to get a fault error. A drgn error that has been translated into a Python exception, and back to a drgn error, no longer retains its code. This means that if the stack tracing code is used with Python memory readers, the fault errors will not be treated as fault errors. In general, it may not be a good idea to translate every Python exception back to a drgn error. There may be a small performance cost to doing so, and what's more: it can be quite useful to know that a Python error was wrapped into a drgn error. So for now, we'll make this a special case for FaultErrors, so that custom memory readers will behave as expected. Signed-off-by: Stephen Brennan <[email protected]>
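To illustrate the case this addresses, here is a sketch against drgn's public Python API (the page layout and addresses are made up): a custom memory reader that raises FaultError, which with this change reaches libdrgn as a real fault error rather than a generic wrapped exception.

```python
import drgn

# Hypothetical backing store for the sketch.
PAGE_BASE = 0x400000
PAGE = b"\x90" * 0x1000

def read_memory(address, count, offset, physical):
    if PAGE_BASE <= address and address + count <= PAGE_BASE + len(PAGE):
        return PAGE[address - PAGE_BASE:address - PAGE_BASE + count]
    # Raising FaultError here now surfaces inside libdrgn as a fault error,
    # so e.g. the unwinder can treat it as the end of the stack.
    raise drgn.FaultError("address not mapped", address)

prog = drgn.Program(drgn.host_platform)
# Cover the whole address space so every read goes through the callback.
prog.add_memory_segment(0, 0xFFFFFFFFFFFFFFFF, read_memory)

print(prog.read(PAGE_BASE, 4))  # b'\x90\x90\x90\x90'
prog.read(0xDEAD0000, 4)        # raises drgn.FaultError from read_memory
```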
Right now, we're a bit conservative in the stack tracing code: we only allow unwinding userspace cores, or Linux kernel programs. These are the only two types of programs for which we can get initial registers. However, if the user provides a pt_regs object, then there's no reason we can't try to do a stack trace with that. Move the check for live programs into drgn_get_initial_registers(), after we've already handled pt_regs. Signed-off-by: Stephen Brennan <[email protected]>
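A sketch of what this allows from the Python side, assuming a custom Program (uprog) that already has memory readers and debug info loaded, so the struct pt_regs type can be looked up; the function name and regs_bytes are illustrative only:

```python
import drgn

def unwind_with_regs(uprog: drgn.Program, regs_bytes: bytes) -> drgn.StackTrace:
    # Rebuild a struct pt_regs value object from raw bytes copied out of
    # another program (assumes uprog's debug info defines struct pt_regs).
    pt_regs = drgn.Object.from_bytes_(uprog, "struct pt_regs", regs_bytes)
    # Previously stack_trace() bailed out because uprog is neither a
    # userspace core dump nor a Linux kernel program; with the check moved
    # into drgn_get_initial_registers(), the pt_regs path is tried first.
    return uprog.stack_trace(pt_regs)
```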
This script is definitely not perfect: it assumes that the task is running in the same filesystem namespace as this script. It also cannot do much to handle unwinding through JIT compiled code, or anything which doesn't have a file mapping for which we can find debuginfo (though usually, this sort of code uses frame pointers). It tries its best to avoid getting misled by files which have changed on-disk (e.g. if a shared library is updated by your package manager). Signed-off-by: Stephen Brennan <[email protected]>
Feel free to cherry-pick what you want if you're happy with those commits, and I can always drop them in the rebase.
Thanks, I cherry-picked the error handling patches. I have one question about "libdrgn: stack_trace: allow unwinding anything with a pt_regs" before I cherry-pick that one.
This has always been a bit of a far-off idea, but with the module API it's working, some of the time (definitely not all of the time). So I thought I'd share it to see if some of the tweaks necessary to make it happen would be reasonable.
Basically, there's a GDB script called pstack that attaches GDB and takes a stack trace of a program. You can also use /proc/$PID/stack to get the kernel stack (assuming it's running in kernel mode or blocked). I was hoping to come up with a way to replicate that behavior in drgn, in a way that would work against /proc/kcore or /proc/vmcore. Essentially, it would allow you to get userspace stack traces from the crashed kernel (but not necessarily the whole core dump, like contrib/gcore.py) in the kdump kernel just before or after dumping the vmcore. (Presumably, userspace pages are filtered for 99.9% of all kdump configurations, so you'd need to run this while /proc/kcore is still available).

The main part of this requires creating a custom program that has a memory reader, as well as specifying all the required Modules, their biases, and their address ranges. From there, you can get the userspace struct pt_regs from the kernel program, copy it to the user program, and then unwind the stack.
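As a rough sketch of that shape (this is not the contrib script; the PID is hypothetical, the single all-covering memory segment is a simplification, and module registration is elided), the bridge between the kernel program and the custom user program might look like:

```python
import drgn
from drgn.helpers.linux.mm import access_remote_vm
from drgn.helpers.linux.pid import find_task

# Kernel side: a normal drgn program for the running kernel (or /proc/kcore).
kprog = drgn.program_from_kernel()
task = find_task(kprog, 1234)  # hypothetical target PID

def read_user_memory(address, count, offset, physical):
    # Read the task's userspace memory by walking its page tables through
    # the kernel program; unmapped addresses are expected to fault, which
    # the unwinder can treat as the end of the stack.
    return access_remote_vm(task.mm, address, count)

# User side: a custom program whose only job is unwinding this task.
uprog = drgn.Program(kprog.platform)
uprog.add_memory_segment(0, 0xFFFFFFFFFFFFFFFF, read_user_memory)
# ... register the task's mapped files as modules with their biases here,
# copy the task's userspace pt_regs into uprog, and call uprog.stack_trace().
```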
To do this, I've needed to tweak drgn a bit:

- When a fault error from a Python memory reader is translated back into a drgn_error, it has the wrong error code. So I added a special-case to pass through fault errors back to the drgn error. This could be made more general, but I don't actually think it would be good to do that generally.
- Support for the symbol tables in .gnu_debugdata, which is helpful for my use case.

At the end of the day, I'm quite confident that none of this is ready for merge, but I did wonder if any of the individual changes make sense to include?
For fun, here's the result of running the contrib/pstack.py script against the current bash process: