Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libbpf-tools/offcputime: Add -P option to post unwind user stack backtrace using dwarf unwind via libunwind #4463

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ekyooo
Copy link
Contributor

@ekyooo ekyooo commented Feb 5, 2023

This is not the finished version for review. Before I finish, I'd like to know that this feature is required by bcc.
Can you please let me know if post unwind using dwarf unwind via libunwind is needed and it helps with issues like #3515?

FYI, there are a few things left to do for this PR if needed:

  • Added ARCH dependent code for x86
  • Dump user stack and regs when in kernel mode
  • Add libunwind for build
  • etc.
  1. test environment
    Tested with profile in aarch64 environment.
    However, the profile's review PR is not in progress, so I apply it to offcputime to share the code.

  2. This is the test results.
    Before:

# ./profile -d -p `pidof test-strlen-abc-nofp-unwtbl`
Sampling at 49 Hertz of PID 83030 by user + kernel stack... Hit Ctrl-C to end.
^C    
    #0  0x0000007fbe74cc20 strerrordesc_np+0x1980 (/usr/lib/libc.so.6+0x9cc20)
    #1  0x0000007fbe6db30c __libc_start_main+0x9c (/usr/lib/libc.so.6+0x2b30c)
    #2  0x000000556f9c0754 _start+0x34 (/var/rootdirs/home/root/test-strlen-abc-nofp-unwtbl+0x754)
    -                test-strlen-abc (83030)
        290

After:

# ./profile -P -d -p `pidof test-strlen-abc-nofp-unwtbl`
Sampling at 49 Hertz of PID 83030 by user + kernel stack... Hit Ctrl-C to end.
^C    
    #0  0x0000007fbe74cc20 strerrordesc_np+0x1980 (/usr/lib/libc.so.6+0x9cc20)
    #1  0x000000556f9c087c c+0x30 (/var/rootdirs/home/root/test-strlen-abc-nofp-unwtbl+0x87c)
    #2  0x000000556f9c0840 b+0x8 (/var/rootdirs/home/root/test-strlen-abc-nofp-unwtbl+0x840)
    #3  0x000000556f9c082c a+0x8 (/var/rootdirs/home/root/test-strlen-abc-nofp-unwtbl+0x82c)
    #4  0x000000556f9c08c4 main+0x10 (/var/rootdirs/home/root/test-strlen-abc-nofp-unwtbl+0x8c4)
    #6  0x0000007fbe6db230 __libc_init_first+0x70 (/usr/lib/libc.so.6+0x2b230)
    -                test-strlen-abc (83030)
        1
  1. test executable information
    Test executable has no frame pointer, but has an unwind table (.eh_frame).
    3.1 compile option : -fomit-frame-pointer -funwind-tables
    3.2 code
#include <assert.h>
#include <stdlib.h>
#include <string.h>

char buf[8*1024*1024];

int a(void);
int b(void);
int c(void);

typedef size_t (*strlen_t)(const char *); 
strlen_t getlen = strlen;

int a(void)
{
        return b() - 1;
}

int b(void)
{
        return c() + 1;
}

int c() 
{
        memset(buf, '+', sizeof(buf)-1);

        while (1) {
                size_t len = getlen(buf);
                assert(len == sizeof(buf) - 1); 
        }

        // not reached
        return 1;
}

int main(int argc, char **argv)
{
        a();
}

Thank you.

@ekyooo ekyooo changed the title libbpf-tools/offcputime: Add -P option to post unwind user stack backtrace using dwarf unwind via libunwind [draft] libbpf-tools/offcputime: Add -P option to post unwind user stack backtrace using dwarf unwind via libunwind Feb 5, 2023
@yonghong-song
Copy link
Collaborator

@davemarchevsky could you also take a look at this patch?

@davemarchevsky davemarchevsky self-assigned this Feb 9, 2023
@yonghong-song
Copy link
Collaborator

I think this could be a useful feature. In most bcc tools, the user space gets stack_id and a list of addresses for the stack and then reconstruct the symbolized stack strace based recorded pid. If the application compiled with omitting frame pointer, it would be hard from the kernel to figure out the correct stack trace. In these cases, having a backup using user space libunwind seems a reasonable alternative. So please go ahead to finish your implementation. I see you use stack size 128. This may not be enough for real world big application which could be up to 4k. I think we should make stack size as part of the input for -P. If not specified, you could use 128 bytes.

@ekyooo
Copy link
Contributor Author

ekyooo commented Feb 13, 2023

Thank you for your answers and comments.
I'll go ahead to finish the implementation.
Thank you.

@ekyooo
Copy link
Contributor Author

ekyooo commented Jul 19, 2023

Not yet subject to review. Thank you.

@ekyooo ekyooo force-pushed the unwind_offcputime branch 2 times, most recently from 032adab to d689460 Compare September 6, 2023 11:03
@ekyooo ekyooo changed the title [draft] libbpf-tools/offcputime: Add -P option to post unwind user stack backtrace using dwarf unwind via libunwind libbpf-tools/offcputime: Add -P option to post unwind user stack backtrace using dwarf unwind via libunwind Sep 6, 2023
@@ -26,6 +29,10 @@ USE_BLAZESYM ?= 1
endif
endif

ifeq ($(ARCH),x86)
USE_LIBUNWIND = 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please tell me if LIBUNWIND should be enabled by default?

case 'P':
env.unwind = true;
break;
case OPT_SAMPLE_STACK_SIZE:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make stack size as part of the input for -P. If not specified, you could use 128 bytes.

offcputime has a positional argument for the duration, so -P cannot be set to OPTION_ARG_OPTIONAL.
Therefore, a separate option for stack size has been added.

@@ -26,6 +29,10 @@ USE_BLAZESYM ?= 1
endif
endif

ifeq ($(ARCH),x86)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, only build libunwind for x86 is added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test example using the test application in the description.

# ./offcputime -P --sample-stack-size 1024 -p `pidof test-strlen-abc-nofp-unwtbl` 
^C    
    bpf_prog_99b48c45746a8a74_sched_switch 
    bpf_prog_99b48c45746a8a74_sched_switch   
    bpf_trace_run3                           
    __traceiter_sched_switch                 
    __schedule                               
    schedule                                 
    exit_to_user_mode_prepare                
    irqentry_exit_to_user_mode               
    asm_sysvec_apic_timer_interrupt          
    strlen                                   
    c                                        
    b                                        
    a                                        
    main                                     
    __libc_start_main                        
    _start                                   
    [unknown]                                
    [unknown]                                
    -                test-strlen-abc (26307) 
        50                                   

test with -v option.

# ./offcputime -P -v --sample-stack-size 1024 -p `pidof test-strlen-abc-nofp-unwtbl` 
    #0  0xffffffffc001259c bpf_prog_99b48c45746a8a74_sched_switch+0x4bc         
    #1  0xffffffffc001259c bpf_prog_99b48c45746a8a74_sched_switch+0x4bc         
    #2  0xffffffff8116952e bpf_trace_run3+0x3e                                  
    #3  0xffffffff8109be54 __traceiter_sched_switch+0x34                        
    #4  0xffffffff81c5f7ba __schedule+0x30a                                     
    #5  0xffffffff81c5fbff schedule+0x3f                                        
    #6  0xffffffff810f3766 exit_to_user_mode_prepare+0x56                       
    #7  0xffffffff81c5d6d5 irqentry_exit_to_user_mode+0x5                       
    #8  0xffffffff81e00c42 asm_sysvec_apic_timer_interrupt+0x12                 
    #9  0x00007f9797d6fbc6 strlen+0x106 (/lib/x86_64-linux-gnu/libc-2.19.so+0x88bc6)                                                                            
    #10 0x00005582b6600716 c+0x2a (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x716)                                                                           
    #11 0x00005582b66006e4 b+0x9 (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x6e4)                                                                            
    #12 0x00005582b66006d3 a+0x9 (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x6d3)                                                                            
    #13 0x00005582b6600756 main+0x11 (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x756)                                                                        
    #14 0x00007f9797d08ec5 __libc_start_main+0xf5 (/lib/x86_64-linux-gnu/libc-2.19.so+0x21ec5)                                                                  
    #15 0x00005582b66005ea _start+0x2a (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x5ea)                                                                      
    #16 0xffffffff81c5d6d5 (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x5ea)  
    #17 0xffffffff81e00c42 (/root/es/unwind/test-strlen-abc-nofp-unwtbl+0x5ea)  
    -                test-strlen-abc (26307)                                    
        54                                                                      

Thank you for waiting so long.
Could you please review this?
Thank you.

@ekyooo
Copy link
Contributor Author

ekyooo commented Sep 8, 2023

In the last update, this feature is only built if USE_LIBUNWIND is enabled in the Makefile.
Thank you.

#include <libunwind-ptrace.h>
#include "unwind_types.h"

enum log_level {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a log level only for this feature(post unwinding)? Could you please tell me the role of this enum log_level?

struct unw_data data;
};

int unw_init(struct unw_info *u, pid_t pid, size_t user_stack_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about mentioning the meaning of each parameters for each APIs?

@@ -355,9 +427,25 @@ int main(int argc, char **argv)
*/
sleep(env.duration);

#ifdef USE_LIBUNWIND
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing USE_LIBUNWIND macros?

#ifdef USE_LIBUNWIND
if (env.unwind) {
if (post_unwind(u, sample_fd, ustack_fd, next_key.user_stack_id,
ip, env.perf_max_stack_depth) != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if ip were the last parameter.

@ekyooo
Copy link
Contributor Author

ekyooo commented Jan 11, 2024

@Bojun-Seo
Thank you for your comment.
Including your feedback, I've found many things I need to fix and improve in this patch.
Well, I'll come back soon with a revised patch.
Thanks again.

@ekyooo ekyooo marked this pull request as draft January 31, 2024 06:27
…ng libunwind

unwind.bpf.h: bpf helper API for user stack and register dumps
unwind_helpers: helper API for post dwarf unwind using libunwind
Use installed libunwind
(Only builds for x86 are supported in this patch.): check

Signed-off-by: Eunseon Lee <[email protected]>
…wind_helpers

Add -P option to use post dwarf unwind with libunwind.
With the option, bpf program dump user stack and user regs.
and user program get user stack traces using unwind_helpers to show output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants