bpf2go: support multiple source files #1758

Open
yonch opened this issue Apr 18, 2025 · 6 comments · May be fixed by #1762
yonch commented Apr 18, 2025

Background and Motivation

Currently, the bpf2go tool only supports compiling a single source file at a time. This limits our ability to create modular eBPF programs composed of multiple source files. Supporting multiple source files would enable better code organization, reusability, and testability for eBPF programs -- improving the developer experience.

Use Cases

  1. Modular Components: Develop reusable eBPF components that can be shared across multiple programs, and break complex eBPF programs into logical components with separate source files.
  2. Testability: Create test harnesses for eBPF code without including test code in production builds

Concrete Example - RMID Allocation in Unvariance Collector

In the Unvariance Collector project, we need to allocate Resource Monitoring IDs (RMIDs) to processes. These RMIDs come from a limited space (typically 0 to ~460 on certain processors). We want to:

  1. Create a library for RMID allocation/deallocation with a free list
  2. Track metadata for each RMID and maintain state consistency
  3. Test this functionality in isolation
  4. Include this code in the main collector without the test harness

Ideally, we would:

  • Develop the RMID allocation as a self-contained package with its own header file
  • Create a test harness in a separate file that includes and tests this functionality
  • Link both during testing
  • Link just the RMID allocation code (without test harness) with the main collector
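
As a rough illustration of the two builds described above, the `go:generate` directives might look like the sketch below. This is the hypothetical invocation under this proposal: bpf2go does not accept multiple source files today, and the package name, identifiers (`rmidTest`, `collector`), and file names are all made up for the example.

```go
// Package collector is a hypothetical layout; every file and identifier
// name below is illustrative and not taken from the Unvariance Collector.
package collector

// Test build (proposed syntax): the RMID allocator plus its test harness,
// compiled separately and linked into one object.
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go rmidTest rmid.bpf.c rmid_harness.bpf.c

// Production build (proposed syntax): the collector plus the RMID
// allocator, without the test harness.
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go collector collector.bpf.c rmid.bpf.c
```

The allocator source appears in both directives, so the same code is tested and shipped, while the harness never reaches the production object.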

See unvariance/collector#109 for more info.

Proposed Implementation

After examining the codebase (particularly in cmd/bpf2go), I suggest the following changes:

  1. Modify the b2g struct in main.go to accept multiple source files:

    • Change SourceFile string to SourceFiles []string
    • Update the command-line parsing to collect multiple positional arguments
  2. Refactor the convert() function:

    • Extract the compilation logic from the convert function into a new compileOne function
    • When multiple source files are provided, compile each one in sequence
    • This separation allows compiling multiple source files before linking
  3. Add a linking step when multiple source files are provided:

    • Create a new function in the gen package (e.g., Link) similar to existing Compile
    • The function would invoke bpftool gen object to combine multiple object files into one
    • Skip the linking step entirely when only one source file is given
  4. Handle dependencies correctly when multiple files are used:

    • Parse dependencies for each source file (already present)
    • Merge the dependencies into a single set for the final object file (new functionality)
    • Write dependencies (already present)
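
The steps above can be sketched as follows. This is a minimal sketch, not the code in cmd/bpf2go or in PR #1762: the `compileFunc` and `linkFunc` types, `build`, `mergeDeps`, and `bpftoolLinkArgs` are hypothetical names introduced for illustration, and the compile and link steps are injected as functions so the flow can be shown without invoking clang or bpftool. The only external detail assumed is the `bpftool gen object OUTPUT INPUT [INPUT...]` argument order.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// compileFunc compiles one C source file and returns the resulting object
// path and the files it depends on. Hypothetical stand-in for compileOne.
type compileFunc func(source string) (obj string, deps []string, err error)

// linkFunc combines several object files into one output object.
// Hypothetical stand-in for the proposed gen.Link.
type linkFunc func(output string, objs []string) error

// bpftoolLinkArgs builds the argv for static linking:
// bpftool gen object OUTPUT INPUT [INPUT...]
func bpftoolLinkArgs(output string, objs []string) []string {
	return append([]string{"bpftool", "gen", "object", output}, objs...)
}

// mergeDeps unions per-source dependency lists into one sorted,
// de-duplicated set for the final object file.
func mergeDeps(perSource ...[]string) []string {
	seen := make(map[string]struct{})
	var merged []string
	for _, deps := range perSource {
		for _, d := range deps {
			if _, ok := seen[d]; ok {
				continue
			}
			seen[d] = struct{}{}
			merged = append(merged, d)
		}
	}
	sort.Strings(merged)
	return merged
}

// build compiles each source in sequence and links only when more than
// one object was produced. It returns the final object path and the
// merged dependency set.
func build(sources []string, output string, compile compileFunc, link linkFunc) (string, []string, error) {
	if len(sources) == 0 {
		return "", nil, fmt.Errorf("no source files given")
	}
	var objs []string
	var allDeps [][]string
	for _, src := range sources {
		obj, deps, err := compile(src)
		if err != nil {
			return "", nil, fmt.Errorf("compile %s: %w", src, err)
		}
		objs = append(objs, obj)
		allDeps = append(allDeps, deps)
	}
	if len(objs) == 1 {
		// Single-file path: no link step.
		return objs[0], mergeDeps(allDeps...), nil
	}
	if err := link(output, objs); err != nil {
		return "", nil, fmt.Errorf("link: %w", err)
	}
	return output, mergeDeps(allDeps...), nil
}

func main() {
	// Fake compile step: derives an object name and a fixed dependency list.
	compile := func(src string) (string, []string, error) {
		obj := strings.TrimSuffix(filepath.Base(src), ".c") + ".o"
		return obj, []string{"vmlinux.h", src}, nil
	}
	// Fake link step: prints the bpftool command instead of running it.
	link := func(output string, objs []string) error {
		fmt.Println(strings.Join(bpftoolLinkArgs(output, objs), " "))
		return nil
	}
	out, deps, err := build([]string{"collector.bpf.c", "rmid.bpf.c"}, "linked.o", compile, link)
	fmt.Println(out, deps, err)
}
```

Injecting the compile and link steps keeps the orchestration testable on machines without clang or bpftool; a real implementation would run the command from `bpftoolLinkArgs` with `os/exec`.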

Backward Compatibility

When only one source file is provided, the tool should behave exactly as it does now. This ensures backward compatibility with existing workflows.
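
One way to keep single-file invocations unchanged is to treat every positional argument after the identifier as a source file, so that one source simply yields a one-element list. The sketch below assumes that arguments after a `--` separator are C flags; the `invocation` type and `parsePositional` function are hypothetical and do not reflect the actual parsing code in cmd/bpf2go.

```go
package main

import (
	"errors"
	"fmt"
)

// invocation holds the parsed positional arguments. Hypothetical type;
// SourceFiles mirrors the proposed SourceFile -> SourceFiles change.
type invocation struct {
	Ident       string
	SourceFiles []string
	CFlags      []string
}

// parsePositional splits args into an identifier, one or more source
// files, and optional C flags following a "--" separator.
func parsePositional(args []string) (invocation, error) {
	var inv invocation
	positional := args
	for i, a := range args {
		if a == "--" {
			positional = args[:i]
			inv.CFlags = args[i+1:]
			break
		}
	}
	if len(positional) < 2 {
		return inv, errors.New("expected <ident> <source file> [<source file>...]")
	}
	inv.Ident = positional[0]
	inv.SourceFiles = positional[1:]
	return inv, nil
}

func main() {
	// Existing single-file form: parses to a one-element source list.
	single, err := parsePositional([]string{"collector", "collector.bpf.c", "--", "-O2"})
	fmt.Println(single.Ident, single.SourceFiles, single.CFlags, err)

	// Proposed multi-file form.
	multi, err := parsePositional([]string{"collector", "collector.bpf.c", "rmid.bpf.c"})
	fmt.Println(multi.Ident, multi.SourceFiles, multi.CFlags, err)
}
```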

Related Issues and Discussions

  • Discussion #1142, "bpf2go should be able to generate Go code from object files built out-of-band": This discussion considered generating Go code from a given object file, which would support this use case by allowing users to create multiple object files out-of-band and use bpftool to link them together. However, as mentioned in that discussion, this approach would require users to understand internal plumbing and handle target-specific details manually. Our proposal solves this by using bpf2go to build all object files automatically without exposing internal complexity.

  • Issue #466, "elf: support bpf object linking (bpftool gen object)": This issue calls for support of BPF object linking and explicitly mentions bpftool gen object, which we intend to use. However, it focuses specifically on handling weak symbols. While this remains an open issue, users who don't use weak symbols shouldn't encounter these problems with our proposed multi-file support.

I'd be happy to work on implementing this feature. Looking forward to your feedback!

yonch linked a pull request on Apr 21, 2025 that will close this issue
Author

yonch commented Apr 23, 2025

@lmb I submitted PR #1762 -- is this a contribution the team will accept, and if so, who would be a good reviewer to approach?

Author

yonch commented Apr 28, 2025

@ti-mo can you help with #1762?

The project moved to libbpf-rs for now, but I'm happy to see this contribution through if done in the next few days while it's fresh. Would appreciate guidance!

Collaborator

ti-mo commented May 6, 2025

> The project moved to libbpf-rs for now, but I'm happy to see this contribution through if done in the next few days while it's fresh. Would appreciate guidance!

Thanks for the PR! What's the reason for the move to Rust, if I may ask? I've pulled in a few reviewers explicitly to get things moving.

Ack on the decision to require that bpf2go manage the compilation of all objects; that will keep overall internal complexity and API surface down. If users want more control over the compilation process, we can always cater to specific use cases and add more knobs later on. This looks good as a first approach.

On the question of weak symbols, that's not an issue about bpf2go specifically, but rather of the library as a whole. The way bpf2go handles compilation and linking has little impact on this. However, including the feature in bpf2go may give a user the false impression that linked objects are fully supported in the lib, which may not be the case. It would be ideal if someone (you 🙂) could dogfood the implementation so the overall support gets iterated on quickly as issues pop up.

Author

yonch commented May 7, 2025

The move to Rust was due to parquet libraries; we needed good object store support, and wanted to have control over the amount of buffer memory, and the arrow+parquet implementation in Rust provided an easy pathway.

On weak symbols, I did not deliberately use weak symbols, and this patch worked in our repository. I'll keep an eye out for any issues in future dealings with bpf2go.

Thank you for the PR comments! I'll try and get to those today.

Collaborator

ti-mo commented May 12, 2025

@yonch Thanks for elaborating! Let us know when you've got something ready.

Author

yonch commented May 23, 2025

Hi folks, I've been pretty swamped and haven't gotten around to these. They're still on my plate, but on the back burner. If you find someone who wants to pick up where I left off, I have no objection.
