gpu offload host code generation #142097

Draft: wants to merge 7 commits into base: master

@ZuseZ4 (Member) commented Jun 5, 2025

r? ghost

This will generate most of the host-side code needed to use LLVM's offload feature.
The first PR will only handle automatic memory transfers to and from the device.
So if a user calls a kernel, we will copy the inputs to the device and back, but we won't do the actual kernel launch.
Before merging, we will use LLVM's debug infrastructure to verify that the memcopies match what OpenMP offload generates in C++.
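
When comparing against OpenMP offload output, it can help to decode the map-type values that describe each transfer. The sketch below decodes the `.offload_maptypes` values that show up in the CI log later in this thread (800, 544, 547, 544). It assumes these use the map-type flag bits from LLVM's OpenMP offload runtime as I understand them; the `FLAGS` table and `decode_maptype` are illustrative names, not part of this PR, and the bit values should be checked against the LLVM version in use.

```rust
/// Subset of map-type flag bits, ASSUMED to match LLVM's OpenMP offload
/// runtime. Verify against your LLVM sources before relying on them.
const FLAGS: [(u64, &str); 5] = [
    (0x001, "TO"),           // copy host -> device
    (0x002, "FROM"),         // copy device -> host
    (0x020, "TARGET_PARAM"), // passed as a kernel argument
    (0x100, "LITERAL"),      // passed by value, no mapping
    (0x200, "IMPLICIT"),     // mapping was not written by the user
];

/// Returns the names of all known flag bits set in `value`,
/// in ascending bit order. Unknown bits are ignored.
fn decode_maptype(value: u64) -> Vec<&'static str> {
    FLAGS
        .iter()
        .filter(|&&(bit, _)| value & bit != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

Under these assumptions, `decode_maptype(547)` yields `TO`, `FROM`, `TARGET_PARAM`, `IMPLICIT`, which is the "copy in and copy back" case described above, while 544 decodes to `TARGET_PARAM`, `IMPLICIT` with no transfer in either direction.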

A follow-up PR will generate the actual device-side kernel, which will then do the computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize the cases where a user has to override our default handling due to performance issues.

I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to proper macros, as I'm already doing for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@ZuseZ4 ZuseZ4 added F-gpu_offload `#![feature(gpu_offload)]` and removed A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@rust-log-analyzer
Collaborator

The job mingw-check-2 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    Checking rustc_hir_typeck v0.0.0 (/checkout/compiler/rustc_hir_typeck)
error: unused variable: `ti8`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:686:17
    |
686 |             let ti8 = cx.type_i8();
    |                 ^^^ help: if this is intentional, prefix it with an underscore: `_ti8`
    |
    = note: `-D unused-variables` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(unused_variables)]`

error: unused variable: `global`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:694:17
    |
694 |             let global = cx.declare_global("my_struct_global", offload_entry_ty);
    |                 ^^^^^^ help: if this is intentional, prefix it with an underscore: `_global`

error: unused variable: `global`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:695:17
    |
695 |             let global = cx.declare_global("my_struct_global2", kernel_arguments_ty);
    |                 ^^^^^^ help: if this is intentional, prefix it with an underscore: `_global`

error: unused variable: `o_sizes`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:729:21
    |
729 |                 let o_sizes = add_priv_unnamed_arr(&cx, &format!(".offload_sizes.{num}"), &vec![8u64,0,16,0]);
    |                     ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_o_sizes`

error: unused variable: `o_types`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:730:21
    |
730 |                 let o_types = add_priv_unnamed_arr(&cx, &format!(".offload_maptypes.{num}"), &vec![800u64, 544, 547, 544]);
    |                     ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_o_types`

error: unused variable: `size_ty`
   --> compiler/rustc_codegen_llvm/src/back/lto.rs:708:25
    |
708 |                     let size_ty = cx.type_array(ti64, vals.len() as u64);
    |                         ^^^^^^^ help: if this is intentional, prefix it with an underscore: `_size_ty`

[RUSTC-TIMING] rustc_codegen_llvm test:false 3.347
error: could not compile `rustc_codegen_llvm` (lib) due to 6 previous errors
warning: build failed, waiting for other jobs to finish...
[RUSTC-TIMING] rustc_mir_transform test:false 4.817
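
All six errors above are the `unused_variables` lint, which `-D warnings` promotes to a hard error in CI. A minimal sketch of the two usual ways to resolve it follows. `declare_global` and `build_globals` here are hypothetical stand-ins operating on a `Vec<String>`, not the `rustc_codegen_llvm` functions from the log.

```rust
/// Hypothetical stand-in for a call that is made for its side effect
/// (here: registering a name) and also returns a handle.
fn declare_global(registry: &mut Vec<String>, name: &str) -> usize {
    registry.push(name.to_string());
    registry.len() - 1
}

// Denying the lint on this function mirrors what `-D warnings` implies in CI.
#[deny(unused_variables)]
fn build_globals(registry: &mut Vec<String>) -> usize {
    // Option 1: keep the side effect and mark the binding as intentionally
    // unused with a leading underscore, as the compiler's help text suggests.
    let _entry = declare_global(registry, "my_struct_global");

    // Option 2: actually use the value (here by returning it), which is
    // the real fix once the surrounding code is complete.
    declare_global(registry, "my_struct_global2")
}
```

The underscore prefix is usually the better stopgap for work-in-progress code, since it keeps the call and its side effects in place while the value is not yet consumed.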

Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. F-gpu_offload `#![feature(gpu_offload)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
3 participants