Skip to content

Conversation

lyonbeckers
Copy link

@lyonbeckers lyonbeckers commented Sep 23, 2025

Only provides function name to CallContext on debug builds for a fair bit of an optimization on release builds.

Result of the benchmarks:

with changes
   -- call_callv_rust_fn         ...      0.856μs      0.880μs
   -- call_callv_custom          ...      0.750μs      0.767μs
without
   -- call_callv_rust_fn         ...      1.126μs      1.164μs
   -- call_callv_custom          ...      0.791μs      0.808μs

Less of a gain on the custom call side, but I think that's mostly just because GString.to_string is that much more expensive, or your mileage may vary depending on what the implementation of std::fmt::Display is for a RustCallable in the wild.

Bonus: I added the option to provide a release argument to check.sh, but let me know if I should put that up in a separate PR. Just made testing this so much easier (although I feel like I should clarify that the benchmarks above were not debug vs release, to be clear).

What an adventure this has been 😅

I'm creating this as a draft PR because when I ran the workflow on my fork, it failed a couple checks, failing the codegen_test > changed_enum_apis, but I'm not quite sure how that could be.

Copy link
Member

@Bromeon Bromeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

with opt

You mean with the changes of this PR (and not "with optimized builds"), right?
If yes, that's really impressive 👀

@lyonbeckers
Copy link
Author

Thanks a lot!

with opt

You mean with the changes of this PR (and not "with optimized builds"), right? If yes, that's really impressive 👀

Yeah sorry, to be clear, that should maybe be "with the changes" and "without the changes". I'll edit to be clearer

@GodotRust
Copy link

API docs are being generated and will be shortly available at: https://godot-rust.github.io/docs/gdext/pr-1331

@Bromeon Bromeon added c: core Core components performance Performance problems and optimizations labels Sep 23, 2025
@lyonbeckers lyonbeckers marked this pull request as ready for review September 24, 2025 10:19
@TitanNano
Copy link
Contributor

From a user's perspective, it's a bit strange that the callable name gets "optimized away". Seeing "optimized away" usually means some kind of "debug Symbol stripping", but the callable name is still there; it's just not being used.
My question would be, could we just optimize the call by requiring a &str so the string is only allocated once? That would be a breaking change but could be the better user experience.

@lyonbeckers
Copy link
Author

lyonbeckers commented Sep 24, 2025

From a user's perspective, it's a bit strange that the callable name gets "optimized away". Seeing "optimized away" usually means some kind of "debug Symbol stripping", but the callable name is still there; it's just not being used. My question would be, could we just optimize the call by requiring a &str so the string is only allocated once? That would be a breaking change but could be the better user experience.

I'm not quite sure if I'm understanding what you mean, but the the string isn't necessarily there, it's provided as a GString from userdata, which if I'm understanding correctly has to be converted to a String in order to be passed as a &str. I think the docs mention that GString stores the string data differently, so some conversion needs to take place. I don't know enough about it to determine why it is a GString, if it's something we pass to userdata or are provided with, and if it is because it's needed on the engine side, or that we'd have the option to do otherwise. So I'd have to defer to someone else to speak to that.

As for RustCallable, the string isn't really there, it's composed from the implementation of Display. I'm also pretty ignorant on this point, but I assume this was chosen for flexibility.

Either way, after #1328 I'm a little gunshy to get too deep into a sweeping change considering this is pretty low hanging fruit. If there's a better solution that lets us keep the function name I'd say maybe it can be a separate PR.

@Bromeon
Copy link
Member

Bromeon commented Sep 24, 2025

Seeing "optimized away" usually means some kind of "debug Symbol stripping", but the callable name is still there; it's just not being used.

Hm, the debug stripping is also where I got this from, but isn't it the same from user PoV? We perform an optimization to not display the callable name 🤔

@lyonbeckers could you maybe add a small test that checks Callable::to_string() in Release mode? There's a linux-release CI job that compiles with --release, and some existing tests already use this distinction. This would demonstrate the behavior as it's seen by godot-rust users.


Either way, after #1328 I'm a little gunshy to get too deep into a sweeping change considering this is pretty low hanging fruit. If there's a better solution that lets us keep the function name I'd say maybe it can be a separate PR.

I agree on this, small incremental PRs are quite nice 🙂

@TitanNano
Copy link
Contributor

Hm, the debug stripping is also where I got this from, but isn't it the same from user PoV? We perform an optimization to not display the callable name 🤔

We could still use the name if we just optimize the string allocation by i.e. caching it somewhere, instead of just not using the name we already got. I feel like we either go all the way and remove the callable name entirely (in non debug builds) or we just optimize the critical part without restricting usability.

I agree on this, small incremental PRs are quite nice 🙂

yes I also think keeping it smal is good. I wanted to put it out there because the impact of the optimization feels a bit arbitrary. "Why is the name not available here but in other places it is?"

We can also move this to a separate PR or issue if you think it's worth pursuing.

@Bromeon
Copy link
Member

Bromeon commented Sep 24, 2025

Very good points! Now I see that this is indeed a bit a strange middle-ground between caching and discarding the string.

Caching might actually be quite simple, it's used in two ways:

  • as String, the location changed here:

    let name = {
    let w: &FnWrapper<F> = CallableUserdata::inner_from_raw(callable_userdata);
    w.name.to_string()
    };

  • as GString, when being invoked by Godot's to_string FFI callback:

    pub unsafe extern "C" fn rust_callable_to_string_named<F>(
    callable_userdata: *mut std::ffi::c_void,
    r_is_valid: *mut sys::GDExtensionBool,
    r_out: sys::GDExtensionStringPtr,
    ) {
    let w: &mut FnWrapper<F> = CallableUserdata::inner_from_raw(callable_userdata);
    w.name.clone().move_into_string_ptr(r_out);
    *r_is_valid = sys::conv::SYS_TRUE;
    }

So storing an additional String might already bring some benefits.

Would be interesting to look if there's a notable perf difference between caching and entirely discarding the string.
Of course storing 2 string fields also has memory overhead (+ allocation), but could maybe be omitted in Release (or with the fast-unsafe profile in the future)...

@lyonbeckers
Copy link
Author

Added tests for the expected outcomes in release vs debug. I think this is a sensible way of testing it? I based it a little off of the test for dynamic_call_with_panic.

Would be interesting to look if there's a notable perf difference between caching and entirely discarding the string.

Heyo we've got some benches now, that'll be handy 😎

@lyonbeckers
Copy link
Author

Not sure why the test failed on CI, it passes on ./check itest --release. Taking a look at some of the other logs in the output of linux-release though and I'm seeing <optimized out> produced by other tests, so something's going on. I'll try running the tests manually (ie: not via ./check) to try and match what the CI is doing

@Bromeon
Copy link
Member

Bromeon commented Sep 25, 2025

so something's going on

#1336, so rebase onto master once that's merged 👌

@lyonbeckers
Copy link
Author

Oh, I noticed the clippy issue, but would it have impacted the test? I guess we'll see 🤞

@lyonbeckers
Copy link
Author

lyonbeckers commented Sep 26, 2025

Ah, okay. Figured it out. My test was only passing locally because I hadn't changed the gdextension path in the itest godot project. Don't know how I overlooked that I'd have to do that, oops. On that note, I'll probably just remove the --release flag I added from check to avoid any further confusion that may cause.

The way the test was working was by intercepting the panic received. The test fails because we actually don't report the call context in release build panics. The call context is passed to ERROR_CONTEXT_STACK which actually gets compiled out in release anyway.

#[cfg(debug_assertions)]
ERROR_CONTEXT_STACK.with(|cell| unsafe {
// SAFETY: &error_context is valid for lifetime of function, and is removed from LAST_ERROR_CONTEXT before end of function.
cell.borrow_mut().push_function(&error_context)
});

The context is also passed to call_error as a CallError and then processed with report_call_error, but that's just for printing a log via godot_error!. We don't forward the CallError, but I don't think that would help us anyway.

fn report_call_error(call_error: CallError, track_globally: bool) -> i32 {
// Print failed calls to Godot's console.
// TODO Level 1 is not yet set, so this will always print if level != 0. Needs better logic to recognize try_* calls and avoid printing.
// But a bit tricky with multiple threads and re-entrancy; maybe pass in info in error struct.
if has_error_print_level(2) {
godot_error!("{call_error}");
}
// Once there is a way to auto-remove added errors, this could be always true.
if track_globally {
call_error_insert(call_error)
} else {
0
}
}

I might be able to intercept it from this track_globally path, but we pass false into the relevant calls and I don't know what the implications are in changing that.

So unless there's some way of hooking into the godot console output that I'm not aware of, I don't actually know if there's a way I actually can write a release mode test for this, at least without making further changes to the caller?

@Yarwin
Copy link
Contributor

Yarwin commented Sep 26, 2025

I'll check if we can use https://docs.godotengine.org/en/stable/classes/class_logger.html anyhow.

@Yarwin
Copy link
Contributor

Yarwin commented Sep 26, 2025

Quick napkin code points out that logger might be good call. Tomorrow/at Sunday we might think out how to actually use it in the itest.

As Godot docs points out: Warning: This function may be called from multiple different threads, so you may need to do your own locking.

    #[derive(GodotClass)]
    #[class(init, base = Logger)]
    pub struct GodotRustLogger {
        error: Mutex<String>,
        base: Base<Logger>
    }

    #[godot_api]
    impl ILogger for GodotRustLogger {
        fn log_error(&mut self, function: GString, file: GString, line: i32, code: GString, rationale: GString, editor_notify: bool, error_type: i32, backtrace: Array<godot::prelude::Gd<godot::classes::ScriptBacktrace>>) {
            let mut val = self.error.lock().unwrap();
            *val = code.to_string();
        }
    }

    #[itest(focus)]
    fn callable_context_custom_panic() {
        // We need to poke ScriptBacktrace's `class_id` – otherwise it might be tried to be initialized simultaneously by two threads at once, causing the panic. Might/should be done in init although :thinking:.
        let _script_backtrace = Gd::<godot::classes::ScriptBacktrace>::class_id();
        let logger = GodotRustLogger::new_gd();
        Os::singleton().add_logger(&logger);
        let callable_name = if cfg!(debug_assertions) {
            "test()"
        } else {
            "<optimized out>"
        };
        let expected_msg = format!("godot-rust function call failed: <Callable>::{}()\n    Reason: function panicked: TEST: 0", callable_name);
(...)
        assert_eq!(
            *logger.bind().error.lock().unwrap(),
            expected_msg
        );
Run Godot integration tests...
  Focused run -- execute only selected Rust tests.
  Rust: found 1 tests in 1 files.

   callable_test.rs:
ERROR: godot-rust function call failed: <Callable>::<optimized out>()
    Reason: function panicked: TEST: 0                                                                                                                                                                                              
   at: godot_core::private::report_call_error (godot-core/src/private.rs:490)                                                                                                                                                       
   GDScript backtrace (most recent call first):
       [0] _ready (res://TestRunner.gd:77)                                                                                                                                                                                          
Mutex { data: "godot-rust function call failed: <Callable>::<optimized out>()\n    Reason: function panicked: TEST: 0", poisoned: false, .. }
   -- callable_context_custom_panic ... ok

Test result: ok. 1 passed; 0 failed (focused run).
  Time: 0.00s.
  Benchmarks skipped (focused run).

==============================                                                                                                                                                                                                      
godot-rust: checks SUCCESSFUL.                                                                                                                                                                                                      
==============================                                                                                                                                                                                                      
                                                                                                                                                                                                                                    
Total duration: 19s.

@Yarwin
Copy link
Contributor

Yarwin commented Sep 28, 2025

I created a branch with support for Godot's Logger to be added after the new prebuilt arrives :).

It will be used like so:

    #[cfg(all(not(debug_assertions), since_api = "4.5"))]
    #[itest(focus)]
    fn callable_context_custom_panic() {
        let expected_msg = "godot-rust function call failed: <Callable>::<optimized out>()\n    Reason: function panicked: TEST: 0";
        let logger = crate::common::itest_logger(GString::from(expected_msg));

        let received = Arc::new(AtomicU32::new(0));
        let callable = Callable::from_custom(PanicCallable(received));
        callable.callv(&varray![]);
        assert!(logger
            .bind()
            .was_error_called());
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: core Core components performance Performance problems and optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants