Description
For our project, I implemented a version of Result
which uses Try
trait in such a way as to track the error handling locations, so we get a "call stack" of the error at the end. This works in conjunction with #[track_caller]
very well and we see the locations the error handling took.
However, there is one major deficiency in Location
- it only gives us the source name and line. Yes, with that, it's possible to manually look up the function where it happened, but it would be significantly simpler to evaluate bug reports by looking at the call trace with function names (we have practiced this in a large C++ project, where basically the dev support standardized on initially evaluating everything by function names in the call trace, ignoring line numbers). So it would be extremely useful if the Location
would be extended with another &str
containing the function name (ideally with generics). It can be also a mangled name, I don't care much, but the function name is important.
Before you shout "backtrace!" - yes, but... We are using heavy asynchronous processing, handling errors across await
s, where the backtrace has about zero value. Similar for tracing, we can't just keep the trace active permanently due to performance reasons. So the error handling "call stack" is a perfect compromise - cheap and sufficiently valuable (except that it's missing the function name).
According to my code study of the Rust compiler code, it should basically boil down to getting the function name from the current Span
and adding it in addition to the file name to the generated const Location
here:
I raised this initially in a comment in #47809. Here some observations from the discussion there:
- It's clear that adding the function name will increase generated string section (by potentially quite a bit), because instead of a single string per file we'll need to store potentially many strings per file.
- It's also clear that compilation speed might be minimally affected (only where the location is actually used, though), because instead of a single string with the file name another string with the function name must be created.
To alleviate this, storing function name could be made optional at compilation time, e.g., using a compiler flag.
Of course, one could argue that this could be also done in post-processing of traced errors. Of course, that's a possibility. But that's another step, which makes it quite cumbersome to use and needs a lot of resources and exact source code. The compiler already knows it at compile time and aside from costing more space in the generated executable (string section), there should be no adverse effects of having the function name in the Location
.
Further, having the function name in Location
, it would in general allow building various tools which use the location for debugging purposes, which in turn would help the community in general. So I think it's worth extending it.
@anp, @eddyb & @aturon you seem to have worked on this topic recently, so CC to you.
Activity
eddyb commentedon Mar 31, 2022
You shouldn't need to. You can use something like
self.instance
to get theInstance
that describes the function you're in, and you can probably get access to the mangled symbol efficiently enough (something liketcx.symbol_name(self.instance)
I would assume - it might even be cached around from codegen needing it already for the function definition itself).I wouldn't store the "function name" but rather the mangled symbol (frankly, ideally a
fn
pointer to the function itself, so that you can use e.g. debuginfo instead, if you want, but that would inhibit optimizations sadly).With the now-impending v0 mangling transition, the mangled symbol will both contain generics, and be more efficient than the expanded string. We could maybe even force v0 mangling for this usecase and ignore other settings.
For demangling, I don't think it would be a stretch to include
rustc-demangle
incore
-std
does so already, for backtraces (cc @yaahc), and if it were incore
they could share it.schreter commentedon Mar 31, 2022
Well, I'm no expert in compiler building, my focus is databases, so I'd prefer to leave the implementation to the compiler experts who know what they are doing :-). But sure, if there is a more efficient way, that's even better.
Yes, I'd personally also prefer the mangled symbol.
Mhm. A crazy idea: What about putting the names into a special ELF string section (not regular initialized data or debuginfo), so it could be trimmed, if needed (resulting in
None
for function names). I.e., it would generate references into this extra string section and the dynamic linker would either resolve them, if the section exists, or set them to NULL if not. Not sure if the dynamic linker can do such a feat, though. Alternatively,Location
could store only 32-bit string offset and size into this section and then based on whether the starting symbol of this section is found or not at runtime, it would either generate the name or not. OTOH, this should be known in advance and then instead of trimming, one can as well tell the compiler not to generate function names forLocation
via a flag.eddyb commentedon Mar 31, 2022
Sorry, what I should've said was that
Span
refers to "source code location", which function is being compiled (and, crucially, which choices of generic parameters) is elsewhere. There's not really any reasonable way to go from "position in file" to "function definition path + generics", anyway, efficiencies aside.Ideas for
Location
using different sections has been brought up before (#70579) and it definitely seems plausible if we can come up with an amenable user interface for it (e.g. how do you make "stripping" remove the data from the binary but reserve the right range so it gets zeroed in memory or something).Sadly custom debuginfo is pretty tricky in LLVM, otherwise that could be interesting to experiment with.
yaahc commentedon Mar 31, 2022
We pretty much abandoned the idea of moving Backtrace into core, but maybe we could move just the functionality we need here into core?
WaffleLapkin commentedon Apr 1, 2022
There is some previous discussion about getting the function name in https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/macro.20for.20getting.20the.20function.20name/near/269583366
BGR360 commentedon May 20, 2022
Piping in to say that I desire the exact same thing for the exact same use case (return-tracing powered by
Try
and#[track_caller]
).I also brought this up in an old zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/function.20names.20in.20.60.23.5Btrack_caller.5D.60
I am willing to work on the implementation of this if I get buy-in and a little bit of mentoring from somebody. We really desire this at my place of work and I'm on a team where I can dedicate work hours to doing this.
schreter commentedon Jun 26, 2022
@BGR360 Were you already able to work on the topic? We would like to use it as well (that's why I opened the issue), but I don't have anyone with the right skillset to implement it and/or help you with mentoring. The only help I can offer is what I found in rustc code (where the
Location
is generated) and what I linked above.BGR360 commentedon Jun 26, 2022
@schreter thanks for the reminder. I started tinkering with this today while on the plane.
I think I'll be able to get something working :)
@rustbot claim
BGR360 commentedon Jun 27, 2022
I do have a question if anyone can help me.
I started out implementing this for the rustc_const_eval interpreter context, and leaving the function as
<unknown>
for anyconst_caller_location
users.But when I tried to test that, all the function names from
Location::caller()
came out as<unknown>
. I realized this must be because rustc is going down theconst_caller_location
path for those, even when not used in aconst
context.Is there any way I can write a UI test that is guaranteed to go through the
InterpCtx::get_closest_untracked_caller_location
path? I tried doingcompile-flags: -C opt-level=0
but that didn't seem to be enough.If so, I'll update the existing UI tests accordingly, cuz I assume we're missing out on code coverage for the
InterpCtx
path.eddyb commentedon Jun 27, 2022
That sounds right, IIRC
const_caller_location
allocates the constant representing aSpan
as a target&'static core::panic::Location
- that is, theLocation
is "const", not the context it's acquired in.You would need to add another argument to it,
Option<ty::Instance<'tcx>>
perhaps?Though if you use
tcx.symbol_name(self.instance)
from codegen, it might be more efficient to takeOption<SymbolName>
(or whatever thesymbol_name
return type is) as the extra argument.Either way, you'd have to pass that in to be able to add it to the
Location
. Also I would strongly advise against emitting"<unknown>"
as a string literal, and instead useOption<&'static str>
or similar for the relevantLocation
field (which would ideally contain a mangled symbol).Also, when used from a const context, miri definitely also has the same amount of information anyway, so the
Option
is probably unnecessary once you get that working.In terms of dealing with miri internals and other CTFE details, cc @oli-obk
BGR360 commentedon Jun 27, 2022
Oh, why's that? I was thinking of just using
Instance
as the other part of the key.Yeah I originally wanted to use
Option<&'a str>
but I didn't know how to properly write anOption
field usingPlace::write_*. It was simpler to model it off the existing code (which currently puts
""as the filename if
-Z location-detail` has been set a certain way).Yep, that's my plan. Do you think
Location
wants to provide a public method to demangle it? I read the chatter above saying thatstd
already does demangling stuff, but I didn't find it exposed publicly anywhere.juntyr commentedon Nov 16, 2023
@BGR360 Were you able to make further progress on your implementation? I stumbled across this issue yesterday after working on my own error wrapper that collects the location of every construction/propagation. Could I help contribute to prototype an implementation?
hudson-ayers commentedon Nov 19, 2023
It would be good to profile the size cost of any implementation of this -- adding filename / line number already has a significant cost for embedded
no_std
binaries that include panic handlers (#70580). If it is added I think we should make it excludable vialocation-detail
(#89920), but that is a nightly-only flag so the cost matters either way.3 remaining items