Add tracing_chrome under "tracing" feature #4406
I am not sure why tests fail, when I tried locally they failed even on the latest commit on |
I am on May I know what os are you currently using? And is the failure the same as the one in the CI? |
@tiif On my PC the tests that fail run into errors like "error: unsupported operation: extern static
I am also on |
I suspect the CI failure might be triggered by the change in this PR, as there were successful CI runs after this one in https://github.com/rust-lang/miri/actions. (I haven't read the code closely, so this is just a guess :) About the unsupported error, I think maybe the toolchain is outdated. You can try to follow the steps here https://github.com/rust-lang/miri/blob/master/CONTRIBUTING.md#preparing-the-build-environment to update it if you haven't done so recently. If the problem persists, feel free to ask for help in zulip. |
The file was taken unmodified from the following link, except for file attributes at the top: https://github.com/thoren-d/tracing-chrome/blob/7e2625ab4aeeef2f0ef9bde9d6258dd181c04472/src/lib.rs Depending on the tracing-chrome crate from crates.io is unfortunately not possible since it depends on `tracing_core`, which conflicts with rustc_private's `tracing_core` (meaning it would not be possible to use the [ChromeLayer] in a context that expects a [Layer] from rustc_private's `tracing_core` version).
The tracing_chrome Layer has a connected Guard that needs to be dropped before exiting the process, or else data may not be flushed to the tracing file. This required making a few changes to main() to ensure std::process::exit() is not called without making sure to first drop the guard.
The tests pass locally. I rebased on master and force pushed to make the CI rerun, but it still fails. Both now and earlier, I think the cause for the CI failure was "memory allocation of 408 bytes failed" on windows-latest (see here and here). mac also fails due to "No space left on device (os error 28)" (see here). I don't understand why ubuntu fails with code 143 without printing any error. I don't understand how changes in this PR can cause the tests to use more memory (assuming that's the culprit). The behavior of the code in this PR should be the same as before when no |
The build was failing because the `ci/ci.sh` script passed `CARGO_EXTRA_FLAGS=--all-features` and thus enabled the tracing feature. This caused every test to collect tracing information and save it to a file, which filled up the worker filesystem and led to the errors above. I have now changed `ci/ci.sh` to only include the features relevant for tests, i.e. all but "tracing": `genmc`, `stack-cache`, `stack-cache-consistency-check`.
src/bin/miri.rs
Outdated
@@ -211,6 +219,10 @@ impl rustc_driver::Callbacks for MiriCompilerCalls {
    if return_code != rustc_driver::EXIT_SUCCESS {
        eprintln!("FAILING SEED: {seed}");
        if !many_seeds.keep_going {
            // drop the tracing guard before exiting, so tracing calls are flushed correctly
            if let Ok(mut lock) = tracing_guard.try_lock() {
                let _guard_being_dropped = (*lock).take();
An explicit call to `drop` seems clearer here.
But it's not great that we apparently have to remember to do this in multiple places...
Yeah, I also don't like that much, but it felt like the safest option. An alternative would be to call `libc::atexit()` or to use a crate like https://crates.io/crates/ctor, though both options involve `unsafe`.
Here's an idea: move everything in `fn after_analysis` that is after `init_late_loggers` into a separate inherent method on `MiriCompilerCalls` that returns an `ExitCode`. Have the guard inside that separate method so it gets dropped when the method returns, and then there's just a single `exit` call in `after_analysis` afterwards.
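A rough sketch of that refactoring idea, under stated assumptions: `CompilerCalls`, `run_analysis`, `FlushGuard` and the `Log` type are hypothetical stand-ins rather than Miri's actual code, and the log only exists to make the drop order visible.

```rust
use std::cell::RefCell;
use std::process::ExitCode;

/// Records the order of events so the drop order can be observed.
type Log = RefCell<Vec<&'static str>>;

/// Hypothetical stand-in for the tracing guard: "flushes" when dropped.
struct FlushGuard<'a> {
    log: &'a Log,
}

impl Drop for FlushGuard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("flush");
    }
}

/// Hypothetical stand-in for `MiriCompilerCalls`.
struct CompilerCalls {
    fail: bool,
}

impl CompilerCalls {
    /// Does all the work while holding the guard and returns an exit code
    /// instead of exiting. The guard is dropped on every return path.
    fn run_analysis(&self, log: &Log) -> ExitCode {
        let _guard = FlushGuard { log };
        log.borrow_mut().push("analysis");
        if self.fail {
            return ExitCode::FAILURE;
        }
        ExitCode::SUCCESS
    }

    /// The single place that would exit; the guard is already gone here.
    fn after_analysis(&self, log: &Log) -> ExitCode {
        let code = self.run_analysis(log);
        log.borrow_mut().push("exit");
        code
    }
}

fn main() -> ExitCode {
    let log = Log::default();
    let code = CompilerCalls { fail: false }.after_analysis(&log);
    println!("{:?}", log.borrow());
    code
}
```

The sketch only covers the single-threaded case; it does not address exits issued from inside a parallel loop.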
I thought of that, but `exit()` is also used inside the `par_for_each_in` to exit immediately as soon as anything fails (when `many_seeds.keep_going` is false), and afaik there is no easy and safe way to stop executing threads to then drop the guard and return.
Hm... yeah true, `par_for_each_in` does not support early-return.
Well, then let's at least have a local `exit` closure inside this function to reduce the code duplication.
> Well, then let's at least have a local exit closure inside this function to reduce the code duplication.

I don't think this has been done?
With the new API, what we probably want is a `fn exit` that calls `deinit_loggers`, and then use that instead of the one from `std::process`.
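One way such an `exit` wrapper could look is sketched below. This is an illustration under assumptions, not the PR's code: the real `deinit_loggers` signature is not shown in this thread, so the `GuardSlot` parameter, the `TracingGuard` type and its flush counter are all hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the tracing guard; counts how often it "flushes".
struct TracingGuard {
    flushes: Arc<AtomicUsize>,
}

impl Drop for TracingGuard {
    fn drop(&mut self) {
        self.flushes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical shared slot holding the guard until shutdown.
type GuardSlot = Mutex<Option<TracingGuard>>;

/// Assumed shape of `deinit_loggers`: takes the guard out of the slot and
/// drops it. Calling it again is a no-op because the slot is then empty.
fn deinit_loggers(slot: &GuardSlot) {
    let guard = slot.lock().unwrap_or_else(|e| e.into_inner()).take();
    drop(guard);
}

/// Replacement for `std::process::exit` that flushes the trace first.
/// `std::process::exit` itself does not run destructors.
fn exit(slot: &GuardSlot, code: i32) -> ! {
    deinit_loggers(slot);
    std::process::exit(code)
}

fn main() {
    let flushes = Arc::new(AtomicUsize::new(0));
    let slot: GuardSlot = Mutex::new(Some(TracingGuard { flushes: Arc::clone(&flushes) }));
    exit(&slot, 0);
}
```

Because the wrapper takes the guard out of a shared slot, it can be called from any thread, which is what an early exit from a parallel loop would need.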
src/trace/setup.rs
Outdated
guard = Some(TracingGuard {
    #[cfg(feature = "tracing")]
    _chrome: chrome_guard,
    _no_construct: (),
});
Why does this guard have to be created here? This then causes all sorts of complexity since you can't know whether it gets created in early-init or late-init.
It seems better to just create the guard separately?
What do you mean by creating the guard separately? An alternative would be to pass a parameter around, but then we'd have to do the `.is_none()` check anyway, so I don't see better alternatives.
I mean to have `init_*_logger` store the guard somewhere, e.g. inside `LOGGER_INITED`, and then have a separate function that fetches the guard from there (and panics if it has already been fetched / not yet been placed there).
`LOGGER_INITED` should probably become a `OnceLock` for this.
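A minimal sketch of the storage scheme suggested in this thread, assuming a `OnceLock` that wraps a `Mutex<Option<_>>` so the guard can be moved out again. `GuardSlot`, `TracingGuard`, `LOGGER_GUARD` and the method names are hypothetical, not the names used in the PR.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the real tracing guard.
#[derive(Debug, PartialEq)]
struct TracingGuard {
    id: u32,
}

/// Holds the guard between logger initialization and the point where it is fetched.
struct GuardSlot {
    cell: OnceLock<Mutex<Option<TracingGuard>>>,
}

impl GuardSlot {
    const fn new() -> Self {
        GuardSlot { cell: OnceLock::new() }
    }

    /// Called by the logger init function. Returns `false` if a guard was
    /// already stored, i.e. the logger had been initialized before.
    fn store(&self, guard: TracingGuard) -> bool {
        self.cell.set(Mutex::new(Some(guard))).is_ok()
    }

    /// Hands out the guard at most once; `None` if it was never stored or
    /// has already been fetched.
    fn try_take(&self) -> Option<TracingGuard> {
        self.cell.get()?.lock().ok()?.take()
    }

    /// Panicking variant, matching the behavior described in the comment above.
    fn take(&self) -> TracingGuard {
        self.try_take().expect("tracing guard not stored yet or already fetched")
    }
}

/// Process-wide slot, playing the role that `LOGGER_INITED` plays in the discussion.
static LOGGER_GUARD: GuardSlot = GuardSlot::new();

fn main() {
    assert!(LOGGER_GUARD.store(TracingGuard { id: 1 }));
    let guard = LOGGER_GUARD.take();
    // Dropping the guard is the step that would flush the trace file.
    drop(guard);
    assert!(LOGGER_GUARD.try_take().is_none());
}
```

The inner `Mutex<Option<_>>` is needed because `OnceLock` alone only gives shared access to its contents, so the guard could not be moved out and dropped.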
The guard needs to be fetched from `main()` and also from `after_analysis`. Having the guard in `main()` is needed to do the cleanup when the compiler exits. I pushed a commit that implements your suggestion and takes this observation into consideration. While the interface is nicer, it may lead to more confusion, what do you think?
Why do we even care about the contents of the trace file when there is an error?^^
But anyway, that last version works for me, thanks.
@rustbot author |
Reminder, once the PR becomes ready for a review, use |
Add tracing to `InterpCx::layout_of()`

This PR adds tracing calls to `instantiate_from_frame_and_normalize_erasing_regions` and to `InterpCx::layout_of()`. The latter is done by shadowing `LayoutOf`'s trait method with an inherent method on `InterpCx`.

<details><summary>Previous attempt by overriding the `layout_of` query (includes downloadable `.diff` patch)</summary>

This PR is meant for Miri, but requires a few changes in `rustc` code, hence why it's here.

It adds tracing capabilities to the `layout_of` function in `tcx` by overriding the `layout_of` query (under `local_providers`) with a wrapper that opens a tracing span and then calls the actual `layout_of`. To make this possible, I had to make `rustc_ty_utils::layout::layout_of` public. I added an assert to ensure the `providers.layout_of` value I am replacing is actually `rustc_ty_utils::layout::layout_of`, just in case.

I also considered taking the previous value in `providers.layout_of` and calling that one instead, to avoid making `layout_of` public. But then the closure would not be castable to a function pointer anymore (`providers.layout_of` is a function pointer), because it would depend on the local variable storing the previous value of `providers.layout_of`. Using a global variable would work but would rely on `unsafe` or on `Mutex`es, so I wanted to avoid it.

Here is some tracing output when Miri is run on `src/tools/miri/tests/pass/hello.rs`, visualizable in https://ui.perfetto.dev: [trace-1750338860374637.json](https://github.com/user-attachments/files/20820392/trace-1750338860374637.json)

Another place where I could have added tracing calls is to the `rustc_middle::ty::layout::LayoutCx` struct / `spanned_layout_of()` function, however there is no simple way to disable the tracing calls with compile-time boolean constants there (since `LayoutCx::new()` is used everywhere and referenced directly), and in any case it seems like `spanned_layout_of()` just calls `tcx.layout_of()` anyway.

For completeness' sake, here is tracing output for when a tracing call is added to `spanned_layout_of()`: [trace-1750340887920584.json](https://github.com/user-attachments/files/20820609/trace-1750340887920584.json)

Patch to override `layout_of` query: [tracing-layout_of-query-override.diff.txt](https://github.com/user-attachments/files/20944497/tracing-layout_of-query-override.diff.txt)

</details>

**Note: obtaining tracing output depends on rust-lang/miri#4406, but this PR is standalone and can be merged without waiting for rust-lang/miri#4406.**

r? `@RalfJung`
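The shadowing technique mentioned in this description relies on Rust's method resolution preferring an inherent method over a trait method of the same name. A toy sketch follows, using simplified hypothetical versions of `LayoutOf` and `InterpCx` (string "types", a `usize` "layout" and a call counter in place of a tracing span); the real rustc signatures differ.

```rust
use std::cell::Cell;

/// Simplified, hypothetical version of the trait providing `layout_of`.
trait LayoutOf {
    fn layout_of(&self, ty: &str) -> usize;
}

/// Simplified, hypothetical interpreter context.
struct InterpCx {
    /// Counts how many calls went through the traced wrapper.
    traced_calls: Cell<u32>,
}

impl LayoutOf for InterpCx {
    fn layout_of(&self, ty: &str) -> usize {
        // Placeholder computation standing in for the real layout query.
        ty.len()
    }
}

impl InterpCx {
    /// Inherent method with the same name as the trait method. A call written
    /// as `cx.layout_of(..)` resolves to this one, so it can add tracing and
    /// then delegate to the trait implementation.
    fn layout_of(&self, ty: &str) -> usize {
        // Stand-in for entering a tracing span.
        self.traced_calls.set(self.traced_calls.get() + 1);
        LayoutOf::layout_of(self, ty)
    }
}

fn main() {
    let cx = InterpCx { traced_calls: Cell::new(0) };
    let size = cx.layout_of("u32");
    println!("size = {size}, traced calls = {}", cx.traced_calls.get());
}
```

One limitation of this approach: code that calls through the trait explicitly, or through a generic `T: LayoutOf` bound, bypasses the inherent wrapper and is not traced.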
Add tracing to `validate_operand`

This PR adds a tracing call to keep track of how much time is spent in `validate_operand` and `const_validate_operand`. Let me know if more fine-grained tracing is needed (e.g. adding tracing to `validate_operand_internal` too, which is just called from those two functions).

I also fixed the rustdoc of `validate_operand` and `const_validate_operand` since it was referencing an older name for the `val` parameter which was renamed in cbdcbf0.

Here is some tracing output when Miri is run on `src/tools/miri/tests/pass/hello.rs`, visualizable in [ui.perfetto.dev](https://ui.perfetto.dev/): [trace-1750932222218210.json](https://github.com/user-attachments/files/20924000/trace-1750932222218210.json)

**Note: obtaining tracing output depends on rust-lang/miri#4406, but this PR is standalone and can be merged without waiting for rust-lang/miri#4406.**

r? `@RalfJung`
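To illustrate how a timing span can be switched off through a compile-time boolean constant, here is a small sketch. Everything in it is hypothetical: the `Machine` trait is a toy with a single associated constant, `Span` is a hand-written timer rather than a `tracing` span, and `validate_operand` is a placeholder function that only borrows the name.

```rust
use std::cell::RefCell;
use std::time::{Duration, Instant};

/// Collected (span name, elapsed time) pairs.
type Trace = RefCell<Vec<(&'static str, Duration)>>;

/// Toy trait with a compile-time switch for tracing.
trait Machine {
    const TRACING_ENABLED: bool;
}

struct Traced;
struct Untraced;

impl Machine for Traced {
    const TRACING_ENABLED: bool = true;
}

impl Machine for Untraced {
    const TRACING_ENABLED: bool = false;
}

/// Records how long it was alive when dropped.
struct Span<'a> {
    name: &'static str,
    start: Instant,
    trace: &'a Trace,
}

impl Drop for Span<'_> {
    fn drop(&mut self) {
        self.trace.borrow_mut().push((self.name, self.start.elapsed()));
    }
}

/// Creates a span only if the machine enables tracing. The condition is a
/// constant for each `M`, so the compiler can remove the unused branch.
fn enter_span<'a, M: Machine>(name: &'static str, trace: &'a Trace) -> Option<Span<'a>> {
    if M::TRACING_ENABLED {
        Some(Span { name, start: Instant::now(), trace })
    } else {
        None
    }
}

/// Placeholder for a function whose running time should be measured.
fn validate_operand<M: Machine>(value: u32, trace: &Trace) -> bool {
    let _span = enter_span::<M>("validate_operand", trace);
    value % 2 == 0
}

fn main() {
    let trace = Trace::default();
    validate_operand::<Untraced>(4, &trace);
    validate_operand::<Traced>(4, &trace);
    println!("{} span(s) recorded", trace.borrow().len());
}
```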
[…] `Machine::TRACING_ENABLED` to `true`
[…] `tracing_chrome` crate by copy-pasting this ~600 line file.
[…] `tracing_core` which conflicts with rustc_private's `tracing_core` (meaning it would not be possible to use the `ChromeLayer` in a context that expects a `Layer` from rustc_private's `tracing_core` version).
[…] `[patch]` and `[replace]` sections, but although they would work for normal libraries, they don't seem to behave as expected when the crate to replace comes from `rustc_private`, see this Zulip comment for a list of experiments
[…] `miri.rs` so the file is less cluttered.
[…] `trace/` (like done now) or should I rather move them under `bin/`, since only `bin/miri.rs` uses those functions?
`ChromeLayer` internally starts a thread to write data to file, and thus relies on a guard to properly flush the file and terminate the thread when dropped. Since `std::process::exit()` was being called in a few places, I had to restructure the code a bit to avoid exiting directly (which wouldn't call `drop()` on the guard).
[…] `rustc_driver::catch_fatal_errors()`. After modifying `run_compiler_then_exit` to not exit directly, I could confirm that the tracing file was being flushed correctly in case of a compiler error by running `RUSTC_LOG=1 ./miri run --features tracing FILE_WITH_SYNTAX_ERRORS`.
[…] `init_early_loggers()` after argument parsing, is this ok? I did not see any log/trace call during argument parsing anyway. This avoids having to refactor `show_error!()` to not `exit()` directly
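The reason the guard matters can be shown with a small model of a layer that writes from a background thread. This is a sketch of the general pattern only, not the `tracing_chrome` implementation: `start_writer`, `Writer`, `Message` and this `FlushGuard` are hypothetical, and an in-memory `Vec` stands in for the trace file.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::JoinHandle;

enum Message {
    Event(String),
    Stop,
}

/// Handle used to record events; they are written by the background thread.
struct Writer {
    sender: Sender<Message>,
}

impl Writer {
    fn record(&self, event: &str) {
        let _ = self.sender.send(Message::Event(event.to_string()));
    }
}

/// On drop, asks the writer thread to stop and waits until every event that
/// was queued before the stop request has been written.
struct FlushGuard {
    sender: Sender<Message>,
    handle: Option<JoinHandle<()>>,
}

impl Drop for FlushGuard {
    fn drop(&mut self) {
        let _ = self.sender.send(Message::Stop);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

/// Spawns the writer thread; `sink` stands in for the trace file.
fn start_writer(sink: Arc<Mutex<Vec<String>>>) -> (Writer, FlushGuard) {
    let (sender, receiver) = channel();
    let handle = std::thread::spawn(move || {
        for message in receiver {
            match message {
                Message::Event(event) => sink.lock().unwrap().push(event),
                Message::Stop => break,
            }
        }
    });
    (Writer { sender: sender.clone() }, FlushGuard { sender, handle: Some(handle) })
}

fn main() {
    let sink = Arc::new(Mutex::new(Vec::new()));
    let (writer, guard) = start_writer(Arc::clone(&sink));
    writer.record("span start");
    writer.record("span end");
    // Without this drop, exiting the process could lose queued events.
    drop(guard);
    println!("{:?}", sink.lock().unwrap());
}
```

If the process exits via `std::process::exit()` while the guard is still alive, its `Drop` implementation never runs, so the writer thread may be killed with events still in the queue.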