Add tracing_chrome under "tracing" feature #4406

Stypox · 2025-06-18T15:39:24Z

Added a "tracing" feature that enables Chrome traces, and sets Machine::TRACING_ENABLED to true
I ended up adding the tracing_chrome crate by copy-pasting this ~600 line file.
- As discussed previously, depending on the tracing-chrome crate from crates.io is unfortunately not possible since it depends on tracing_core which conflicts with rustc_private's tracing_core (meaning it would not be possible to use the ChromeLayer in a context that expects a Layer from rustc_private's tracing_core version).
- I tried to use cargo's [patch] and [replace] sections, but although they would work for normal libraries, they don't seem to behave as expected when the crate to replace comes from rustc_private; see this Zulip comment for a list of experiments.
- Also see more discussion in this Zulip thread.
- I am open to trying other stuff to avoid copy-pasting foreign code into Miri, but I didn't want to waste more time on this right now, as it's blocking any work on tracing.
- Do I need to mention the author and license at the top of the copied file, or is a link (like I've done now) enough for attribution?
I moved the logger/tracing setup functions out of miri.rs so the file is less cluttered.
- Should I put those new files under trace/ (like done now) or should I rather move them under bin/, since only bin/miri.rs uses those functions?
The ChromeLayer internally starts a thread to write data to file, and thus relies on a guard to properly flush the file and terminate the thread when dropped. Since std::process::exit() was being called in a few places, I had to restructure the code a bit to avoid exiting directly (which wouldn't call drop() on the guard).
- As far as I understand rustc does not call exit(), but rather raises an unwinding panic which is then caught by rustc_driver::catch_fatal_errors(). After modifying run_compiler_then_exit to not exit directly, I could confirm that the tracing file was being flushed correctly in case of a compiler error by running RUSTC_LOG=1 ./miri run --features tracing FILE_WITH_SYNTAX_ERRORS.
- I moved the call to init_early_loggers() after argument parsing, is this ok? I did not see any log/trace call during argument parsing anyway. This avoids having to refactor show_error!() to not exit() directly
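To illustrate the guard problem described above, here is a minimal sketch using only the standard library. `FlushGuard`, `FLUSHED`, and `run` are hypothetical names for this example, not the ones used in the PR. It relies on the documented behavior that `std::process::exit()` does not run destructors, whereas returning from a function does.

```rust
use std::process::ExitCode;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true once the guard's destructor has run.
static FLUSHED: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in for the guard connected to the Chrome layer.
struct FlushGuard;

impl Drop for FlushGuard {
    fn drop(&mut self) {
        // The real guard would flush the trace file and stop the writer thread.
        FLUSHED.store(true, Ordering::SeqCst);
    }
}

/// Does the work and reports the outcome as an exit code instead of exiting.
fn run(fail: bool) -> u8 {
    let _guard = FlushGuard;
    if fail {
        // An early `return` still drops `_guard`;
        // calling `std::process::exit(1)` here would skip the destructor.
        return 1;
    }
    0
}

fn main() -> ExitCode {
    // Passing any command-line argument simulates a failure.
    let fail = std::env::args().count() > 1;
    let code = run(fail);
    // The guard has been dropped by now, so ending the process loses no data.
    assert!(FLUSHED.load(Ordering::SeqCst));
    ExitCode::from(code)
}
```

Returning an exit code from `main` keeps a single place where the process ends, which is roughly the direction the restructuring above takes.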

Stypox · 2025-06-19T11:56:27Z

I am not sure why the tests fail. When I ran them locally, they failed even on the latest commit on master (2f4f9ac). I did a git bisect, and it reported that either ab135f0 or 42f66f4 is the first bad commit (the first one doesn't build, so I skipped it, while the second is the first where ./miri test failed). However, this does not explain why the CI on 2f4f9ac passed.

tiif · 2025-06-19T12:27:01Z

I am on x86_64-unknown-linux-gnu, ./miri test for current master branch (2f4f9ac) passed locally for me.

May I know which OS you are currently using? And is the failure the same as the one in the CI?

Stypox · 2025-06-19T14:15:52Z

@tiif On my PC, the failing tests run into errors like "error: unsupported operation: extern static __rust_no_alloc_shim_is_unstable is not supported by Miri". Even just running ./miri run tests/pass/hello.rs triggers the same error. But the CI exits with code 143 without printing any error (I'm looking at ubuntu-latest):

2025-06-19T10:55:01.6057618Z ##[error]The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
2025-06-19T10:55:01.6062974Z ##[error]Process completed with exit code 143.
2025-06-19T10:55:01.6983400Z Cleaning up orphan processes

I am also on x86_64-unknown-linux-gnu. My OS is Manjaro Linux with kernel 5.15.182-1-MANJARO, and I have an AMD Ryzen 7 5800H CPU. Let me know if you need anything else.

tiif · 2025-06-19T15:06:56Z

I suspect the CI failure might be triggered by the change in this PR, as there were successful CI runs after this one in https://github.com/rust-lang/miri/actions. (I haven't read the code closely, so this is just a guess :)

About the unsupported error, I think maybe the toolchain is outdated. You can try to follow the steps here https://github.com/rust-lang/miri/blob/master/CONTRIBUTING.md#preparing-the-build-environment to update it if you haven't done so recently.

If the problem persists, feel free to ask for help on Zulip.

The file was taken unmodified from the following link, except for file attributes at the top: https://github.com/thoren-d/tracing-chrome/blob/7e2625ab4aeeef2f0ef9bde9d6258dd181c04472/src/lib.rs

Depending on the tracing-chrome crate from crates.io is unfortunately not possible since it depends on `tracing_core` which conflicts with rustc_private's `tracing_core` (meaning it would not be possible to use the [ChromeLayer] in a context that expects a [Layer] from rustc_private's `tracing_core` version).

The tracing_chrome Layer has a connected Guard that needs to be dropped before exiting the process, or else data may not be flushed to the tracing file. This required making a few changes to main() to ensure std::process::exit() is not called without making sure to first drop the guard.

Stypox · 2025-06-25T10:15:23Z

The tests pass locally. I rebased on master and force pushed to make the CI rerun, but it still fails.

Both now and earlier, I think the cause of the CI failure was "memory allocation of 408 bytes failed" on windows-latest (see here and here). macOS also fails due to "No space left on device (os error 28)" (see here). I don't understand why ubuntu fails with code 143 without printing any error.

I don't understand how changes in this PR can cause the tests to use more memory (assuming that's the culprit). The behavior of the code in this PR should be the same as before when no --tracing is passed.

Stypox

The build was failing because the ci/ci.sh script passed CARGO_EXTRA_FLAGS=--all-features and thus enabled the tracing feature. As a result, every test collected tracing information and saved it to a file, which filled up the worker filesystem and led to the errors above. I have now changed ci/ci.sh to only enable the features relevant for tests, i.e. all but "tracing": genmc, stack-cache, stack-cache-consistency-check.

ci/ci.sh

src/trace/tracing_chrome.rs

src/bin/miri.rs

RalfJung · 2025-06-26T12:07:04Z

src/bin/miri.rs

@@ -211,6 +219,10 @@ impl rustc_driver::Callbacks for MiriCompilerCalls {
                if return_code != rustc_driver::EXIT_SUCCESS {
                    eprintln!("FAILING SEED: {seed}");
                    if !many_seeds.keep_going {
+                        // drop the tracing guard before exiting, so tracing calls are flushed correctly
+                        if let Ok(mut lock) = tracing_guard.try_lock() {
+                            let _guard_being_dropped = (*lock).take();


An explicit call to drop seems more clear here.

But it's not great that we apparently have to remember to do this in multiple places...

Yeah I also don't like that much, but it felt the safest option. An alternative would be to call libc::atexit() or to use a crate like https://crates.io/crates/ctor, though both options involve unsafe.

Here's an idea: move everything in fn after_analysis that is after init_late_loggers into a separate inherent method on MiriCompilerCalls that returns an ExitCode. Have the guard inside that separate method so it gets dropped when the method returns, and then there's just a single exit call in after_analysis afterwards.

I thought of that but exit() is also used inside the par_for_each_in to exit immediately as soon as anything fails (when many_seeds.keep_going is false) and afaik there is no easy and safe way to stop executing threads to then drop the guard and return

Hm... yeah true, par_for_each_in does not support early-return.

Well, then let's at least have a local exit closure inside this function to reduce the code duplication.

Well, then let's at least have a local exit closure inside this function to reduce the code duplication.

I don't think this has been done?
With the new API, what we probably want is a fn exit that calls deinit_loggers, and then use that instead of the one from std::process.
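A rough sketch of what such an `exit` wrapper could look like, using only the standard library. The `TracingGuard` type, the `GUARD` static, and the `bool` return value of `deinit_loggers` are assumptions made for this example and need not match the PR's actual API.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the tracing guard; the real one flushes the trace file on drop.
struct TracingGuard;

/// Hypothetical global slot where the logger setup stores the guard.
static GUARD: Mutex<Option<TracingGuard>> = Mutex::new(None);

/// Drops the stored guard, if any. Returns whether a guard was actually dropped.
fn deinit_loggers() -> bool {
    // Take the guard out of the slot, even if another thread panicked while holding the lock.
    let guard = GUARD.lock().unwrap_or_else(|e| e.into_inner()).take();
    let had_guard = guard.is_some();
    // Explicit drop, so it is obvious where the flush happens.
    drop(guard);
    had_guard
}

/// Replacement for `std::process::exit` that cleans up the loggers first.
fn exit(code: i32) -> ! {
    deinit_loggers();
    std::process::exit(code)
}

fn main() {
    *GUARD.lock().unwrap() = Some(TracingGuard);
    // Every exit path goes through the wrapper, so the guard is never forgotten.
    exit(0);
}
```

With a helper like this, call sites inside `par_for_each_in` can still exit immediately without each of them having to remember the guard.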

src/bin/miri.rs

RalfJung · 2025-06-26T12:13:45Z

src/trace/setup.rs

+        guard = Some(TracingGuard {
+            #[cfg(feature = "tracing")]
+            _chrome: chrome_guard,
+            _no_construct: (),
+        });


Why does this guard have to be created here? This then causes all sorts of complexity since you can't know whether it gets created in early-init or late-init.

It seems better to just create the guard separately?

What do you mean by creating the guard separately? An alternative would be to pass a parameter around but then we'd have to do the .is_none() check anyway so I don't see better alternatives.

I mean to have init_*_logger store the guard somewhere, e.g. inside LOGGER_INITED, and then have a separate function that fetches the guard from there (and panics if it has already been fetched / not yet been placed there).

LOGGER_INITED should probably become a OnceLock for this.
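One way to read this suggestion, as a sketch only: the init function stores the guard in a `OnceLock`, and a separate function takes it out, panicking if it is missing. `init_logger` and `take_guard` are hypothetical names, and the `Mutex<Option<..>>` inside the `OnceLock` is an assumption about how the guard could be made takeable.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the tracing guard.
struct TracingGuard;

/// Set exactly once by logger initialization; holds the guard until someone takes it.
static LOGGER_INITED: OnceLock<Mutex<Option<TracingGuard>>> = OnceLock::new();

/// Initializes logging and stores the guard. Returns false if it was already initialized.
fn init_logger() -> bool {
    LOGGER_INITED.set(Mutex::new(Some(TracingGuard))).is_ok()
}

/// Fetches the guard. Panics if the logger was never initialized
/// or if the guard has already been fetched.
fn take_guard() -> TracingGuard {
    LOGGER_INITED
        .get()
        .expect("logger not initialized yet")
        .lock()
        .unwrap()
        .take()
        .expect("tracing guard already taken")
}

fn main() {
    assert!(init_logger());
    // A second initialization attempt is rejected and leaves the stored guard alone.
    assert!(!init_logger());
    let guard = take_guard();
    // Dropping the guard is where the real implementation would flush the trace.
    drop(guard);
}
```

This way the caller no longer needs to know whether the guard was created during early or late initialization.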

The guard needs to be fetched from main() and also from after_analysis. Having the guard in main() is needed to do the cleanup when the compiler exits. I pushed a commit that implements your suggestion and takes this observation into consideration. While the interface is nicer, it may lead to more confusion, what do you think?

Why do we even care about the contents of the trace file when there is an error?^^

But anyway, that last version works for me, thanks.

src/trace/setup.rs

src/trace/tracing_chrome.rs

rust-lang#4406 (comment)

Stypox · 2025-06-27T08:09:02Z

Removed $FEATURES from ci.sh's ./miri install
Added docs to TracingGuard, improved docs in tracing_chrome.rs and made other comments follow the "Abc def." style (instead of "abc def")
impl Drop for TracingGuard
Renamed run_compiler_and_exit to run_compiler_return_exit_code
Moved tracing_chrome and logger setup into the bin/trace folder. As far as I understand, putting them under bin/ directly meant that each file would be interpreted as a binary, which is why I had to put them under an additional trace/ subfolder.

src/bin/miri.rs

RalfJung · 2025-06-27T12:55:26Z

@rustbot author

rustbot · 2025-06-27T12:55:29Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

Add tracing to `InterpCx::layout_of()`

This PR adds tracing calls to `instantiate_from_frame_and_normalize_erasing_regions` and to `InterpCx::layout_of()`. The latter is done by shadowing `LayoutOf`'s trait method with an inherent method on `InterpCx`.

<details><summary>Previous attempt by overriding the `layout_of` query (includes downloadable `.diff` patch)</summary>

This PR is meant for Miri, but requires a few changes in `rustc` code, hence why it's here. It adds tracing capabilities to the `layout_of` function in `tcx` by overriding the `layout_of` query (under `local_providers`) with a wrapper that opens a tracing span and then calls the actual `layout_of`. To make this possible, I had to make `rustc_ty_utils::layout::layout_of` public. I added an assert to ensure the `providers.layout_of` value I am replacing is actually `rustc_ty_utils::layout::layout_of`, just in case.

I also considered taking the previous value in `providers.layout_of` and calling that one instead, to avoid making `layout_of` public. But then the closure would not be castable to a function pointer anymore (`providers.layout_of` is a function pointer), because it would depend on the local variable storing the previous value of `providers.layout_of`. Using a global variable would work but would rely on `unsafe` or on `Mutex`es, so I wanted to avoid it.

Here is some tracing output when Miri is run on `src/tools/miri/tests/pass/hello.rs`, visualizable in https://ui.perfetto.dev: [trace-1750338860374637.json](https://github.com/user-attachments/files/20820392/trace-1750338860374637.json)

Another place where I could have added tracing calls is to the `rustc_middle::ty::layout::LayoutCx` struct / `spanned_layout_of()` function, however there is no simple way to disable the tracing calls with compile-time boolean constants there (since `LayoutCx::new()` is used everywhere and referenced directly), and in any case it seems like `spanned_layout_of()` just calls `tcx.layout_of()` anyway.

For completeness' sake, here is tracing output for when a tracing call is added to `spanned_layout_of()`: [trace-1750340887920584.json](https://github.com/user-attachments/files/20820609/trace-1750340887920584.json)

Patch to override `layout_of` query: [tracing-layout_of-query-override.diff.txt](https://github.com/user-attachments/files/20944497/tracing-layout_of-query-override.diff.txt)

</details>

**Note: obtaining tracing output depends on rust-lang/miri#4406, but this PR is standalone and can be merged without waiting for rust-lang/miri#4406**

r? `@RalfJung`
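The shadowing technique mentioned in the description above can be sketched in isolation. This is not the rustc code: the `LayoutOf` trait and `InterpCx` struct below are simplified stand-ins, and the `traced` counter is a hypothetical placeholder for entering a tracing span. It relies on Rust's method resolution preferring an inherent method over a trait method of the same name.

```rust
use std::cell::Cell;

/// Simplified stand-in for the `LayoutOf` trait.
trait LayoutOf {
    fn layout_of(&self, ty: &str) -> usize;
}

/// Simplified stand-in for `InterpCx`; `traced` counts how often the wrapper ran.
struct InterpCx {
    traced: Cell<u32>,
}

impl LayoutOf for InterpCx {
    fn layout_of(&self, ty: &str) -> usize {
        // Pretend layout computation: the size of a few primitive types.
        match ty {
            "u8" => 1,
            "u32" => 4,
            _ => 8,
        }
    }
}

impl InterpCx {
    /// Inherent method with the same name as the trait method. Method-call
    /// syntax picks this one first, so existing call sites need no changes.
    fn layout_of(&self, ty: &str) -> usize {
        // A tracing span would be entered here.
        self.traced.set(self.traced.get() + 1);
        // Fully qualified call to reach the shadowed trait method.
        LayoutOf::layout_of(self, ty)
    }
}

fn main() {
    let cx = InterpCx { traced: Cell::new(0) };
    assert_eq!(cx.layout_of("u32"), 4);
    assert_eq!(cx.traced.get(), 1);
}
```

Calling the trait method through its fully qualified path bypasses the wrapper, which is also how the wrapper itself avoids infinite recursion.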

Add tracing to `validate_operand`

This PR adds a tracing call to keep track of how much time is spent in `validate_operand` and `const_validate_operand`. Let me know if more fine-grained tracing is needed (e.g. adding tracing to `validate_operand_internal` too, which is just called from those two functions). I also fixed the rustdoc of `validate_operand` and `const_validate_operand` since it was referencing an older name for the `val` parameter which was renamed in cbdcbf0.

Here is some tracing output when Miri is run on `src/tools/miri/tests/pass/hello.rs`, visualizable in [ui.perfetto.dev](https://ui.perfetto.dev/): [trace-1750932222218210.json](https://github.com/user-attachments/files/20924000/trace-1750932222218210.json)

**Note: obtaining tracing output depends on rust-lang/miri#4406, but this PR is standalone and can be merged without waiting for rust-lang/miri#4406**

r? `@RalfJung`
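As a rough illustration of timing a function only when a compile-time constant such as `Machine::TRACING_ENABLED` is set, here is a standard-library sketch. The `Machine` trait here is a simplified stand-in, and `with_span`, `Traced`, `Untraced`, and the body of `validate_operand` are hypothetical. The real code uses tracing spans rather than `Instant`.

```rust
use std::time::{Duration, Instant};

/// Simplified stand-in for the `Machine` trait with its compile-time switch.
trait Machine {
    const TRACING_ENABLED: bool;
}

struct Traced;
impl Machine for Traced {
    const TRACING_ENABLED: bool = true;
}

struct Untraced;
impl Machine for Untraced {
    const TRACING_ENABLED: bool = false;
}

/// Runs `f` and measures how long it took, but only if `M::TRACING_ENABLED` is true.
fn with_span<M: Machine, R, F: FnOnce() -> R>(f: F) -> (R, Option<Duration>) {
    if M::TRACING_ENABLED {
        let start = Instant::now();
        let result = f();
        (result, Some(start.elapsed()))
    } else {
        // The condition is a constant per instantiation, so the compiler
        // can remove the timing code entirely for untraced machines.
        (f(), None)
    }
}

/// Hypothetical placeholder for the work done by `validate_operand`.
fn validate_operand(x: u32) -> bool {
    x % 2 == 0
}

fn main() {
    let (ok, elapsed) = with_span::<Traced, _, _>(|| validate_operand(4));
    assert!(ok && elapsed.is_some());
    let (ok, elapsed) = with_span::<Untraced, _, _>(|| validate_operand(3));
    assert!(!ok && elapsed.is_none());
}
```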

Stypox force-pushed the tracing branch 2 times, most recently from 16e2ce6 to d432b6b Compare June 19, 2025 10:31

Stypox marked this pull request as ready for review June 19, 2025 10:32

Stypox mentioned this pull request Jun 19, 2025

Add tracing to InterpCx::layout_of() rust-lang/rust#142721

Open

Stypox added 3 commits June 25, 2025 11:29

Add tracing feature

09752c7

Stypox force-pushed the tracing branch from d432b6b to 201c5c7 Compare June 25, 2025 09:30

Stypox force-pushed the tracing branch from 2f4d5f0 to 102bb96 Compare June 25, 2025 14:09

Only enable selected features when testing miri in CI

3b37e77

Stypox force-pushed the tracing branch from 102bb96 to 3b37e77 Compare June 25, 2025 14:26

Stypox commented Jun 25, 2025


Stypox mentioned this pull request Jun 26, 2025

Add tracing to validate_operand rust-lang/rust#143051

Open

RalfJung reviewed Jun 26, 2025


Stypox added 6 commits June 27, 2025 08:47

Remove $FEATURES from ci.sh's ./miri install

2b1e9e0

rust-lang#4406 (comment)

Improve tracing_chrome.rs comments

d7ffc25

Add docs and impl Drop for TracingGuard

5a04d8c

Improve docs style from 'a b' to 'A b.'

42dac13

Better name for run_compiler_return_exit_code

b8e2d28

Move tracing/logging setup under bin/

7cee097

Stypox force-pushed the tracing branch from f9e9f33 to 7cee097 Compare June 27, 2025 08:04

RalfJung reviewed Jun 27, 2025


Stypox added 2 commits June 27, 2025 11:17

Init late loggers after checking for fatal errors

601fc94

Store tracing guard in LOGGER_INITED and add deinit_loggers

b29953b

rustbot added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Jun 27, 2025
