-
Notifications
You must be signed in to change notification settings - Fork 157
Add query cache hits to detailed query table #2168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
bors
added a commit
to rust-lang/rust
that referenced
this pull request
Jul 2, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk`
rust-bors bot
added a commit
to rust-lang/rust
that referenced
this pull request
Jul 2, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk` try-job: dist-powerpc-linux
bors
added a commit
to rust-lang/rust
that referenced
this pull request
Jul 2, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk`
github-actions bot
pushed a commit
to rust-lang/rustc-dev-guide
that referenced
this pull request
Jul 3, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk`
github-actions bot
pushed a commit
to kosai-ksdfounder/miri
that referenced
this pull request
Jul 3, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk`
c077e58
to
52d5ea4
Compare
Jamesbarford
requested changes
Jul 8, 2025
1f26cd7
to
5da3cb4
Compare
Jamesbarford
approved these changes
Jul 9, 2025
Ok, let's try this! |
lnicola
pushed a commit
to lnicola/rust-analyzer
that referenced
this pull request
Jul 15, 2025
Add new self-profiling event to cheaply aggregate query cache hit counts Self-profile can record various types of things, some of them are not enabled, like query cache hits. Rustc currently records cache hits as "instant" measureme events, which records the thread ID, current timestamp, and constructs an individual event for each such cache hit. This is incredibly expensive, in a small hello world benchmark that just depends on serde, it makes compilation with nightly go from ~3s (with `-Zself-profile`) to ~15s (with `-Zself-profile -Zself-profile-events=default,query-cache-hit`). We'd like to add query cache hits to rustc-perf (rust-lang/rustc-perf#2168), but there we only need the actualy cache hit counts, not the timestamp/thread ID metadata associated with it. This PR adds a new `query-cache-hit-count` event. Instead of generating individual instant events, it simply aggregates cache hit counts per *query invocation* (so a combination of a query and its arguments, if I understand it correctly) using an atomic counter. At the end of the compilation session, these counts are then dumped to the self-profile log using integer events (in a similar fashion as how we record artifact sizes). I suppose that we could dedup the query invocations in rustc directly, but I don't think it's really required. In local experiments with the hello world + serde case, the query invocation records generated ~30 KiB more data in the self-profile, which was ~10% increase in this case. With this PR, the overhead of `-Zself-profile` seems to be the same as before, at least on my machine, so I also enabled query cache hit counts by default when self profiling is enabled. We should also modify `analyzeme`, specifically [this](https://github.com/rust-lang/measureme/blob/master/analyzeme/src/analysis.rs#L139), and make it load the integer events with query cache hit counts. I can do that as a follow-up, it's not required to be done in sync with this PR, and it doesn't require changes in rustc. CC `@cjgillot` r? `@oli-obk`
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds new columns to the detailed query page table. They show the number of cache hits of individual queries.
To make this work, I implemented a cheaper way of aggregating query cache hits (rust-lang/rust#142978) and integrated it into analyzeme (rust-lang/measureme#244).