Improve source code for `highlight.rs` #146992

GuillaumeGomez · 2025-09-24T16:34:12Z

I was very bothered by the complexity of this file, in particular the handling of `pending_elems`, which was very tricky to follow.

So instead, here comes a saner approach: the content is stored in a stack-like type which handles "levels" of HTML (i.e., a macro expansion can contain other HTML tags, which can themselves contain others, etc.). This makes it much simpler to keep track of what's going on.
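To make the "levels" idea concrete, here is a minimal sketch of such a stack. It is not the code from this PR: `Level`, `LevelStack`, and every method name are hypothetical, and it assumes text has already been escaped before it is pushed.

```rust
/// One "level" of HTML: an optional CSS class plus the content rendered inside it.
/// Hypothetical type, not the one used in the PR.
struct Level {
    class: Option<&'static str>,
    content: String,
}

/// Hypothetical stack of open levels; the bottom entry collects the final output.
struct LevelStack {
    levels: Vec<Level>,
}

impl LevelStack {
    fn new() -> Self {
        Self { levels: vec![Level { class: None, content: String::new() }] }
    }

    /// Open a nested level, e.g. when entering a macro expansion.
    fn enter(&mut self, class: &'static str) {
        self.levels.push(Level { class: Some(class), content: String::new() });
    }

    /// Append already-escaped text to the innermost open level.
    fn push_text(&mut self, text: &str) {
        self.levels.last_mut().expect("root level always exists").content.push_str(text);
    }

    /// Close the innermost level and fold it, wrapped in its tag, into its parent.
    fn exit(&mut self) {
        // Never pop the root level.
        if self.levels.len() < 2 {
            return;
        }
        let level = self.levels.pop().unwrap();
        let parent = self.levels.last_mut().unwrap();
        match level.class {
            Some(class) => {
                parent.content.push_str("<span class=\"");
                parent.content.push_str(class);
                parent.content.push_str("\">");
                parent.content.push_str(&level.content);
                parent.content.push_str("</span>");
            }
            None => parent.content.push_str(&level.content),
        }
    }

    /// Close any levels that are still open and return the rendered HTML.
    fn finish(mut self) -> String {
        while self.levels.len() > 1 {
            self.exit();
        }
        self.levels.pop().unwrap().content
    }
}
```

Because each level only ever talks to its parent, nesting depth does not need any special bookkeeping beyond the `Vec` itself.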

r? @lolbinarycat

GuillaumeGomez · 2025-09-24T16:34:52Z

Also need to check the impact on performance (likely slower).

@bors try @rust-timer queue

rust-bors · 2025-09-24T18:48:51Z

☀️ Try build successful (CI)
Build commit: 6020c97 (6020c97e3046a35e53fd9885eda164c570010a6c, parent: 15283f6fe95e5b604273d13a428bab5fc0788f5a)

yotamofek · 2025-09-24T19:32:30Z

src/librustdoc/html/highlight.rs

+        let mut closing_tag = None;
+        for part in &self.content {
+            let text: &dyn Display =
+                if part.needs_escape { &EscapeBodyText(&part.text) } else { &part.text };


FYI, Either impls Display, which can be nicer than a dyn ref (and maybe slightly more performant)

Good idea, thanks!
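A rough sketch of that suggestion, under stated assumptions: the `Either` enum below is a local stand-in written for this example (so it has no dependencies), `EscapeBodyText` is a simplified hypothetical escaper rather than rustdoc's, and `render_part` is an invented helper name.

```rust
use std::fmt::{self, Display, Write as _};

/// Local stand-in for an `Either` type, so the sketch is self-contained.
enum Either<L, R> {
    Left(L),
    Right(R),
}

impl<L: Display, R: Display> Display for Either<L, R> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Static dispatch: each arm calls the concrete type's `Display` impl.
        match self {
            Either::Left(l) => l.fmt(f),
            Either::Right(r) => r.fmt(f),
        }
    }
}

/// Simplified, hypothetical escaper standing in for `EscapeBodyText`.
struct EscapeBodyText<'a>(&'a str);

impl Display for EscapeBodyText<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for c in self.0.chars() {
            match c {
                '<' => f.write_str("&lt;")?,
                '>' => f.write_str("&gt;")?,
                '&' => f.write_str("&amp;")?,
                _ => f.write_char(c)?,
            }
        }
        Ok(())
    }
}

/// Hypothetical helper: pick the escaped or raw form without a `&dyn Display`.
fn render_part(text: &str, needs_escape: bool) -> String {
    let text = if needs_escape {
        Either::Left(EscapeBodyText(text))
    } else {
        Either::Right(text)
    };
    format!("{text}")
}
```

Both branches produce the same concrete type, so the `if` expression type-checks without boxing or a trait object.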

yotamofek · 2025-09-24T19:38:05Z

src/librustdoc/html/highlight.rs

+            for part in elem.content.drain(..) {
+                last.content.push(part);
+            }


Suggested change

-            for part in elem.content.drain(..) {
-                last.content.push(part);
-            }
+            last.content.append(&mut elem.content);

Both shorter and might also be slightly more performant (can probably pre-reserve just enough capacity in target vector)
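For reference, a small standalone illustration of what `Vec::append` does; the function name `merge_into` and the `String` element type are placeholders for this example, not names from the PR.

```rust
/// Move every element of `src` to the end of `dst`, leaving `src` empty.
/// Equivalent in effect to pushing each drained element one by one.
fn merge_into(dst: &mut Vec<String>, src: &mut Vec<String>) {
    dst.append(src);
}
```

After the call the source vector is still usable, it is simply empty.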

yotamofek · 2025-09-24T19:39:47Z

src/librustdoc/html/highlight.rs

+                for elem in elements {
+                    self.elements.push(elem);
+                }


Suggested change

-                for elem in elements {
-                    self.elements.push(elem);
-                }
+                self.elements.extend(elements);

Same deal as https://github.com/rust-lang/rust/pull/146992/files#r2376863766
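A comparable sketch for `extend`, again with placeholder names (`Stack` and `push_all` are hypothetical and only loosely mirror the snippet above):

```rust
/// Hypothetical holder for a list of rendered elements.
struct Stack {
    elements: Vec<String>,
}

impl Stack {
    /// Push every item of `elements` in order. `extend` may use the iterator's
    /// size hint to reserve space up front instead of growing push by push.
    fn push_all(&mut self, elements: impl IntoIterator<Item = String>) {
        self.elements.extend(elements);
    }
}
```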

yotamofek · 2025-09-24T19:44:11Z

Code reads much better IMHO!

rust-timer · 2025-09-24T19:59:22Z

Finished benchmarking commit (6020c97): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.2% | [0.7%, 6.6%] | 19 |
| Regressions ❌ (secondary) | 3.2% | [0.1%, 13.3%] | 17 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.2% | [0.7%, 6.6%] | 19 |

Max RSS (memory usage)

Results (primary 0.4%, secondary 12.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 6.2% | [6.2%, 6.2%] | 1 |
| Regressions ❌ (secondary) | 12.0% | [2.6%, 31.9%] | 6 |
| Improvements ✅ (primary) | -2.5% | [-2.8%, -2.2%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [-2.8%, 6.2%] | 3 |

Cycles

Results (primary 2.8%, secondary 5.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.3% | [3.2%, 6.5%] | 3 |
| Regressions ❌ (secondary) | 8.3% | [3.6%, 12.8%] | 6 |
| Improvements ✅ (primary) | -1.6% | [-1.6%, -1.6%] | 1 |
| Improvements ✅ (secondary) | -1.5% | [-1.5%, -1.4%] | 2 |
| All ❌✅ (primary) | 2.8% | [-1.6%, 6.5%] | 4 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 471.213s -> 470.794s (-0.09%)
Artifact size: 387.83 MiB -> 387.93 MiB (0.03%)

GuillaumeGomez · 2025-09-24T22:39:50Z

Code reads much better IMHO!

Agreed (unsurprisingly 😆), but sadly I don't think this solution can get much better performance-wise, so it's unlikely to be merged.

However, I now have much cleaner code, so I think I'll go back to the original "streaming content" design, but with a much cleaner approach.

lolbinarycat · 2025-09-25T16:31:24Z

If I had to guess the reason for the perf regression, I would say it probably has to do with all the extra intermediate string allocations. I feel like if we had a way to delay formatting (maybe using an enum, or with dyn Display, or just make it so the final writer is given up front), this would have a lot less overhead.
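One way to picture the "writer given up front" variant is the sketch below. It is only an illustration of the idea, not code from the PR: `Token`, `render_eager`, and `render_into` are hypothetical names, and the markup is simplified.

```rust
use std::fmt::{self, Write};

/// Hypothetical token: a class name and the (already escaped) text it covers.
struct Token<'a> {
    class: &'a str,
    text: &'a str,
}

/// Eager version: builds one temporary `String` per token, then concatenates.
fn render_eager(tokens: &[Token<'_>]) -> String {
    let parts: Vec<String> = tokens
        .iter()
        .map(|t| format!("<span class=\"{}\">{}</span>", t.class, t.text))
        .collect();
    parts.concat()
}

/// Delayed version: the writer is passed in up front and each token is
/// formatted straight into it, with no per-token `String`.
fn render_into<W: Write>(out: &mut W, tokens: &[Token<'_>]) -> fmt::Result {
    for t in tokens {
        write!(out, "<span class=\"{}\">{}</span>", t.class, t.text)?;
    }
    Ok(())
}
```

Both functions produce the same output; whether the second form actually recovers the measured regression is exactly the open question in this thread.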

GuillaumeGomez · 2025-09-25T16:46:33Z

Possibly. Want to try pushing it even further before I try to turn this back into a streaming algorithm? Same question for you @yotamofek. 😉

Start from my branch and open PRs with your commits so we can run perf checks on them.

yotamofek · 2025-09-25T17:07:32Z

I'll give it a go, but my gut says the extra string allocations aren't causing the lion's share of the regressions. Worth a shot, though.

bors · 2025-09-26T01:17:09Z

☔ The latest upstream changes (presumably #147037) made this pull request unmergeable. Please resolve the merge conflicts.

bors · 2025-10-06T16:52:19Z

☔ The latest upstream changes (presumably #147397) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot · 2025-10-07T12:42:56Z

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

GuillaumeGomez · 2025-10-07T13:05:38Z

So I realized that there are some bugs in the existing code which are fixed by this PR:

In the example above, only `self` should be underlined, not the whitespace. Anyway, I'm looking a bit more into this performance issue.

GuillaumeGomez · 2025-10-07T15:51:18Z

@bors2 try @rust-timer queue

rust-bors · 2025-10-07T18:06:56Z

☀️ Try build successful (CI)
Build commit: 1629a3c (1629a3c85e350f2f8dfc39a15d3e40c3b895d4fc, parent: 4a54b26d30dac43778afb0e503524b763fce0eee)

rust-timer · 2025-10-07T20:35:06Z

Finished benchmarking commit (1629a3c): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.5% | [0.4%, 4.4%] | 19 |
| Regressions ❌ (secondary) | 3.0% | [0.2%, 9.3%] | 12 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.6% | [-0.6%, -0.5%] | 3 |
| All ❌✅ (primary) | 1.5% | [0.4%, 4.4%] | 19 |

Max RSS (memory usage)

Results (primary 4.8%, secondary 13.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.8% | [4.8%, 4.8%] | 1 |
| Regressions ❌ (secondary) | 13.1% | [4.1%, 32.0%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.8% | [4.8%, 4.8%] | 1 |

Cycles

Results (primary 3.6%, secondary 4.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.6% | [3.2%, 4.0%] | 2 |
| Regressions ❌ (secondary) | 5.1% | [1.7%, 9.3%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.6% | [-2.6%, -2.6%] | 1 |
| All ❌✅ (primary) | 3.6% | [3.2%, 4.0%] | 2 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 473.216s -> 474.103s (0.19%)
Artifact size: 388.39 MiB -> 388.39 MiB (0.00%)

GuillaumeGomez · 2025-10-07T22:29:49Z

A bit better. I have some ideas to reduce the memory usage as well.

GuillaumeGomez · 2025-10-08T11:31:01Z

Now let's see if flushing more often makes it better for memory and CPU.
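As a rough picture of what "flushing more often" can mean, the sketch below writes buffered parts out once a threshold is reached. Everything in it is an assumption made for illustration: `Buffered`, `MAX_PENDING`, and the threshold value are hypothetical and not taken from the PR.

```rust
use std::fmt::{self, Write};

/// Hypothetical threshold; the limit actually used in the PR is not shown in this thread.
const MAX_PENDING: usize = 4;

/// Hypothetical buffer that holds rendered parts until they are flushed to `out`.
struct Buffered<W: Write> {
    out: W,
    pending: Vec<String>,
}

impl<W: Write> Buffered<W> {
    fn new(out: W) -> Self {
        Self { out, pending: Vec::new() }
    }

    /// Queue a part and flush as soon as too many parts are pending.
    fn push(&mut self, part: String) -> fmt::Result {
        self.pending.push(part);
        if self.pending.len() >= MAX_PENDING {
            self.flush()?;
        }
        Ok(())
    }

    /// Write all pending parts to the output and empty the queue.
    fn flush(&mut self) -> fmt::Result {
        for part in self.pending.drain(..) {
            self.out.write_str(&part)?;
        }
        Ok(())
    }

    /// Flush whatever is left and hand back the writer.
    fn finish(mut self) -> Result<W, fmt::Error> {
        self.flush()?;
        Ok(self.out)
    }
}
```

The trade-off is the usual one: a smaller threshold bounds peak memory more tightly, at the cost of more frequent writes.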

@bors try @rust-timer queue

rust-bors · 2025-10-08T13:46:11Z

☀️ Try build successful (CI)
Build commit: 5ce920c (5ce920c89699da2a07c1b7692830149792b9b657, parent: 5767910cbcc9d199bf261a468574d45aa3857599)

rust-timer · 2025-10-08T15:18:04Z

Finished benchmarking commit (5ce920c): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.4% | [0.4%, 4.1%] | 19 |
| Regressions ❌ (secondary) | 2.3% | [0.2%, 8.4%] | 15 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.0% | [-0.0%, -0.0%] | 2 |
| All ❌✅ (primary) | 1.4% | [0.4%, 4.1%] | 19 |

Max RSS (memory usage)

Results (primary 2.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | [2.4%, 2.4%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.4% | [2.4%, 2.4%] | 1 |

Cycles

Results (primary 2.4%, secondary 3.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.4% | [2.4%, 2.4%] | 1 |
| Regressions ❌ (secondary) | 3.1% | [1.5%, 4.3%] | 6 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.4% | [2.4%, 2.4%] | 1 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 473.657s -> 473.163s (-0.10%)
Artifact size: 388.42 MiB -> 388.42 MiB (0.00%)

GuillaumeGomez · 2025-10-08T15:21:50Z

Memory usage dropped by a lot (still 2.4% more) and performance loss is now around 8.4%. Much better. :)

yotamofek · 2025-10-08T21:26:51Z

Memory usage dropped by a lot (still 2.4% more) and performance loss is now around 8.4%. Much better. :)

Nice!
Perf regression is getting very close to being worth the extra code clarity. I'll see if I have any other optimization ideas next week.

krtab · 2025-10-09T22:17:16Z

I think this helps restore performance: krtab@2cffe1f

Basically, we are creating a whole lot of elements, each with a one-element vec, before merging them together, and all these allocations were previously discarded. This commit reuses one such vec as much as possible.
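The general pattern being described can be sketched as follows. This is not the linked commit: `merge_fresh` and `merge_reusing` are hypothetical names, and the example only contrasts allocating a new one-element `Vec` per item with reusing a single scratch `Vec`.

```rust
/// Allocating version: every token gets a fresh one-element `Vec`
/// whose buffer is thrown away right after merging.
fn merge_fresh(tokens: &[&str]) -> Vec<String> {
    let mut merged = Vec::new();
    for token in tokens {
        let mut parts = vec![token.to_string()]; // new Vec allocation each iteration
        merged.append(&mut parts);
    }
    merged
}

/// Reusing version: one scratch `Vec` is filled and drained on every iteration,
/// so its buffer is allocated once and then recycled.
fn merge_reusing(tokens: &[&str]) -> Vec<String> {
    let mut merged = Vec::new();
    let mut scratch: Vec<String> = Vec::with_capacity(1);
    for token in tokens {
        scratch.push(token.to_string());
        // Draining moves the elements out; `scratch` stays usable for the next round.
        merged.extend(scratch.drain(..));
    }
    merged
}
```

The two functions return identical results; only the number of short-lived `Vec` buffers differs.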

rustbot assigned lolbinarycat Sep 24, 2025

rust-bors bot added a commit that referenced this pull request Sep 24, 2025

Auto merge of #146992 - GuillaumeGomez:improve-highlight, r=<try>

6020c97

Improve source code for `highlight.rs`

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 24, 2025

yotamofek reviewed Sep 24, 2025

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 24, 2025

GuillaumeGomez force-pushed the improve-highlight branch from dfcb725 to 957ca47 Compare September 27, 2025 21:25

yotamofek mentioned this pull request Sep 27, 2025

Small highlight.rs optimizations #147106

Open

GuillaumeGomez added 2 commits October 7, 2025 14:04

Improve source code for highlight.rs

99801ed

Make compatible stack elements "glue" together to prevent creating more HTML tags than necessary

bb5e335

GuillaumeGomez force-pushed the improve-highlight branch from 957ca47 to 2ae1bf4 Compare October 7, 2025 12:42

Improve performance

e6e8891

GuillaumeGomez force-pushed the improve-highlight branch from 2ae1bf4 to e6e8891 Compare October 7, 2025 15:50

rust-bors bot added a commit that referenced this pull request Oct 7, 2025

Auto merge of #146992 - GuillaumeGomez:improve-highlight, r=<try>

1629a3c

Improve source code for `highlight.rs`

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 7, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 7, 2025

Flush elements when there are too many

d160270

rust-bors bot added a commit that referenced this pull request Oct 8, 2025

Auto merge of #146992 - GuillaumeGomez:improve-highlight, r=<try>

5ce920c

Improve source code for `highlight.rs`

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 8, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 8, 2025
