Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the performance of the RtxtParser. #9839

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dellis1972
Copy link
Contributor

@dellis1972 dellis1972 commented Feb 25, 2025

There are some small improvements in the RtxtParser that can be done to improve the performance of the parser. This commit includes the following changes:

  1. Update the enum to be byte. This saves us a few bytes per instance.
  2. Reorder the fields to reduce padding.
  3. Encode is a type is a stylable into the RType enum.
  4. Pre-allocate the array so we reduce the number of allocations.

The following table shows the improvements for 10000 items. We have to use a large number to
see the improvements.

Method Mean Error StdDev Gen0 Gen1 Gen2 Allocated
RtxtParserBaseLine 53.40 ms 0.315 ms 0.246 ms 7500.0000 3000.0000 300.0000 67.32 MB
RtxtParser 48.58 ms 0.192 ms 0.179 ms 7454.5455 2454.5455 272.7273 60.44 MB

There are some small improvements in the RtxtParser that can be done to
improve the performance of the parser. This commit includes the following changes:

1. Update the `enum` to be `byte`. This saves us a few bytes per instance.
2. Reorder the fields to reduce padding.
3. Encode is a type is a stylable into the `RType` enum. This allows us to
   reduce the number of `if` statements in the parser.
@dellis1972
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Comment on lines +109 to +113
var lines = File.ReadAllLines (file);
result.Capacity = lines.Length;

int lineNumber = 0;
foreach (var line in File.ReadLines (file)) {
foreach (var line in lines) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part about this one, File.ReadAllLines() will allocate a string[] as large as the file. This can be faster for small files and worse for large files.

What are the results if you undid this change and just took a "guess" of the result.Capacity to some number like 1024, for example? (Or choose a common size)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll switch to just reading the file size that should do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants