-
Notifications
You must be signed in to change notification settings - Fork 1.9k
perf: optimize strpos by eliminating double iteration for UTF-8 #19572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
For non-ASCII strings, the original implementation used string.find() to get the byte index, then counted characters up to that byte index. This required two passes through the string. This optimization uses char_indices() to find the substring while simultaneously tracking character positions, completing the search in a single pass. Benchmark results (UTF-8 strings): - str_len_8: 188.98 µs → 140.54 µs (25.4% faster) - str_len_32: 615.69 µs → 294.15 µs (52.2% faster) - str_len_128: 2.2707 ms → 1.2462 ms (45.1% faster) - str_len_4096: 74.328 ms → 36.538 ms (50.9% faster) ASCII performance unchanged (already optimized with fast path).
d190f85 to
5ed05f7
Compare
|
run benchmark character_length |
This comment was marked as outdated.
This comment was marked as outdated.
(I just added this to the whitelist and will run now) |
|
run benchmark character_length |
|
🤖 |
|
run benchmark strpos |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark strpos |
|
🤖 |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @viirya -- this makes sense to me
| for (byte_idx, _) in string.char_indices() { | ||
| char_pos += 1; | ||
| if byte_idx + substring_bytes.len() <= string_bytes.len() | ||
| && &string_bytes[byte_idx..byte_idx + substring_bytes.len()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could potentially use string_bytes.get_unchecked here if it makes any difference as you validate the bounds check immediately above.
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
For non-ASCII strings, the original implementation used string.find() to get the byte index, then counted characters up to that byte index. This required two passes through the string.
This optimization uses char_indices() to find the substring while simultaneously tracking character positions, completing the search in a single pass.
Benchmark results (UTF-8 strings):
ASCII performance unchanged (already optimized with fast path).
Are these changes tested?
Are there any user-facing changes?