Make usage_item_array.data stable #1255

Merged: 1 commit, Apr 10, 2025

Conversation

yukawa
Collaborator

@yukawa yukawa commented Apr 8, 2025

Description

Previously, the items written to usage_item_array.data were ordered with std::sort rather than std::stable_sort. std::sort does not guarantee the relative order of elements that compare equal, so the content could easily vary across platforms (to be precise, across STL implementations).

By switching to std::stable_sort, elements that compare equal keep their input order, so the same content should be generated on every platform. That is a big win when diagnosing conversion-related issues.
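To illustrate the difference, here is a minimal sketch rather than the actual code in gen_usage_rewriter_dictionary_main.cc. The `UsageItem` struct, its fields, and `SortByKeyStable` are hypothetical names made up for this example. The comparator looks only at the key, so several items can compare equal. With std::stable_sort those items stay in input order, whereas std::sort would be free to emit them in any order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type; the generator's real data structure is not
// shown in this PR.
struct UsageItem {
  std::string key;   // the only field the comparator looks at
  int input_order;   // position in the input, kept here to observe ordering
};

// Sorts by key only. Items with equal keys keep their input order because
// std::stable_sort preserves the relative order of equivalent elements.
std::vector<UsageItem> SortByKeyStable(std::vector<UsageItem> items) {
  const auto by_key = [](const UsageItem& a, const UsageItem& b) {
    return a.key < b.key;
  };
  std::stable_sort(items.begin(), items.end(), by_key);
  assert(std::is_sorted(items.begin(), items.end(), by_key));
  return items;
}
```

Calling `SortByKeyStable({{"b", 0}, {"a", 1}, {"b", 2}, {"a", 3}})` yields the items in the order a/1, a/3, b/0, b/2 on any conforming standard library. Another way to get a deterministic result would be to keep std::sort and make the comparator a total order (for example by comparing the key and then a tie-breaker field), but that is a different change from the one made in this PR.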

Note that gen_usage_rewriter_dictionary_main.cc runs as a build step. Thus there should be no major performance implications at runtime.

Closes #1254.

Issue IDs

Steps to test new behaviors (if any)

A clear and concise description about how to verify new behaviors (if any).

  • OS: All
  • Steps:
    1. bazelisk build //data_manager/oss:mozc_dataset_for_oss --config oss_windows (on Windows)
    2. bazelisk build //data_manager/oss:mozc_dataset_for_oss --config oss_macos (on macOS)
    3. bazelisk build //data_manager/oss:mozc_dataset_for_oss --config oss_linux (on Linux)
    4. Confirm bazel-bin/data_manager/oss/usage_item_array.data is byte-for-byte identical across the three platforms.
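
For the last step, once the generated files have been copied onto one machine, a byte-for-byte comparison is enough. The helper below is a small sketch of such a check. `FilesAreIdentical` is a hypothetical name, not something in the Mozc tree, and the four-iterator std::equal overload it uses requires C++14 or later.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// Returns true only if both files can be opened and have identical bytes.
// Hypothetical helper for illustration; not part of the Mozc source tree.
bool FilesAreIdentical(const std::string& path_a, const std::string& path_b) {
  std::ifstream a(path_a, std::ios::binary);
  std::ifstream b(path_b, std::ios::binary);
  if (!a || !b) {
    return false;  // treat an unreadable file as a mismatch
  }
  std::istreambuf_iterator<char> it_a(a);
  std::istreambuf_iterator<char> it_b(b);
  std::istreambuf_iterator<char> end;
  // The four-iterator overload also fails when the lengths differ.
  return std::equal(it_a, end, it_b, end);
}
```

An existing tool such as cmp or a checksum utility does the same job; the point is only that the comparison should be on raw bytes, not on file size or timestamps.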

@hiroyuki-komatsu hiroyuki-komatsu merged commit 0152eb3 into google:master Apr 10, 2025
1 check passed
@hiroyuki-komatsu
Collaborator

We have merged your PR.
Thank you for the contribution!

@yukawa yukawa deleted the issue_1254 branch April 10, 2025 08:59
Development

Successfully merging this pull request may close these issues.

usage_item_array.data is not stable due to the usage of std::sort
2 participants