Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Intl.Collator implementation using ICU4C #1413

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Commits on May 30, 2024

  1. Add Intl.Collator implementation using ICU4C

    Classes without ICU / Platform API dependency:
    * Constants - String constants for options bag’s names and values,
    valid option values, other constants.
    * IntlUtils - Conversion between UTF-8 and UTF-16 ASCII strings,
    convert string to bool, i.e. case- insensitive match against
    “true”, lowercase ASCII strings.
    * LocaleBCP47Object - Wrapper over a
    BCP47Parser::ParsedLocaleIdentifier, provides factory method to
    create instances from a BCP47 locale string, methods to get
    canonicalized locale string and locale string without extensions,
    and implements the CanonicalizeLocaleList() spec operation.
    * OptionHelpers - Gets string, bool, and number option from options
    bag.
    
    ICU dependent classes:
    * Collator - Intl.Collator implementation using ICU collator API.
    * Locale - Conversion between BCP47 and ICU locale strings.
    * LocaleResolver - Implements ResolveLocale() and SupportedLocales()
    spec operations, supports both “lookup” and “best fit” locale matching.
    “best fit” locale matching uses ICU acceptLanguage API.
    
    Fix memory leaks in existing DateTimeFormat implementation in
    PlatformIntlICU.cpp by wrapping created ICU object pointers in
    unique_ptr.
    
    Migrate ResolveLocale, SupportedLocalesOf, GetCanonicalLocales
    API implementation to use the corresponding newly added supporting
    classes.
    
    Add test cases to verify invalid locales and options input would result
    in throwing JS exception in collation.js. Add tests cases to verify
    resolution of locale extensions and options in a new file
    collation-resolved-options.js.
    
    [Testing]
    
    Build succeeds on Ubuntu 20.04. Run the following JS tests on Ubuntu
    20.04, and all pass except for a few test cases in
    date-time-format-apple.js due to locale data differences. The output
    of running date-time-format-apple.js test is the same as before this
    change.
    
    ```
    $ ./build/bin/hermes ./hermes/test/hermes/intl/intl.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/get-canonical-locales.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/get-canonical-locales.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/collator.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/collator.js
    $ LC_ALL=fr_FR _HERMES_TEST_LOCALE=fr_FR ./build/bin/hermes ./hermes/test/hermes/intl/collator-resolved-options.js
    $ TZ=GMT ./build/bin/hermes ./hermes/test/hermes/intl/date-time-format-apple.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/date-time-format-apple.js
    ```
    
    Run valgrind with above test runs and no memory leak detected.
    
    Run Test262 tests for Collator, String-LocaleCompare, DateTimeFormat,
    Date, and Intl. All non-skipped test cases pass.
    
    ```
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Collator/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/String/prototype/localeCompare/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/DateTimeFormat/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Date/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Intl/
    ```
    robchu05 committed May 30, 2024
    Configuration menu
    Copy the full SHA
    817477c View commit details
    Browse the repository at this point in the history

Commits on Jul 18, 2024

  1. Update string constants type and move new files into impl_icu directory

    Change type of string constants in Constants.h to char16_t[] from
    std::u16string to avoid global constructors and unnecessary dynamic
    allocations. Also, change structs to namespaces and specify inline
    constexpr so they are compile-time constants and can be shared
    across multiple .cpp files.
    
    Move those newly added files for ICU-based implementation in
    lib/Platform/Intl into lib/Platform/Intl/impl_icu/.
    
    Review (const reference) std::(u16)string usages in function parameters
    and change them to std::(u16)string_view if implementation does not need
    a std::(u16)string object or a null-terminated C-style string, to avoid
    unnecessary temporary std::(u16)string object construction.
    
    [Testing]
    
    Build succeeds on Ubuntu 20.04. Run the following JS tests on Ubuntu
    20.04, and all pass except for a few test cases in
    date-time-format-apple.js due to locale data differences. The output
    of running date-time-format-apple.js test is the same as before this
    change.
    
    ```
    $ ./build/bin/hermes ./hermes/test/hermes/intl/intl.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/get-canonical-locales.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/get-canonical-locales.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/collator.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/collator.js
    $ LC_ALL=fr_FR _HERMES_TEST_LOCALE=fr_FR ./build/bin/hermes ./hermes/test/hermes/intl/collator-resolved-options.js
    $ TZ=GMT ./build/bin/hermes ./hermes/test/hermes/intl/date-time-format-apple.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/date-time-format-apple.js
    ```
    
    Run valgrind with above test runs and no memory leak detected.
    
    Run Test262 tests for Collator, String-LocaleCompare, DateTimeFormat,
    Date, and Intl. All non-skipped test cases pass.
    
    ```
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Collator/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/String/prototype/localeCompare/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/DateTimeFormat/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Date/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Intl/
    ```
    robchu05 committed Jul 18, 2024
    Configuration menu
    Copy the full SHA
    77545fb View commit details
    Browse the repository at this point in the history

Commits on Aug 30, 2024

  1. Avoid static global constant with dynamically allocated structures

    Change the cache of available locales to a function local static from
    a static class member in LocaleResolver.
    
    Turn classes with only static methods to namespaced functions.
    
    Use arrays for small constant sets and arrays of pairs for small
    constant maps.
    
    [Testing]
    
    Build succeeds on Ubuntu 20.04. Run the following JS tests on Ubuntu
    20.04, and all pass except for a few test cases in
    date-time-format-apple.js due to locale data differences. The output
    of running date-time-format-apple.js test is the same as before this
    change.
    
    ```
    $ ./build/bin/hermes ./hermes/test/hermes/intl/intl.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/get-canonical-locales.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/get-canonical-locales.js
    $ ./build/bin/hermes ./hermes/test/hermes/intl/collator.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/collator.js
    $ LC_ALL=fr_FR _HERMES_TEST_LOCALE=fr_FR ./build/bin/hermes ./hermes/test/hermes/intl/collator-resolved-options.js
    $ TZ=GMT ./build/bin/hermes ./hermes/test/hermes/intl/date-time-format-apple.js | ./build/bin/FileCheck --match-full-lines ./hermes/test/hermes/intl/date-time-format-apple.js
    ```
    
    Run valgrind with above test runs and no memory leak detected.
    
    Run Test262 tests for Collator, String-LocaleCompare, DateTimeFormat,
    Date, and Intl. All non-skipped test cases pass.
    
    ```
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Collator/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/String/prototype/localeCompare/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/DateTimeFormat/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Date/
    $ ./hermes/utils/testsuite/run_testsuite.py -b ./build/bin --test-intl -a ~/Downloads/test262/test/intl402/Intl/
    ```
    robchu05 committed Aug 30, 2024
    Configuration menu
    Copy the full SHA
    31add87 View commit details
    Browse the repository at this point in the history