@eggrobin eggrobin commented Oct 4, 2024

[UTC-181-C10] Consensus: Provisionally assign 2 code points U+AB6C LATIN CAPITAL LETTER SCRIPT R and U+AB6D LATIN CAPITAL LETTER SCRIPT R WITH RING as described in L2/24-243. [Ref: 2.3 in L2/24-228]

[185-C40] Consensus: UTC accepts for encoding in Unicode 18.0 the following 321 Arabic, Armenian, Bengali, Cuneiform, Devanagari, Hebrew, Kana, Khitan, Latin, Mongolian, Phonetic and other symbol characters for which code points have previously been assigned:

  1. Arabic (39 characters—ref. 180-C22, 180-C26): 10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9
  2. Armenian (3 characters—ref. 179-C46): 0558, 058B..058C
  3. Bengali (1 character—ref. 180-C30): 0984
  4. Cuneiform numerals (12 characters—ref. 182-C3): 1246F, 12475..1247F
  5. Devanagari (1 character—ref. 182-C5): 11B0A
  6. Hebrew (1 character—ref. 182-C4): 05C8
  7. Kana (7 characters—ref. 180-C6, 182-C31, 183-C54, 184-C38): 1B123..1B125, 1B126, 1B127..1B128, 1B168
  8. Khitan (5 characters—ref. 184-C5): 18CD6..18CDA
  9. Latin (54 characters—ref. 181-C8, 181-C10, 182-C6, 182-C7, 182-C8, 182-C9, 183-C8): 2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, 1DF5A..1DF66, 1DF67, 1DF68..1DF81, 1DFCD..1DFCF
  10. Mongolian (1 character—ref. 178-C30): 1879
  11. Phonetic (114 characters—ref. 179-C55, 179-C59, 179-C60, 180-C32, 180-C33, 180-C34, 180-C35, 180-C36, 180-C37, 181-C33, 181-C34, 181-C35, 181-C36, 181-C45, 183-C10): 1ADE..1ADF, 1AEC..1AF0, 208F, 209D..209F, 107BB..107BF, 1DF1F..1DF24, 1DF2B..1DF2C, 1DF2D..1DF3A, 1DF3B..1DF3D, 1DF3E..1DF3F, 1DF40..1DF56, 1DFD0, 1DFD1, 1DFD2..1DFD7, 1DFD8..1DFE8, 1DFE9..1DFF2, 1DFF3..1DFF4, 1DFF5..1DFF9, 1DFFA..1DFFF
  12. Symbols (81 characters—ref. 178-C31, 178-C36, 178-C37, 180-C38, 180-C39, 180-C40, 181-C38, 181-C39, 181-C40, 182-C10, 182-C11, 183-C12, 183-C13, 184-C18): 20C2, 1CEF1..1CEF5, 1D127..1D128, 1D1EB..1D1F6, 1D1F7..1D1FE, 1D1FF, 1D250..1D255, 1D256..1D25A, 1D25B..1D25F, 1D260, 1D261, 1D262..1D27F, 1D280..1D281, 1F1AE, 1F7DA
  13. Tangut (2 characters—ref. 183-C7, 184-C4): 18D1F..18D20
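
The per-script counts in the list above can be cross-checked against the code point ranges with a few lines of Python. This is only a sanity-check sketch: the helper `count_code_points` and the `CLAIMED` table are names made up for this illustration and are not part of unicodetools. The counts and ranges are copied from the consensus text.

```python
def count_code_points(ranges: str) -> int:
    """Count code points in a list such as '0558, 058B..058C' (hex, inclusive)."""
    total = 0
    for item in ranges.split(","):
        first, _, last = item.strip().partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # a lone code point is a range of one
        total += end - start + 1
    return total


# Hypothetical table: script -> (count stated in the consensus, ranges as listed).
CLAIMED = {
    "Arabic": (39, "10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9"),
    "Armenian": (3, "0558, 058B..058C"),
    "Bengali": (1, "0984"),
    "Cuneiform numerals": (12, "1246F, 12475..1247F"),
    "Devanagari": (1, "11B0A"),
    "Hebrew": (1, "05C8"),
    "Kana": (7, "1B123..1B125, 1B126, 1B127..1B128, 1B168"),
    "Khitan": (5, "18CD6..18CDA"),
    "Latin": (54, "2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, "
                  "1DF5A..1DF66, 1DF67, 1DF68..1DF81, 1DFCD..1DFCF"),
    "Mongolian": (1, "1879"),
    "Phonetic": (114, "1ADE..1ADF, 1AEC..1AF0, 208F, 209D..209F, 107BB..107BF, "
                      "1DF1F..1DF24, 1DF2B..1DF2C, 1DF2D..1DF3A, 1DF3B..1DF3D, "
                      "1DF3E..1DF3F, 1DF40..1DF56, 1DFD0, 1DFD1, 1DFD2..1DFD7, "
                      "1DFD8..1DFE8, 1DFE9..1DFF2, 1DFF3..1DFF4, 1DFF5..1DFF9, "
                      "1DFFA..1DFFF"),
    "Symbols": (81, "20C2, 1CEF1..1CEF5, 1D127..1D128, 1D1EB..1D1F6, "
                    "1D1F7..1D1FE, 1D1FF, 1D250..1D255, 1D256..1D25A, "
                    "1D25B..1D25F, 1D260, 1D261, 1D262..1D27F, 1D280..1D281, "
                    "1F1AE, 1F7DA"),
    "Tangut": (2, "18D1F..18D20"),
}

# Collect every script whose stated count disagrees with its ranges.
mismatches = {
    script: (stated, count_code_points(ranges))
    for script, (stated, ranges) in CLAIMED.items()
    if stated != count_code_points(ranges)
}
total = sum(count_code_points(ranges) for _, ranges in CLAIMED.values())
print(mismatches, total)
```

With the ranges as listed, `mismatches` is empty and `total` is 321, matching the figure in the consensus.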

@eggrobin eggrobin requested a review from markusicu November 28, 2025 16:39
@eggrobin eggrobin marked this pull request as ready for review November 28, 2025 16:40
Comment on lines +12 to +15
Propertywise [\x{AB4B} ꭋ \N{LATIN SMALL LETTER SCRIPT R} \x{AB4C} ꭌ \N{LATIN SMALL LETTER SCRIPT R WITH RING}]
: [\x{AB6C} \N{LATIN CAPITAL LETTER SCRIPT R} \x{AB6D} \N{LATIN CAPITAL LETTER SCRIPT R WITH RING}]
CorrespondTo [\x{019B} ƛ \N{LATIN SMALL LETTER LAMBDA WITH STROKE}]
: [\x{A7DC} Ƛ \N{LATIN CAPITAL LETTER LAMBDA WITH STROKE}]
Member

isn't this testing the existing lowercase letters, with exception-value mappings to the new ones?

Member Author

It is testing the properties of both the lowercase and the uppercase characters: they are in the :-separated part, not in the -separated part, and Propertywise X : Y CorrespondTo Z : T is equivalent to Propertywise Y : X CorrespondTo T : Z.

This means that we are testing both that the new uppercase letters have the right properties and that the old lowercase letters now have the new ones as their uppercase_mapping (and as the value of whatever other properties map U+019B to U+A7DC, so titlecase_mapping and the simple mappings).
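
To make the symmetry argument concrete, here is a toy model of the check in Python. It is a simplified reading of the explanation above, not the unicodetools implementation. The function `corresponds`, the `PROPS` table, and the three property names are invented for the sketch, and the values for U+AB4B and U+AB6C encode the outcome the test is meant to assert rather than data read from the UCD.

```python
import copy

# Hypothetical miniature property table (not real UCD data).
PROPS = {
    "\u019B": {"gc": "Ll", "upper": "\uA7DC", "lower": "\u019B"},  # reference lowercase
    "\uA7DC": {"gc": "Lu", "upper": "\uA7DC", "lower": "\u019B"},  # reference uppercase
    "\uAB4B": {"gc": "Ll", "upper": "\uAB6C", "lower": "\uAB4B"},  # existing SCRIPT R
    "\uAB6C": {"gc": "Lu", "upper": "\uAB6C", "lower": "\uAB4B"},  # new CAPITAL SCRIPT R
}


def corresponds(x, y, z, t, props):
    """Toy version of 'Propertywise x : y CorrespondTo z : t'.

    Every property of x must equal the same property of z, and every
    property of y the same property of t, after renaming z -> x and
    t -> y inside the property values.
    """
    rename = {z: x, t: y}
    for candidate, reference in ((x, z), (y, t)):
        for name, ref_value in props[reference].items():
            expected = rename.get(ref_value, ref_value)
            if props[candidate][name] != expected:
                return False
    return True


# Both orderings check the same pairs, so they agree.
forward = corresponds("\uAB4B", "\uAB6C", "\u019B", "\uA7DC", PROPS)
swapped = corresponds("\uAB6C", "\uAB4B", "\uA7DC", "\u019B", PROPS)

# A stale table in which the old lowercase letter still uppercases to itself.
stale = copy.deepcopy(PROPS)
stale["\uAB4B"]["upper"] = "\uAB4B"
stale_result = corresponds("\uAB4B", "\uAB6C", "\u019B", "\uA7DC", stale)
```

In this model `forward` and `swapped` are both true, while `stale_result` is false: the check fails on the existing lowercase letter if its uppercase mapping was not updated, which is the second half of what the comment describes.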

@markusicu markusicu left a comment

otherwise lgtm

@eggrobin eggrobin merged commit b6cb648 into unicode-org:main Nov 28, 2025
23 of 24 checks passed