Capital ꭋꭌ #945

eggrobin · 2024-10-04T01:32:00Z

[UTC-181-C10] Consensus: Provisionally assign 2 code points U+AB6C LATIN CAPITAL LETTER SCRIPT R and U+AB6D LATIN CAPITAL LETTER SCRIPT R WITH RING as described in L2/24-243. [Ref: 2.3 in L2/24-228]

[185-C40] Consensus: UTC accepts for encoding in Unicode 18.0 the following 321 Arabic, Armenian, Bengali, Cuneiform, Devanagari, Hebrew, Kana, Khitan, Latin, Mongolian, Phonetic and other symbol characters for which code points have previously been assigned:

Arabic (39 characters—ref. 180-C22, 180-C26): 10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9
Armenian (3 characters—ref. 179-C46): 0558, 058B..058C
Bengali (1 character—ref. 180-C30): 0984
Cuneiform numerals (12 characters—ref. 182-C3): 1246F, 12475..1247F
Devanagari (1 character—ref. 182-C5): 11B0A
Hebrew (1 character—ref. 182-C4): 05C8
Kana (7 characters—ref. 180-C6, 182-C31, 183-C54, 184-C38): 1B123..1B125, 1B126, 1B127..1B128, 1B168
Khitan (5 characters—ref. 184-C5): 18CD6..18CDA
Latin (54 characters—ref. 181-C8, 181-C10, 182-C6, 182-C7, 182-C8, 182-C9, 183-C8): 2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, 1DF5A..1DF66, 1DF67, 1DF68..1DF81, 1DFCD..1DFCF
Mongolian (1 character—ref. 178-C30): 1879
Phonetic (114 characters—ref. 179-C55, 179-C59, 179-C60, 180-C32, 180-C33, 180-C34, 180-C35, 180-C36, 180-C37, 181-C33, 181-C34, 181-C35, 181-C36, 181-C45, 183-C10): 1ADE..1ADF, 1AEC..1AF0, 208F, 209D..209F, 107BB..107BF, 1DF1F..1DF24, 1DF2B..1DF2C, 1DF2D..1DF3A, 1DF3B..1DF3D, 1DF3E..1DF3F, 1DF40..1DF56, 1DFD0, 1DFD1, 1DFD2..1DFD7, 1DFD8..1DFE8, 1DFE9..1DFF2, 1DFF3..1DFF4, 1DFF5..1DFF9, 1DFFA..1DFFF
Symbols (81 characters—ref. 178-C31, 178-C36, 178-C37, 180-C38, 180-C39, 180-C40, 181-C38, 181-C39, 181-C40, 182-C10, 182-C11, 183-C12, 183-C13, 184-C18): 20C2, 1CEF1..1CEF5, 1D127..1D128, 1D1EB..1D1F6, 1D1F7..1D1FE, 1D1FF, 1D250..1D255, 1D256..1D25A, 1D25B..1D25F, 1D260, 1D261, 1D262..1D27F, 1D280..1D281, 1F1AE, 1F7DA
Tangut (2 characters—ref. 183-C7, 184-C4): 18D1F..18D20
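
The per-script counts quoted above can be cross-checked against the listed ranges. The following is a minimal Python sketch, assuming that `A..B` denotes an inclusive range of hexadecimal code points; the helper name `count_code_points` is hypothetical and is not part of unicodetools.

```python
def count_code_points(ranges: str) -> int:
    """Count code points in a comma-separated list of hex values and inclusive A..B ranges."""
    total = 0
    for item in ranges.split(","):
        # "10EC9..10ECF" -> ("10EC9", "..", "10ECF"); "A7DD" -> ("A7DD", "", "")
        first, _, last = item.strip().partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        total += end - start + 1
    return total


# Range lists copied from the Arabic and Latin lines of 185-C40.
arabic = "10EC9..10ECF, 10ED9..10EEE, 10EF0..10EF9"
latin = ("2E60..2E63, A7DD, A7E2, AB6C..AB6D, 1DF57..1DF59, 1DF5A..1DF66, "
         "1DF67, 1DF68..1DF81, 1DFCD..1DFCF")

print(count_code_points(arabic))  # 39
print(count_code_points(latin))   # 54
```

Both results agree with the counts stated in the consensus text, and the Latin list includes the two code points AB6C..AB6D that this pull request adds.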

markusicu · 2025-11-28T16:46:58Z

unicodetools/src/main/resources/org/unicode/text/UCD/AdditionComparisons/148.txt

+Propertywise [\x{AB4B} ꭋ \N{LATIN SMALL LETTER SCRIPT R}   \x{AB4C} ꭌ \N{LATIN SMALL LETTER SCRIPT R WITH RING}]
+           : [\x{AB6C}   \N{LATIN CAPITAL LETTER SCRIPT R} \x{AB6D}   \N{LATIN CAPITAL LETTER SCRIPT R WITH RING}]
+CorrespondTo [\x{019B} ƛ \N{LATIN SMALL LETTER LAMBDA WITH STROKE}]
+           : [\x{A7DC} Ƛ \N{LATIN CAPITAL LETTER LAMBDA WITH STROKE}]


isn't this testing the existing lowercase letters, with exception-value mappings to the new ones?

It is testing the properties of both the lowercase and the uppercase characters; they are in the :-separated part, not in the ⧴-separated part, and Propertywise X : Y CorrespondTo Z : T is equivalent to Propertywise Y : X CorrespondTo T : Z.

This means that we are testing two things: that the new uppercase letters have the right properties, and that the old lowercase letters now have the new ones as their uppercase_mapping (and as the value of any other property that maps U+019B to U+A7DC, so titlecase_mapping and the simple mappings).
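
The two checks described above can be illustrated outside unicodetools. The sketch below is a simplified Python illustration, not the invariants test itself. It assumes the standard semicolon-separated UnicodeData.txt layout (field 2 is General_Category; fields 12, 13 and 14 are the simple uppercase, lowercase and titlecase mappings). The sample lines are assumptions reconstructed from the character names in the test, not copied from the pull request, and `parse` and `check_case_pair` are hypothetical helper names.

```python
# Hypothetical UnicodeData.txt lines for the two case pairs (assumed, not from the PR).
SAMPLE = """\
AB4B;LATIN SMALL LETTER SCRIPT R;Ll;0;L;;;;;N;;;AB6C;;AB6C
AB4C;LATIN SMALL LETTER SCRIPT R WITH RING;Ll;0;L;;;;;N;;;AB6D;;AB6D
AB6C;LATIN CAPITAL LETTER SCRIPT R;Lu;0;L;;;;;N;;;;AB4B;
AB6D;LATIN CAPITAL LETTER SCRIPT R WITH RING;Lu;0;L;;;;;N;;;;AB4C;
"""


def parse(text):
    """Map each code point (hex string) to the fields used by the check."""
    records = {}
    for line in text.splitlines():
        f = line.split(";")
        records[f[0]] = {"name": f[1], "gc": f[2],
                         "upper": f[12], "lower": f[13], "title": f[14]}
    return records


def check_case_pair(records, small, capital):
    """Return a list of problems with a small/capital pair; empty means consistent."""
    problems = []
    s, c = records[small], records[capital]
    if s["gc"] != "Ll":
        problems.append(f"U+{small} should have gc=Ll, found {s['gc']}")
    if c["gc"] != "Lu":
        problems.append(f"U+{capital} should have gc=Lu, found {c['gc']}")
    # The old lowercase letter must now map to the new capital.
    if s["upper"] != capital:
        problems.append(f"U+{small} uppercase mapping is {s['upper'] or 'empty'}")
    if s["title"] != capital:
        problems.append(f"U+{small} titlecase mapping is {s['title'] or 'empty'}")
    # The new capital must map back to the old lowercase letter.
    if c["lower"] != small:
        problems.append(f"U+{capital} lowercase mapping is {c['lower'] or 'empty'}")
    return problems


records = parse(SAMPLE)
for small, capital in [("AB4B", "AB6C"), ("AB4C", "AB6D")]:
    print(small, capital, check_case_pair(records, small, capital))
```

With the sample data both pairs produce an empty problem list. A mapping that points at the wrong capital is reported on the lowercase side, which is the direction markusicu asked about.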

markusicu

otherwise lgtm

eggrobin added 6 commits October 4, 2024 00:35

UnicodeData.txt lines from the proposal

0cc0abd

Bad uppercase mapping

60cb8bc

lb=AL

0ecab6f

Latin

0ded1a8

Regenerate UCD

bb88c49

test

ecc68ad

eggrobin added data-for-new pipeline-recommended-to-UTC labels Oct 4, 2024

Merge remote-tracking branch 'la-vache/main' into capital-ꭋꭌ

dcdbe34

eggrobin added pipeline-provisionally-assigned and removed pipeline-recommended-to-UTC labels Nov 6, 2024

markusicu added pipeline-18.0 and removed pipeline-provisionally-assigned labels Nov 24, 2025

eggrobin added 2 commits November 28, 2025 17:39

Merge remote-tracking branch 'la-vache/main' into capital-ꭋꭌ

3f2fb9f

Ignore IDNA2008_Category

ce8961d

eggrobin requested a review from markusicu November 28, 2025 16:39

eggrobin marked this pull request as ready for review November 28, 2025 16:40

markusicu reviewed Nov 28, 2025

markusicu approved these changes Nov 28, 2025

eggrobin merged commit b6cb648 into unicode-org:main Nov 28, 2025
23 of 24 checks passed
