Add word-break setting to boundary analysis #89

Merged
merged 1 commit into from
Mar 25, 2025

valadaptive
Contributor

This is necessary for implementing linebender/parley#303.

In the long term, Parley should move to some other Unicode analysis crate, but this should do for now.

@valadaptive
Contributor Author

Actually, let me see if there's a way to do this per-cluster

@valadaptive force-pushed the word-break-setting branch 3 times, most recently from 3053aa4 to 83d62cc on March 25, 2025 15:32
@valadaptive
Contributor Author

I've got an implementation that seems to work for Parley, but I suspect it may still be incorrect.

The API is not very elegant: the iterator returned by `text::analyze` now has a `set_break_strength` method. This makes iteration more awkward if you choose to use it, but has the benefit of not breaking backwards compatibility.
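To illustrate why a setter on the iterator makes iteration more awkward, here is a minimal, self-contained sketch. The `Analyze` and `BreakStrength` types and the toy break rules below are hypothetical stand-ins written for this example; they are not Swash's actual types or algorithm, and only the method name `set_break_strength` comes from the description above.

```rust
/// Hypothetical word-break setting, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BreakStrength {
    Normal,
    BreakAll,
}

/// Toy analyzer that yields `(char, break_allowed_before)` pairs.
struct Analyze<'a> {
    chars: std::str::Chars<'a>,
    prev: Option<char>,
    strength: BreakStrength,
}

impl<'a> Analyze<'a> {
    fn new(text: &'a str) -> Self {
        Analyze { chars: text.chars(), prev: None, strength: BreakStrength::Normal }
    }

    /// Changes the setting applied to characters yielded from now on.
    fn set_break_strength(&mut self, strength: BreakStrength) {
        self.strength = strength;
    }
}

impl Iterator for Analyze<'_> {
    type Item = (char, bool);

    fn next(&mut self) -> Option<Self::Item> {
        let ch = self.chars.next()?;
        let can_break = match (self.prev, self.strength) {
            // Never break before the first character.
            (None, _) => false,
            // Toy rule: break only before a non-space that follows a space.
            (Some(p), BreakStrength::Normal) => p == ' ' && ch != ' ',
            // Toy rule: break before any non-space character.
            (Some(_), BreakStrength::BreakAll) => ch != ' ',
        };
        self.prev = Some(ch);
        Some((ch, can_break))
    }
}

/// Returns the char indices before which a break is allowed, switching to
/// `BreakAll` once the character at index `switch_at` is reached.
fn break_positions(text: &str, switch_at: usize) -> Vec<usize> {
    let mut iter = Analyze::new(text);
    let mut out = Vec::new();
    let mut index = 0;
    // A `for` loop would take ownership of `iter`, so the setter could not
    // be called between items; an explicit loop keeps `iter` accessible.
    loop {
        if index == switch_at {
            iter.set_break_strength(BreakStrength::BreakAll);
        }
        match iter.next() {
            Some((_, true)) => out.push(index),
            Some(_) => {}
            None => break,
        }
        index += 1;
    }
    out
}
```

Callers that never touch the setter can keep using a plain `for` loop, which is the backwards-compatibility benefit mentioned above.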

I'm not sure what the extra entries in the pair table are for, but they seem to carry some extra state. To implement `break-all`, which requires characters to be treated as ID if they're AL | NU | SA, I'm checking whether the previous character and the current character are AL | NU | SA, and treating each as ID for the purposes of the current line break opportunity only. I'm not sure how this interacts with the extra pair table entries, which is why I say this may be incorrect.
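The per-opportunity remapping described here can be sketched as follows. This is a simplified illustration, not the code in this PR: the `Class` enum, the crude `classify` function, and the tiny pair rule are assumptions made for the example, and the sketch ignores the stateful pair-table entries mentioned above.

```rust
/// A few UAX #14 line-breaking classes, plus a catch-all.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    Al, // alphabetic
    Nu, // numeric
    Sa, // complex-context (South East Asian)
    Id, // ideographic
    Sp, // space
    Other,
}

/// Crude classifier for demonstration; real data comes from Unicode tables.
fn classify(ch: char) -> Class {
    match ch {
        ' ' => Class::Sp,
        '0'..='9' => Class::Nu,
        'a'..='z' | 'A'..='Z' => Class::Al,
        '\u{0E00}'..='\u{0E7F}' => Class::Sa, // Thai block, approximated
        '\u{4E00}'..='\u{9FFF}' => Class::Id, // CJK Unified Ideographs
        _ => Class::Other,
    }
}

/// Under `break-all`, AL, NU, and SA are treated as ID.
fn as_ideographic(class: Class) -> Class {
    match class {
        Class::Al | Class::Nu | Class::Sa => Class::Id,
        other => other,
    }
}

/// Decides one break opportunity. The remapping is local to this call, so
/// it affects the current opportunity only and never persists.
fn break_allowed(prev: Class, cur: Class, break_all: bool) -> bool {
    use Class::*;
    let (prev, cur) = if break_all {
        (as_ideographic(prev), as_ideographic(cur))
    } else {
        (prev, cur)
    };
    match (prev, cur) {
        (_, Sp) => false, // no break before a space
        (Sp, _) => true,  // break after a space
        // Toy subset of the pair rules: breaks are allowed around ideographs.
        (Id, Id | Al | Nu | Sa) | (Al | Nu | Sa, Id) => true,
        _ => false,
    }
}

/// Returns the char indices before which a break is allowed.
fn break_opportunities(text: &str, break_all: bool) -> Vec<usize> {
    let classes: Vec<Class> = text.chars().map(classify).collect();
    classes
        .windows(2)
        .enumerate()
        .filter(|(_, pair)| break_allowed(pair[0], pair[1], break_all))
        .map(|(i, _)| i + 1)
        .collect()
}
```

Because the original classes are stored and only remapped inside `break_allowed`, switching `break_all` on or off between characters does not corrupt any earlier decision. Whether the same holds for a table that carries state between pairs is exactly the open question raised above.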

Collaborator

@xorgy left a comment

This makes sense to me.

Collaborator

@xorgy left a comment

Upon looking more closely at the Parley PR I think it might make sense to consolidate the definitions of WordBreakStrength since they are otherwise the same.

@valadaptive
Contributor Author

I've updated the definition of WordBreakStrength here so it can be used in Parley.

@xorgy merged commit 63e57ba into dfrg:main on Mar 25, 2025
10 checks passed
@xorgy added this to the v0.3.0 milestone on Mar 26, 2025
github-merge-queue bot pushed a commit to linebender/parley that referenced this pull request Apr 3, 2025
Requires upstream support in Swash
(dfrg/swash#89).

This implements the CSS `word-break` and `overflow-wrap` style
properties (including the latter's influence on the layout's min-content
size).
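As a rough illustration of how `overflow-wrap` can influence min-content size, the sketch below follows the CSS rule that the extra break opportunities from `overflow-wrap: anywhere` count toward min-content sizing, while those from `break-word` do not. It is a toy model, not Parley's implementation: it assumes every character has the same advance and that words are separated by ASCII spaces, and the names `OverflowWrap` and `min_content_width` are chosen for this example.

```rust
/// The CSS `overflow-wrap` keywords.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OverflowWrap {
    Normal,
    BreakWord,
    Anywhere,
}

/// Toy min-content width: the width of the widest unbreakable unit, where
/// each character is `advance` units wide and words are split on spaces.
fn min_content_width(text: &str, advance: u32, wrap: OverflowWrap) -> u32 {
    let widest_unit = text
        .split(' ')
        .map(|word| {
            let len = word.chars().count();
            match wrap {
                // Any character boundary may break, so the widest
                // unbreakable unit is a single character.
                OverflowWrap::Anywhere => len.min(1),
                // `break-word` may break inside words during layout, but
                // those breaks are ignored for min-content sizing.
                OverflowWrap::Normal | OverflowWrap::BreakWord => len,
            }
        })
        .max()
        .unwrap_or(0);
    widest_unit as u32 * advance
}
```

With an advance of 10, the text `a longerword` has a min-content width of 100 under `normal` and `break-word`, and 10 under `anywhere`.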

This replaces the "emergency break" logic with something that should be
simpler.

Different browsers seem to disagree on exactly which ranges of text the
properties apply to. The behavior implemented here should be reasonable.

In the table below, red is the default wrapping settings, blue is
`overflow-wrap: anywhere`, and green is `word-break: break-all`:

| Gecko | Blink | WebKit | Parley |
| - | - | - | - |
| ![image](https://github.com/user-attachments/assets/044339f7-2685-42c7-bd83-a20ef48ad9a6) | ![image](https://github.com/user-attachments/assets/3d5613cd-0393-471e-97f5-590ecaffb7c4) | ![image](https://github.com/user-attachments/assets/31102883-0081-4788-a5f0-b13413434d1d) | ![image](https://github.com/user-attachments/assets/a8f68cb7-579f-442a-a60f-37aacc794b42) |
| ![image](https://github.com/user-attachments/assets/355be2fd-8c12-466a-ac10-c87a7ed76e3c) | ![image](https://github.com/user-attachments/assets/0c28b246-6048-43e5-9b93-861e38a21e62) | ![image](https://github.com/user-attachments/assets/661f362c-ecff-47d4-894e-c1508b9ddd0f) | ![image](https://github.com/user-attachments/assets/fac5d47f-ba24-4611-9b80-6c39528d34bf) |
| ![image](https://github.com/user-attachments/assets/1a8921cf-70ff-4b77-abc5-648f5bc95b76) | ![image](https://github.com/user-attachments/assets/f9b6b79a-fd01-4402-b3e7-ec5fe97250e7) | ![image](https://github.com/user-attachments/assets/7d5f81b1-c602-4f64-b328-0a3e429ea327) | ![image](https://github.com/user-attachments/assets/fef979e8-b55a-46ca-b074-dd7649c06316) |
| ![image](https://github.com/user-attachments/assets/e927b6f9-a993-4085-a695-59f0a4001639) | ![image](https://github.com/user-attachments/assets/38c548ee-45a7-474f-91b4-3e95ebc36060) | ![image](https://github.com/user-attachments/assets/d052f5ac-5d08-4fe5-abf8-4831dafe974a) | ![image](https://github.com/user-attachments/assets/6e4984b0-58c7-4bb8-84b9-dbbad1414306) |
| ![image](https://github.com/user-attachments/assets/f2b6e871-acc6-4396-ab6f-34e92eea38d2) | ![image](https://github.com/user-attachments/assets/4069c51c-4b0b-4fe6-aa45-cce26e4603be) | ![image](https://github.com/user-attachments/assets/5f4ecd2b-0adb-4ddb-b7a5-4980ec4e40d0) | ![image](https://github.com/user-attachments/assets/c2b750a7-f92e-4b67-b540-91c02c9745e2) |
| ![image](https://github.com/user-attachments/assets/959ac5c3-bd67-4075-bb58-a35501a7b378) | ![image](https://github.com/user-attachments/assets/6b105308-dbe8-4049-9bc7-4fb10662d951) | ![image](https://github.com/user-attachments/assets/b2e9d1cf-3601-4232-8304-80123db546b7) | ![image](https://github.com/user-attachments/assets/3a717e75-7c3a-48d5-97cd-6fb27574de69) \* |
| ![image](https://github.com/user-attachments/assets/ad92d9d1-625c-4722-9ffe-d696b7559a7e) | ![image](https://github.com/user-attachments/assets/d3bbfb25-52f1-4785-b972-104889534ece) | ![image](https://github.com/user-attachments/assets/9a16f48e-9208-4140-8f57-a224103e60fd) | ![word_break_break_all_second_half-0](https://github.com/user-attachments/assets/0b59508d-e38f-4a8d-99f3-bfa46caf9d8b) |

\* *This doesn't match browsers, but the behavior is officially
unspecified per w3c/csswg-drafts#3897.*

2 participants