Strange behavior of WordWrap if there is a "/" symbol in the middle of a word. #448

Open
4 tasks done
IshmaZX82 opened this issue Feb 19, 2025 · 2 comments

@IshmaZX82

Prerequisites

  • I have written a descriptive issue title
  • I have verified that I am running the latest version of Fonts
  • I have verified whether the problem exists in both DEBUG and RELEASE modes
  • I have searched open and closed issues to ensure it has not already been reported

Fonts version

2.1.2

Other Six Labors packages and versions

SixLabors.ImageSharp.Drawing 2.1.5, SixLabors.ImageSharp 3.1.6

Environment (Operating system, version and so on)

Windows 11

.NET Framework version

.NET 8

Description

I noticed strange WordWrap behavior when a "/" symbol appears in the middle of a word.
For example, draw the text string "aaaaa bbbbb/ccccc ddddd" with a width limit.
The picture on the left shows the expected behavior (as in GDI, GDI+, WPF, Word, Chrome, etc.); the picture on the right shows the actual result:

[Image: expected wrapping (left) and actual result (right)]

Note: if the line contains only one long word with a "/" symbol in the middle, breaking at the slash is the correct behavior.
But if the line contains more than one word, the word containing the slash should be wrapped as a whole.

Steps to Reproduce

Draw the text string "aaaaa bbbbb/ccccc ddddd" with a width limit.

Images

No response

@JimBobSquarePants
Member

JimBobSquarePants commented Feb 26, 2025

OK. This is an interesting one...

In Unicode, a slash (/) is a breaking opportunity in most contexts. According to the Unicode Line Breaking Algorithm (UAX 14):

  • The slash (U+002F) has the line breaking class SY (Break Symbols).
  • By default, SY allows a break after it but not before it (LB13), and additional rules (such as LB25) prevent a break after it in numeric contexts.

Key Line Break Rules Affecting /:

  1. LB13 (no break before /)

    • Do not break before a slash, even after spaces (× SY).
    • Example: in word/word, a break can fall after the slash, never before it.
  2. LB25 (numeric contexts)

    • Keeps numeric sequences together, including a slash followed by a digit (SY × NU).
    • Example:
      • word/word → May break after the slash.
      • 123/456 → No break.
  3. LB31 (default)

    • Any pair not covered by an earlier rule may break, so a slash followed by a letter or an ideographic character (East Asian text) is a break opportunity.

Practical Examples:

Text          Break Allowed?
hello/world   ✅ Possible
123/456       ❌ No break
分/字          ✅ Possible
path/to/file  ✅ Possible

Conclusion:

A slash (/) is a potential breaking opportunity in Unicode, but not in numeric contexts. Whether it actually breaks depends on additional text properties and locale-specific rules.
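
As a rough illustration of the examples above, here is a minimal sketch of a "may the line break right after this slash?" check. It is a deliberately simplified model, not the SixLabors.Fonts implementation: the class and method names are hypothetical, `char.IsDigit` stands in for the NU line breaking class, and rules for other classes (Hebrew letters, combining marks and so on) are ignored.

```csharp
using System;

// Hypothetical helper: a simplified model of how UAX #14 treats U+002F.
public static class SlashBreakSketch
{
    // Returns true if a break is permitted immediately after the slash at slashIndex.
    public static bool CanBreakAfterSlash(string text, int slashIndex)
    {
        if (slashIndex < 0 || slashIndex >= text.Length || text[slashIndex] != '/')
        {
            throw new ArgumentException("slashIndex must point at a '/' character.", nameof(slashIndex));
        }

        // Nothing follows the slash, so there is no position to break at.
        if (slashIndex == text.Length - 1)
        {
            return false;
        }

        char next = text[slashIndex + 1];

        // Numeric context (SY × NU): keep "123/456" together.
        if (char.IsDigit(next))
        {
            return false;
        }

        // No break before another slash (× SY), e.g. the "//" in a URL.
        if (next == '/')
        {
            return false;
        }

        // No break before a space; the opportunity comes after the space instead.
        if (char.IsWhiteSpace(next))
        {
            return false;
        }

        // Everything else falls through to the default: a break is allowed.
        return true;
    }
}
```

With this sketch, `CanBreakAfterSlash("hello/world", 5)` returns `true` and `CanBreakAfterSlash("123/456", 3)` returns `false`, matching the first two examples listed above.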

So.... I'm actually doing the correct thing according to the specification, however....

According to the specification there's a recommendation for Slash due to the prevalence of URLs in text.
https://www.unicode.org/reports/tr14/#SY

URLs are now so common in regular plain text that they need to be taken into account when assigning general-purpose line breaking properties. Slash (solidus) is allowed as an additional, limited break opportunity to improve layout of Web addresses. As a side effect, some common abbreviations such as “w/o” or “A/S”, which normally would not be broken, acquire a line break opportunity. The recommendation in this case is for the layout system not to utilize a line break opportunity allowed by SY unless the distance between it and the next line break opportunity exceeds an implementation-defined minimal distance.

That's why other implementations are rendering differently. (GDI, GDI+, WPF and Word will all be using the same implementation underneath.)

Do I change it.... probably but it's low down on my priority list just now.
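
For reference, the "minimal distance" recommendation quoted from the specification could be sketched roughly as follows. This is only an illustration under stated assumptions: the `BreakOpportunity` type, the `FilterSlashBreaks` method and the idea of measuring distance in characters are all hypothetical and are not part of SixLabors.Fonts (a real layout engine would more likely measure distance in glyph advances).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the UAX #14 recommendation for SY break opportunities.
public static class SlashDistanceSketch
{
    // Position is the index of the first character of the would-be next line.
    // FromSlash marks opportunities that exist only because of a preceding '/'.
    public readonly record struct BreakOpportunity(int Position, bool FromSlash);

    // Drops slash opportunities unless the distance to the next opportunity
    // (or to the end of the text) exceeds minDistance.
    public static List<BreakOpportunity> FilterSlashBreaks(
        IReadOnlyList<BreakOpportunity> opportunities,
        int textLength,
        int minDistance)
    {
        var kept = new List<BreakOpportunity>();
        for (int i = 0; i < opportunities.Count; i++)
        {
            BreakOpportunity current = opportunities[i];

            // The next opportunity in the original list, or the end of the text.
            int next = i + 1 < opportunities.Count
                ? opportunities[i + 1].Position
                : textLength;

            if (current.FromSlash && next - current.Position <= minDistance)
            {
                // Too close to the next opportunity: ignore the slash break.
                continue;
            }

            kept.Add(current);
        }

        return kept;
    }
}
```

For "aaaaa bbbbb/ccccc ddddd" (23 characters) the opportunities sit at positions 6, 12 (after the slash) and 18. With a `minDistance` of 8, the slash opportunity at 12 is only 6 characters from the next one, so it is dropped and "bbbbb/ccccc" wraps as a whole. With a `minDistance` of 4 it is kept.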

@JimBobSquarePants
Member

@IshmaZX82 The fix might be as simple as the following, but I won't have time to test it just now. If you get the opportunity to try it, please let me know how it goes.

public bool TrySplitAt(LineBreak lineBreak, bool keepAll, [NotNullWhen(true)] out TextLine? result)
{
    int index = this.data.Count;
    GlyphLayoutData glyphData = default;
    while (index > 0)
    {
        glyphData = this.data[--index];

        // URLs are now so common in regular plain text that they need to be taken into account when
        // assigning general-purpose line breaking properties.
        // Chrome and others appear to simply ignore the slash character.
        if (glyphData.CodePointIndex == lineBreak.PositionWrap && glyphData.CodePoint.Value != 0x002F)
        {
            break;
        }
    }

    // Word breaks should not be used for Chinese/Japanese/Korean (CJK) text
    // when word-breaking mode is keep-all.
    if (index > 0
        && !lineBreak.Required
        && keepAll
        && UnicodeUtility.IsCJKCodePoint((uint)glyphData.CodePoint.Value))
    {
        // Loop through previous glyphs to see if there is
        // a non CJK codepoint we can break at.
        while (index > 0)
        {
            glyphData = this.data[--index];
            if (!UnicodeUtility.IsCJKCodePoint((uint)glyphData.CodePoint.Value))
            {
                index++;
                break;
            }
        }
    }

    if (index == 0)
    {
        result = null;
        return false;
    }

    // Create a new line ensuring we capture the initial metrics.
    int count = this.data.Count - index;
    result = new(count);
    result.data.AddRange(this.data.GetRange(index, count));
    RecalculateLineMetrics(result);

    // Remove those items from this line.
    this.data.RemoveRange(index, count);
    RecalculateLineMetrics(this);

    return true;
}
