Support BiDirectional text #1665

Open
XVilka opened this issue Jul 31, 2018 · 8 comments

Comments

@Maximus5
Owner

I know about the problem, but since I don't speak any RTL language, it's hard for me to read the text and judge the proper layout and direction.

It would help a lot if you could create a reliable test, along with screenshots of the expected and the wrong output.
For example, a simple "hello world" in any programming language.

Also, it's not clear to me what the expected behavior is for English text containing RTL parts.
It would help if you could point to a program that can be used as a reference.
I saw Konsole in your gist, but it's rather hard to test and compare against.

@XVilka
Author

XVilka commented Jul 31, 2018

OK, will do.

@karliss

karliss commented Aug 7, 2018

It seems that, the way Unicode describes bidirectional text, printing characters in the correct order requires knowing the full text (or chunk of text) in advance. That is not really practical for a terminal: printing characters incrementally can change the positions of previously printed characters.
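
To make the last point concrete, here is a minimal sketch. It assumes a left-to-right line in which a single right-to-left run starts at column 0; the helper `visual_columns` is hypothetical and is not taken from any terminal's code.

```python
def visual_columns(run_length: int) -> list[int]:
    """Screen column of each logical character of a right-to-left run.

    Assumption: the run is drawn inside a left-to-right line and occupies
    columns 0 .. run_length - 1, so logical character k lands in column
    run_length - 1 - k.
    """
    return [run_length - 1 - k for k in range(run_length)]


word = "\u05e9\u05dc\u05d5\u05dd"  # four Hebrew letters, written one at a time

# Column of the first letter after each additional letter is printed.
first_letter_columns = [visual_columns(n)[0] for n in range(1, len(word) + 1)]
print(first_letter_columns)  # [0, 1, 2, 3]
```

Under this assumption, every newly written letter pushes all earlier letters of the run one column to the right, which a cell-by-cell terminal model does not expect.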

Judging by https://www.arabeyes.org/ArabeyesTodo#Terminal_Emulators, no one really knows how bidirectional text should interact with terminal control functions such as moving or reading the cursor position, clearing to the end of the line, and others. What happens when you replace a character in the middle of previously printed text?

@Maximus5
Owner

Maximus5 commented Aug 7, 2018

Exactly.
Well, the terminal may treat each line of output as a text chunk (which may be incorrect in editors like Vim) and keep a bidirectional map between each character's physical index in the line and its position on screen. But I don't know how home/end/clear must behave.
Also, there may be a problem with formatting in mc, Far Manager, ls/dir, etc.
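
A per-line map of this kind could look roughly like the sketch below. It is a heavy simplification and not the Unicode Bidirectional Algorithm (UAX #9): it assumes a left-to-right base direction and only reverses maximal runs of strong RTL characters, so spaces, digits, and punctuation are left in place. The names `visual_order` and `build_bimap` are hypothetical.

```python
import unicodedata

# Strong right-to-left bidi classes: Hebrew-like (R) and Arabic-like (AL).
RTL_CLASSES = {"R", "AL"}


def visual_order(line: str) -> list[int]:
    """Return a list where entry `col` is the logical index shown in column `col`.

    Simplification (assumption): only maximal runs of strong RTL characters
    are reversed; everything else keeps its logical position.
    """
    order: list[int] = []
    i, n = 0, len(line)
    while i < n:
        if unicodedata.bidirectional(line[i]) in RTL_CLASSES:
            j = i
            while j < n and unicodedata.bidirectional(line[j]) in RTL_CLASSES:
                j += 1
            order.extend(range(j - 1, i - 1, -1))  # reversed RTL run
            i = j
        else:
            order.append(i)
            i += 1
    return order


def build_bimap(line: str) -> tuple[list[int], list[int]]:
    """Build both directions of the map: logical -> visual and visual -> logical."""
    visual_to_logical = visual_order(line)
    logical_to_visual = [0] * len(line)
    for col, idx in enumerate(visual_to_logical):
        logical_to_visual[idx] = col
    return logical_to_visual, visual_to_logical


line = "ab \u05d0\u05d1\u05d2 cd"  # Latin, three Hebrew letters, Latin
logical_to_visual, visual_to_logical = build_bimap(line)
print(logical_to_visual)  # [0, 1, 2, 5, 4, 3, 6, 7, 8]
```

With both directions available, a cursor move to a screen column can be translated to a logical index and back. How home/end/clear should use the map remains the open question raised above.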

@karliss

karliss commented Sep 1, 2018

It turns out the ECMA-48 standard has a little information about how bidirectional text should interact with terminal control sequences: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf
I haven't read it carefully, so I can't comment on whether it's usable or implementable, or on how well it matches the bidirectional text logic that was added to Unicode after it was written.

@faridcher

faridcher commented Oct 28, 2018

BiCon (the bidirectional console) might be relevant here:
https://github.com/behdad/bicon
Konsole and mlterm are among the terminals that support BiDirectional languages.
http://mlterm.sourceforge.net

@XVilka
Author

XVilka commented Jul 2, 2019

Just an update: a new console BiDi specification was recently implemented in libvte by @egmontkob: https://terminal-wg.pages.freedesktop.org/bidi/implementations.html#vte

See also the corresponding issue for the new Windows Terminal: microsoft/terminal#538

@ilius

ilius commented Feb 4, 2023

Thank you for this project. It's especially nice that it can properly render Arabic words.

(When I say Arabic, this also applies to Persian, Urdu, and other languages that use the Arabic script.)

If one or more Arabic words or phrases have a different color, or use any ANSI formatting, the order of the words in that paragraph gets messed up.
For example, if the logical order is [phrase1] [phrase2] [phrase3] and all phrases are Arabic, they are shown as [phrase3] [phrase2] [phrase1], which is correct because Arabic is RTL (right-to-left). But if phrase2 has a color or other formatting, they are shown as [phrase1] [phrase2] [phrase3], which is left-to-right. This gets much worse when the phrases contain multiple words, because each phrase (within the same formatting block) is still rendered RTL, so you can't even read the whole paragraph left-to-right (let alone normally), and it becomes unreadable.
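
The symptom can be modeled with a toy sketch. It assumes the cause is that reordering is applied to each formatting segment separately rather than to the whole paragraph, which is a guess and not something confirmed from ConEmu's code. Words are treated as opaque tokens, and both function names are hypothetical.

```python
def reorder_whole(segments: list[list[str]]) -> list[str]:
    """Expected result: the whole RTL paragraph is reversed as one unit."""
    words = [w for seg in segments for w in seg]
    return words[::-1]


def reorder_per_segment(segments: list[list[str]]) -> list[str]:
    """Assumed faulty behavior: each formatting segment is reversed on its
    own, while the segments themselves stay in left-to-right order."""
    return [w for seg in segments for w in seg[::-1]]


# Three phrases of two words each; the middle phrase carries a color.
segments = [["w1", "w2"], ["w3", "w4"], ["w5", "w6"]]

print(reorder_whole(segments))        # ['w6', 'w5', 'w4', 'w3', 'w2', 'w1']
print(reorder_per_segment(segments))  # ['w2', 'w1', 'w4', 'w3', 'w6', 'w5']
```

The second result can be read neither right-to-left nor left-to-right, which matches the multi-word case described above.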

I suggest disabling the existing RTL support whenever a paragraph contains any formatting, so that words are shown left to right, just like in Git Bash/Mintty or Windows cmd.exe. That would make it much more bearable.

You can use Python to reproduce this (and compare it with Git Bash):

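>>> # SGR sequence: 256-color foreground, palette index 1 (red)
>>> red = "\x1b[38;5;1m"
>>> # SGR sequence: reset all attributes
>>> reset = "\x1b[0m"

>>> # Three numbered Persian test words
>>> words = [f"کلمه{i+1}" for i in range(3)]

>>> # Plain output, no formatting
>>> print(" ".join(words))

>>> # Same words, with only the middle one colored
>>> print(words[0] + " " + red + words[1] + reset + " " + words[2])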

You can see both ConEmu and Git Bash (Mintty) here:

(screenshot: ConEmu-arabic-bug)
