Fix tagged template literal with unicode #15047

Draft: wants to merge 3 commits into base: main

pfgithub
Contributor

@pfgithub pfgithub commented Nov 8, 2024

What does this PR do?

Fixes #8745

console.log(String.raw`æ™`); // -> "æ™" instead of "\u00E6\u2122"

Renames ascii_only to prefers_ascii. The printer still emits ascii wherever it can, but when it encounters a non-ascii character inside a tagged template it emits that character as-is instead of an escape sequence, because escaping would change the template's raw string. Execution now scans the file for non-ascii characters and, if it finds any, loads the file as utf-8 instead.
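
To illustrate why an escape is not a safe substitute inside a tagged template, here is a minimal sketch in plain JavaScript. The `inspect` tag is a hypothetical helper, not part of this PR; it only exposes the cooked and raw views of the first template chunk.

```javascript
// Hypothetical tag: returns the cooked and raw views of the first chunk.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

// The same two characters, written literally and written as escapes.
const literal = inspect`æ™`;
const escaped = inspect`\u00E6\u2122`;

// The cooked values are identical, so escaping is harmless in ordinary strings.
console.log(literal.cooked === escaped.cooked); // true

// The raw values differ: the escaped form keeps the backslash sequences as text.
console.log(literal.raw.length); // 2
console.log(escaped.raw.length); // 12
```

A printer that rewrites the literal form into the escaped form therefore changes what `String.raw` and any other tag observes, which is the behavior reported in #8745.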

This should be benchmarked to see what the performance cost is.
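
The scan whose cost needs measuring can be sketched roughly as follows. This is only an illustration in JavaScript, not the code in this PR: `containsNonAscii` and `pickSourceEncoding` are hypothetical names, and the "ascii" fallback label is an assumption about what the non-utf-8 path looks like.

```javascript
// Hypothetical helper: true when any byte falls outside the ascii range 0x00-0x7F.
function containsNonAscii(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) return true;
  }
  return false;
}

// Hypothetical decision: only pay for utf-8 decoding when the file needs it.
function pickSourceEncoding(bytes) {
  return containsNonAscii(bytes) ? "utf-8" : "ascii";
}

const encoder = new TextEncoder();
const plain = pickSourceEncoding(encoder.encode("console.log(1)"));
// The \u escapes below are cooked by this string literal, so the bytes are non-ascii.
const unicode = pickSourceEncoding(encoder.encode("String.raw`\u00E6\u2122`"));

console.log(plain, unicode);
```

The scan is a single linear pass that stops at the first non-ascii byte, so the worst case is a fully ascii file, where every byte is inspected.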

TODO:

  • Store the file as utf-16 in the runtime transpiler cache (caches large transpiled files)
  • Store the file as utf-16 in the standalone module graph (for bun --compile)
  • Figure out what to do about a comment containing non-ascii characters. Should it mark the whole file as non-ascii, should the non-ascii characters be removed from the output, or should the bytes be passed to JavaScriptCore as latin-1 if they only appear in a comment? A non-ascii comment is probably much more likely than a non-ascii character within a tagged template literal.
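
For the latin-1 option in the last item, the following sketch shows what reading utf-8 bytes as latin-1 does to a character. It is an illustration in plain JavaScript under the assumption that each byte maps to the code point of the same value; `readAsLatin1` is a hypothetical helper and says nothing about how JavaScriptCore is actually invoked.

```javascript
// Hypothetical helper: interpret each byte as the code point of the same value.
function readAsLatin1(bytes) {
  let out = "";
  for (const byte of bytes) out += String.fromCharCode(byte);
  return out;
}

// "é" (U+00E9) is two bytes in utf-8: 0xC3 0xA9.
const utf8Bytes = new TextEncoder().encode("\u00E9");
const asLatin1 = readAsLatin1(utf8Bytes);

console.log(utf8Bytes.length); // 2
console.log(asLatin1.length);  // 2
console.log(asLatin1 === "\u00C3\u00A9"); // true
```

The text comes out garbled, but no byte is lost and none of the resulting characters is a line terminator, which suggests the garbling would stay inside the comment. The same misreading inside a string or template literal would change program behavior, so this option only makes sense when comments are the sole source of non-ascii bytes.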

@robobun

robobun commented Nov 8, 2024

@pfgithub pfgithub changed the title from "Fix tagged template literal" to "Fix tagged template literal with unicode" Nov 8, 2024
@pfgithub pfgithub marked this pull request as ready for review November 8, 2024 20:14
@pfgithub pfgithub marked this pull request as draft November 9, 2024 02:08
@pfgithub pfgithub force-pushed the pfg/fix-tagged-template-literal branch from ea67f97 to 66eae13 Compare November 12, 2024 00:03
Development

Successfully merging this pull request may close these issues.

raw tagged template literals show escapes for non ascii text
2 participants