Track recent changes to Unicode support in upstream spec
swolchok committed Sep 28, 2015
1 parent c35f572 commit fcfb9bf
Showing 4 changed files with 547 additions and 308 deletions.

4 comments on commit fcfb9bf

@leebyron
Contributor

This looks great.

Will comments consume multi-byte characters correctly as well? For example, what happens if a U+0A0A character (http://www.fileformat.info/info/unicode/char/0A0A/index.htm) occurs within a comment?

@leebyron
Contributor

Added the same test to graphql-js just now: graphql/graphql-js@da5c4b0

@swolchok
Contributor Author

> Will comments consume multi-byte characters correctly as well? For example, what happens if a U+0A0A character (http://www.fileformat.info/info/unicode/char/0A0A/index.htm) occurs within a comment?

The UTF-8 encoding of \u0a0a is the 3-byte sequence e0 a8 8a, so there's no problem there. More generally, the byte 0a can never show up inside a multi-byte UTF-8 sequence, because UTF-8 is designed not to overlap with ASCII: every byte in the encoding of a codepoint above 7F has the high bit set!
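
For anyone who wants to check the byte-level claim, here is a minimal Python sketch. It is not taken from this commit or from the parser; the helper names `contains_ascii_newline` and `high_bit_set` are made up for illustration. It encodes U+0A0A and confirms that no byte of the result equals the ASCII line feed 0a.

```python
def contains_ascii_newline(data: bytes) -> bool:
    """Return True if the raw byte 0x0A (ASCII line feed) appears in data."""
    return 0x0A in data


def high_bit_set(code_point: int) -> bool:
    """Return True if every byte of the UTF-8 encoding has its high bit set."""
    return all(byte & 0x80 for byte in chr(code_point).encode("utf-8"))


# U+0A0A, the character asked about in the thread.
encoded = "\u0a0a".encode("utf-8")

# Print the bytes as space-separated hex.
print(" ".join(f"{byte:02x}" for byte in encoded))  # e0 a8 8a

# A byte-oriented scanner looking for 0x0A will not stop inside this sequence.
print(contains_ascii_newline(encoded))  # False

# Spot check a few code points above U+007F, including the largest one.
print(all(high_bit_set(cp) for cp in (0x80, 0x0A0A, 0xFFFF, 0x10FFFF)))  # True
```

This only spot-checks a handful of codepoints; the general guarantee comes from the UTF-8 design itself, where lead bytes start at c2 and continuation bytes fall in the range 80 to bf.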

@leebyron
Contributor

Ah right, I forgot this explicitly parses a UTF-8 stream :) Most excellent