-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add identifier syntax #296
base: persisted-documents
Are you sure you want to change the base?
Add identifier syntax #296
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps only use quotes around strings and characters?
Co-authored-by: Shane Krueger <[email protected]>
It's interesting you should raise this now; I was working on Friday on a reworking of the persisted documents spec that relates to using the I present this only for comparison, in case it's helpful. |
|
||
IdentifierCharacter :: SourceCharacter but not `:` | ||
|
||
SourceCharacter :: Any Unicode scalar value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incredibly broad; I know that's effectively what we have currently but I feel like we should lock it down a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we could allow all Unicode characters, the main spec defines a specific set of characters for use in that document. Something similar could be done here.
SourceCharacter
U+0009
U+000A
U+000D
U+0020–U+FFFF
On a side note, the spec is defining these characters as UTF-16 codes, not Unicode code points.
I’m fine with allowing any Unicode character in this context, or any non control character would be fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had only alphanumericals initially and then backtracked to "any unicode" because "why not". But I think your point about having the identifier in url path is the important one. I'll move to alphanumerical + a bunch of unreserved chars like you did here. Questions:
- do we even need
~
,.
,_
,-
? - do we need uppercase letters? This might cause issues on case-insensitive filesystems (hello macOS 👋 )
Edit: updated in 5d48e0e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly recommend a wide range of characters. URLs can escape any Unicode character so there need be no limit on what characters can be used for document ids. Also, url paths are by definition case sensitive; but they may be treated with no case sensitivity. For instance, if the document id was a guid it would need dashes. If it was base64 it would need lowercase and uppercase and a couple extra characters. If it was readable names, it would need to support Unicode for foreign languages. And a custom implementation may decide to treat the document id with no case sensitivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the document id was a guid it would need dashes
Excellent point.
If it was base64 it would need lowercase and uppercase and a couple extra characters
Another excellent point. +
, /
and =
would be needed too. Which would make them a bit awkard to use as url path segments... Interesting!
If it was readable names, it would need to support Unicode for foreign languages
I think I disagree on this one. I don't really see anyone using readable names as identifiers? Sounds like a footgun to me that if we can, we should discourage. Just like the relay cursors are encouraged to be opaque, I would encourage the document ids to be similarly opaque.
Tangential: I have a GraphQL-over-HTTP meeting in my calendar on Thursday. Is that still happening? Want to discuss over there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch @Shane32; :
sits in that grey area where you don't strictly need to escape it, but it's best practice to do so. We should change that to a URL-safe character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the !
suggestion best I think.
I think we can change the spec text to that without making existing :
usage non-compliant - it just becomes a custom ID rather than prefixed ID.
Looks like !
gets encoded to %21
via new URLSearchParams({"documentId":"sha256!abcdef"}).toString()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*
seems to work with both URLSearchParams
and encodeURIComponent
, and I don't think it's normally used in encodings so should be safe... I'll need to look up in the relevant specs to see if it's not meant to be used for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spec for URLSearchParams:
The application/x-www-form-urlencoded percent-encode set contains all code points, except the ASCII alphanumeric, U+002A (*), U+002D (-), U+002E (.), and U+005F (_).
So ~
is probably encoded also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, yeah. Actually I remember I deliberately chose a character outside of unreserved
in RFC3986 for the separator, so I'm okay with :
being URL encoded. But *
might be a better choice to avoid it.
GitHub renders the URLs okay:
- blah blah https://example.com/graphql/p/sha256:abc123 blah blah
- blah blah https://example.com/graphql/p/sha256*abc123 blah blah
- move `x-` and `sha256` to validation - allow `:` in prefixed identifier payloads - add link to RFC3986
See #264 (comment)