-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix invalid urlpattern test(s) #49782
Conversation
There are no reviewers for this pull request. Please reach out on the chat room to get help with this. Thank you! |
cc @sisidovski |
Aside: are there no tests where port is assigned to something that is clearly not a port? |
d66cd3c
to
1e055be
Compare
@annevk I'm not sure. By the way, port is not the only wrong test here. As I progress with Ada's implementation, it seems |
1e055be
to
68ffecf
Compare
68ffecf
to
153c3e9
Compare
@annevk I've added a new commit that includes a test for a failing port field. |
See whatwg/urlpattern#239 (comment) - I am not entirely sure your analysis for port is correct. |
It seems that concern was resolved. @sisidovski and @jeremyroman can you please review this? |
}, | ||
{ | ||
"pattern": [{ "protocol": "http", "port": "100000" }], | ||
"inputs": [{ "protocol": "http", "port": "100000" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow what the intention of this test case is. Could you help me to understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely. It tests the maximum valid port number. If it is greater than 65k, it should throw (according to url spec)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This is for the step 2.1.2 in the port state in the basic-url-parser. That makes sense.
}, | ||
{ | ||
"pattern": [{ "protocol": "http", "port": "100000" }], | ||
"inputs": [{ "protocol": "http", "port": "100000" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This is for the step 2.1.2 in the port state in the basic-url-parser. That makes sense.
@@ -2419,15 +2436,24 @@ | |||
}, | |||
{ | |||
"pattern": [{ "hostname": "bad\nhostname" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are \n
, \r
, and \t
just stripped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is defined in URL spec. All ascii tab or newline are removed from the input before any parsing is done.
Remove all ASCII tab or newline from input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, bad\nhostname
is passed as input to basic url parser, and those are removed in the step 3.
@@ -2367,15 +2375,24 @@ | |||
}, | |||
{ | |||
"pattern": [{ "hostname": "bad#hostname" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify why only #%/
don't throw errors and other hostname patterns like "bad\:hostname" or "bad>hostname" continue throw errors?
btw I noticed "bad\:hostname" updates hostname in URL.
u = new URL('http:/dummy.site')
u.hostname = "bad\\:hostname"
u.hostname // bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Host parser (specifically domain to ASCII with domain and false) strip all trailing values whenever it sees # https://url.spec.whatwg.org/#concept-host-parser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host parser does not do that. Do you mean the host
setter or some such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host parser does not do that. Do you mean the
host
setter or some such?
Yes, I meant the host setter, which calls the host parser.
Could you clarify why only #%/ don't throw errors and other hostname patterns like "bad:hostname" or "bad>hostname" continue throw errors?
@sisidovski Sorry I missed this comment. URL spec hostname state step 3 (ref: https://url.spec.whatwg.org/#hostname-state) says that...
if c is the EOF code point, U+002F (/), U+003F (?), or U+0023 (#) ... then decrease pointer by 1, and then: ... Let host be the result of host parsing buffer with url is not special.
So, if you see these characters, we take the existing non-empty buffer and parse it as a valid hostname.
@annevk @sisidovski would you mind taking one last look at this? I've tried to address your questions and i think this is ready to land. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks!
@@ -2419,15 +2436,24 @@ | |||
}, | |||
{ | |||
"pattern": [{ "hostname": "bad\nhostname" }], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, bad\nhostname
is passed as input to basic url parser, and those are removed in the step 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll rubber-stamp so this can land. @sisidovski you should ask on the web-platform-tests Matrix chat for contributor access.
Fixes whatwg/urlpattern#239
Port calls canonicalizePort which just calls url parser setter, which removes the leading trailing c0 whitespace characters from the input.
cc @annevk @lucacasonato @sisidovski