Specify URL equivalence modulo search variance
Closes #201.
domenic authored Dec 6, 2022
1 parent 05228fe commit 89fd061
Showing 1 changed file with 129 additions and 4 deletions.
133 changes: 129 additions & 4 deletions no-vary-search.bs
@@ -23,6 +23,24 @@ spec: RFC8941; urlPrefix: https://www.rfc-editor.org/rfc/rfc8941.html
text: boolean; url: name-boolean
text: inner list; url: name-inner-lists
</pre>
<style>
#example-equivalence-canonicalization table {
border-collapse: collapse;
}

#example-equivalence-canonicalization table :is(td, th):first-of-type {
border-right: 1px solid black;
padding-right: 20px;
}

#example-equivalence-canonicalization table :is(td, th):nth-of-type(2) {
padding-left: 5px;
}

#example-equivalence-canonicalization table tr.group {
border-top: 1px solid black;
}
</style>

<h2 id="status-and-venue">Status and venue note</h2>

@@ -102,8 +120,9 @@ The [=obtain a URL search variance=] algorithm ensures that all [=URL search var

<table class="data">
<thead>
<th>Input</th>
<th>Result</th>
<tr>
<th>Input</th>
<th>Result</th>
<tbody>
<tr>
<td><pre highlight="http">No-Vary-Search: params</pre>
@@ -145,8 +164,9 @@ The [=obtain a URL search variance=] algorithm ensures that all [=URL search var

<table>
<thead>
<th>Input
<th>Conventional form
<tr>
<th>Input
<th>Conventional form
<tbody>
<tr>
<td><pre highlight="http">No-Vary-Search: params=?1</pre>
@@ -168,3 +188,108 @@ The [=obtain a URL search variance=] algorithm ensures that all [=URL search var
<td>(omit the header)
</table>
</div>

<h2 id="comparing">Comparing</h2>

Two [=URLs=] |urlA| and |urlB| are <dfn export>equivalent modulo search variance</dfn> given a [=URL search variance=] |searchVariance| if the following algorithm returns true:

1. If the [=url/scheme=], [=url/username=], [=url/password=], [=url/host=], [=url/port=], or [=url/path=] of |urlA| and |urlB| differ, then return false.

1. If |searchVariance| is equivalent to the [=default URL search variance=], then:

    1. If |urlA|'s [=url/query=] equals |urlB|'s [=url/query=], then return true.

    1. Return false.

    <p class="note">In this case, even [=URL=] pairs that might appear the same after running the [=urlencoded parser|application/x-www-form-urlencoded parser=] on their [=url/queries=], such as `https://example.com/a` and `https://example.com/a?`, or `https://example.com/foo?a=b&&&c` and `https://example.com/foo?a=b&c=`, will be treated as inequivalent.

1. Let |searchParamsA| and |searchParamsB| be empty [=lists=].

1. If |urlA|'s [=url/query=] is not null, then set |searchParamsA| to the result of running the [=urlencoded parser|application/x-www-form-urlencoded parser=] given the [=isomorphic encoding=] of |urlA|'s [=url/query=].

1. If |urlB|'s [=url/query=] is not null, then set |searchParamsB| to the result of running the [=urlencoded parser|application/x-www-form-urlencoded parser=] given the [=isomorphic encoding=] of |urlB|'s [=url/query=].

1. If |searchVariance|'s [=URL search variance/no-vary params=] is a [=list=], then:

    1. Set |searchParamsA| to a [=list=] containing those [=list/items=] |pair| in |searchParamsA| where |searchVariance|'s [=URL search variance/no-vary params=] does not [=list/contain=] |pair|[0].

    1. Set |searchParamsB| to a [=list=] containing those [=list/items=] |pair| in |searchParamsB| where |searchVariance|'s [=URL search variance/no-vary params=] does not [=list/contain=] |pair|[0].

1. Otherwise, if |searchVariance|'s [=URL search variance/vary params=] is a [=list=], then:

    1. Set |searchParamsA| to a [=list=] containing those [=list/items=] |pair| in |searchParamsA| where |searchVariance|'s [=URL search variance/vary params=] [=list/contains=] |pair|[0].

    1. Set |searchParamsB| to a [=list=] containing those [=list/items=] |pair| in |searchParamsB| where |searchVariance|'s [=URL search variance/vary params=] [=list/contains=] |pair|[0].

1. If |searchVariance|'s [=URL search variance/vary on key order=] is false, then:

    1. Let |keyLessThan| be an algorithm taking as inputs two pairs (|keyA|, <var ignore>valueA</var>) and (|keyB|, <var ignore>valueB</var>), which returns whether |keyA| is [=code unit less than=] |keyB|.

    1. Set |searchParamsA| to the result of [=list/sorting in ascending order=] |searchParamsA|, with |keyLessThan|.

    1. Set |searchParamsB| to the result of [=list/sorting in ascending order=] |searchParamsB|, with |keyLessThan|.

1. If |searchParamsA|'s [=list/size=] is not equal to |searchParamsB|'s [=list/size=], then return false.

1. Let |i| be 0.

1. [=iteration/While=] |i| &lt; |searchParamsA|'s [=list/size=]:

    1. If |searchParamsA|[|i|][0] does not equal |searchParamsB|[|i|][0], then return false.

    1. If |searchParamsA|[|i|][1] does not equal |searchParamsB|[|i|][1], then return false.

    1. Set |i| to |i| + 1.

1. Return true.
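
The steps above can be sketched in Python as follows. This is a non-normative illustration under stated assumptions, not part of the algorithm. `SearchVariance`, `split_query`, and the other names are hypothetical, and `None` stands in for the wildcard value. Both inputs are assumed to be already-serialized URLs, so comparing everything before the `?` only approximates the first step. `urllib.parse.parse_qsl` only approximates the [=urlencoded parser|application/x-www-form-urlencoded parser=].

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
from urllib.parse import parse_qsl


@dataclass(frozen=True)
class SearchVariance:
    """Hypothetical model of a URL search variance; None means "wildcard"."""
    no_vary_params: Optional[Tuple[str, ...]] = ()
    vary_params: Optional[Tuple[str, ...]] = None
    vary_on_key_order: bool = True


# Assumption: the default variance has an empty no-vary list, wildcard vary
# params, and varies on key order.
DEFAULT_VARIANCE = SearchVariance()


def split_query(url: str) -> Tuple[str, Optional[str]]:
    """Return (text before the query, query or None). The fragment is dropped."""
    without_fragment = url.split("#", 1)[0]
    before, sep, query = without_fragment.partition("?")
    return before, (query if sep else None)


def parse_query(query: Optional[str]) -> List[Tuple[str, str]]:
    # A null query yields an empty list, like an empty-string query.
    if query is None:
        return []
    return parse_qsl(query, keep_blank_values=True, errors="replace")


def code_unit_key(pair: Tuple[str, str]) -> bytes:
    # Big-endian UTF-16 bytes sort in the same order as UTF-16 code units.
    return pair[0].encode("utf-16-be", "surrogatepass")


def equivalent_modulo_search_variance(
    url_a: str, url_b: str, variance: SearchVariance
) -> bool:
    before_a, query_a = split_query(url_a)
    before_b, query_b = split_query(url_b)

    # Simplified stand-in for comparing scheme, username, password, host,
    # port, and path component by component.
    if before_a != before_b:
        return False

    # Default variance: queries must match exactly, with no parsing.
    if variance == DEFAULT_VARIANCE:
        return query_a == query_b

    params_a = parse_query(query_a)
    params_b = parse_query(query_b)

    if variance.no_vary_params is not None:
        drop = variance.no_vary_params
        params_a = [p for p in params_a if p[0] not in drop]
        params_b = [p for p in params_b if p[0] not in drop]
    elif variance.vary_params is not None:
        keep = variance.vary_params
        params_a = [p for p in params_a if p[0] in keep]
        params_b = [p for p in params_b if p[0] in keep]

    if not variance.vary_on_key_order:
        # sorted() is stable, so pairs with equal keys keep their order.
        params_a = sorted(params_a, key=code_unit_key)
        params_b = sorted(params_b, key=code_unit_key)

    # List equality covers the size check and the pairwise comparison loop.
    return params_a == params_b
```

For instance, with `SearchVariance(vary_on_key_order=False)` the sketch treats `https://example.com/?a=1&b=2` and `https://example.com/?b=2&a=1` as equivalent, while under `DEFAULT_VARIANCE` it does not.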

<div class="example" id="example-equivalence-canonicalization">
Due to how the [=urlencoded parser|application/x-www-form-urlencoded parser=] canonicalizes query strings, some query strings that do not appear obviously equivalent end up being treated as equivalent after parsing.

For example, given any non-default value for `No-Vary-Search`, such as `No-Vary-Search: key-order`, the following equivalences hold:

<table>
<thead>
<tr>
<th>Equivalent URLs
<th>Explanation
<tbody>
<tr class="group">
<td>`https://example.com/`
<td rowspan=2>A null [=url/query=] is parsed the same as an empty string query
<tr>
<td>`https://example.com/?`
<tr class="group">
<td>`https://example.com/?a=x`
<td rowspan=2>Parsing performs percent-decoding
<tr>
<td>`https://example.com/?%61=%78`
<tr class="group">
<td>`https://example.com/?a=é`
<td rowspan=2>Parsing performs percent-decoding, then UTF-8 decoding
<tr>
<td>`https://example.com/?a=%C3%A9`
<tr class="group">
<td>`https://example.com/?a=%f6`
<td rowspan=2>Both values are parsed as U+FFFD (�)
<tr>
<td>`https://example.com/?a=%ef%bf%bd`
<tr class="group">
<td>`https://example.com/?a=x&&&&`
<td rowspan=2>Parsing splits on `&` and discards empty strings
<tr>
<td>`https://example.com/?a=x`
<tr class="group">
<td>`https://example.com/?a=`
<td rowspan=2>Both parse as having an empty string value for `a`
<tr>
<td>`https://example.com/?a`
<tr class="group">
<td>`https://example.com/?a=%20`
<td rowspan=3>`+` and `%20` are both parsed as U+0020 SPACE
<tr>
<td>`https://example.com/?a=+`
<tr>
<td>`https://example.com/?a= &`
</table>
</div>
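
The parsing-related rows of the table can be checked with a short Python sketch. It assumes `urllib.parse.parse_qsl` as a stand-in for the [=urlencoded parser|application/x-www-form-urlencoded parser=]. That function is a different implementation that appears to agree with the parser on these inputs. The helper name `parse` is hypothetical. The first row (null versus empty query) is not covered, since it concerns how the URL's query is obtained rather than how it is parsed.

```python
from urllib.parse import parse_qsl


def parse(query: str):
    """Parse a query string into (name, value) pairs, keeping empty values."""
    return parse_qsl(query, keep_blank_values=True, errors="replace")


# Each group lists query strings that the table describes as equivalent.
groups = [
    ["a=x", "%61=%78"],            # percent-decoding
    ["a=é", "a=%C3%A9"],           # percent-decoding, then UTF-8 decoding
    ["a=%f6", "a=%ef%bf%bd"],      # invalid UTF-8 becomes U+FFFD
    ["a=x&&&&", "a=x"],            # empty strings between "&" are discarded
    ["a=", "a"],                   # a missing "=" gives an empty value
    ["a=%20", "a=+", "a= &"],      # "+" and "%20" both become a space
]

for group in groups:
    parsed = [parse(query) for query in group]
    # Every member of a group should parse to the same list of pairs.
    assert all(result == parsed[0] for result in parsed), group
    print(group, "->", parsed[0])
```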
