From 5c8e92d3fed54311421b32a1a3ef7ae7bc75ff9c Mon Sep 17 00:00:00 2001 From: Lukasz Anforowicz Date: Mon, 22 Oct 2018 15:11:48 -0700 Subject: [PATCH 1/3] CORB confirmation sniffing for HTML, XML and JSON security prefix. --- mimesniff.bs | 558 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 557 insertions(+), 1 deletion(-) diff --git a/mimesniff.bs b/mimesniff.bs index 5c773c9..43f74d9 100644 --- a/mimesniff.bs +++ b/mimesniff.bs @@ -139,7 +139,7 @@ production. By definition it is a superset of the HTTP token code points.

A whitespace byte (abbreviated - 0xWS) is any one of the following + 0xWS) is any one of the following bytes: 0x09 (HT), 0x0A (LF), 0x0C (FF), 0x0D (CR), 0x20 (SP). @@ -2752,6 +2752,562 @@ type: +

Confirming the resource can be CORB-protected

+ +To confirm that the response can be CORB-protected, +user agents must use the following +CORB confirmation sniffing algorithm: + +
    +
  1. + If the no-sniff flag is set, + the CORB confirmation sniffing result is "protected". + + Abort these steps. + +
  2. + If the computed MIME type is an HTML MIME type, + the CORB confirmation sniffing result is "protected" if the + CORB confirmation sniffing for HTML + algorithm returns "confirmed HTML". + + Otherwise, the CORB confirmation sniffing result is "allowed". + + Abort these steps. + + + +
  3. + If the computed MIME type is an XML MIME type, + the CORB confirmation sniffing result is "protected" if the + CORB confirmation sniffing for XML + algorithm returns "confirmed XML". + + Otherwise, the CORB confirmation sniffing result is "allowed". + + Abort these steps. + +
  4. + Otherwise (if the no-sniff flag is not set and the computed MIME + type is not covered above) + the CORB confirmation sniffing result is "protected" if the + JSON security prefix sniffing + algorithm returns "JSON security prefix is present" + (and is "allowed" otherwise).
+ + +
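
The four steps above can be sketched as a small dispatcher. This is an illustrative sketch, not part of the patch: the function and parameter names are hypothetical, the MIME type is assumed to be already reduced to its essence string, and the three sub-algorithms are passed in as callables so that the block stands on its own.

```python
from typing import Callable

Sniffer = Callable[[bytes], str]


def is_html_mime_type(essence: str) -> bool:
    # An HTML MIME type is one whose essence is "text/html".
    return essence == "text/html"


def is_xml_mime_type(essence: str) -> bool:
    # An XML MIME type has a subtype ending in "+xml", or the essence
    # "text/xml" or "application/xml".
    return essence in ("text/xml", "application/xml") or essence.endswith("+xml")


def corb_confirmation_sniff(
    no_sniff: bool,
    essence: str,
    header: bytes,
    sniff_html: Sniffer,
    sniff_xml: Sniffer,
    sniff_json_prefix: Sniffer,
) -> str:
    """Return "protected" or "allowed" (hypothetical helper, steps 1 to 4)."""
    # Step 1: with the no-sniff flag set, no confirmation is needed.
    if no_sniff:
        return "protected"
    # Step 2: HTML MIME types must be confirmed by HTML sniffing.
    if is_html_mime_type(essence):
        return "protected" if sniff_html(header) == "confirmed HTML" else "allowed"
    # Step 3: XML MIME types must be confirmed by XML sniffing.
    if is_xml_mime_type(essence):
        return "protected" if sniff_xml(header) == "confirmed XML" else "allowed"
    # Step 4: everything else is protected only if a JSON security prefix is found.
    found = sniff_json_prefix(header) == "JSON security prefix is present"
    return "protected" if found else "allowed"
```

Passing the sub-algorithms as parameters keeps the dispatcher independent of how each sniffer is implemented.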

CORB confirmation sniffing for HTML

+ + +
    +
  1. Continue executing the following steps (and advancing past whitespace and + comments) for as long as possible: +
      +
    1. Advance past whitespace bytes. +

      This step intentionally ignores some characters that are + considered to be whitespace + by JavaScript, + but not + by HTML + (for example <NBSP> and/or <ZWNBSP>). These characters will be + dealt with in a later step and result in "possibly not HTML". + +

    2. Advance past a combined HTML+JavaScript comment. + If the next bytes are 3C 21 2D 2D (the "<!--" string), then: +
        +
      1. Find and advance past the sequence of 2D 2D 3E bytes (the "-->" string). +
      2. Find and advance past the first of the following byte sequences: +
          +
        • 0A (<LF> - new line) +
        • 0D (<CR> - carriage return) +
        • E2 80 A8 (UTF-8 encoding of <LS> - line separator) +
        • E2 80 A9 (UTF-8 encoding of <PS> - paragraph separator) +
        +

        The step above advances past characters that are between "-->" + and a JavaScript line terminator, + because such characters are considered to be JavaScript comments according to + the HTMLCloseComment rule. +

      +
    + +
  2. Attempt to match one of the HTML signature patterns in the table below: +
      +
    1. For each row of the table, let patternMatched be the result of the pattern + matching algorithm given the remaining bytes from the + resource's resource header, the value in the first + column of row, the value in the second column of + row, and the value in the third column of row. + +
    2. If patternMatched is true for any row, return "confirmed HTML". + Otherwise, return "possibly not HTML". +
    +
+ +
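
Step 1 of the algorithm above (skipping whitespace and combined HTML+JavaScript comments) can be sketched as follows. This is a hedged illustration: the function name is hypothetical, and the handling of an unterminated comment or a missing line terminator (consume the rest of the input) is an assumption, since the steps above do not cover those cases.

```python
WHITESPACE_BYTES = b"\t\n\x0c\r "  # HT, LF, FF, CR, SP
# LF, CR, and the UTF-8 encodings of U+2028 (LS) and U+2029 (PS).
JS_LINE_TERMINATORS = (b"\n", b"\r", b"\xe2\x80\xa8", b"\xe2\x80\xa9")


def skip_whitespace_and_comments(data: bytes) -> bytes:
    """Return the bytes left after skipping whitespace and comments."""
    pos = 0
    while True:
        start = pos
        # Advance past whitespace bytes (NBSP and ZWNBSP are not skipped).
        while pos < len(data) and data[pos] in WHITESPACE_BYTES:
            pos += 1
        # Advance past a combined HTML+JavaScript comment.
        if data.startswith(b"<!--", pos):
            # Assumption: the search for "-->" starts after the opening "<!--".
            close = data.find(b"-->", pos + 4)
            if close == -1:
                return b""  # assumption: unterminated comment consumes the rest
            pos = close + 3
            # Find the earliest JavaScript line terminator after "-->".
            best = None
            for terminator in JS_LINE_TERMINATORS:
                index = data.find(terminator, pos)
                if index != -1 and (best is None or index < best[0]):
                    best = (index, len(terminator))
            if best is None:
                return b""  # assumption: no terminator consumes the rest
            pos = best[0] + best[1]
        # Stop once a full pass makes no progress.
        if pos == start:
            return data[pos:]
```

The returned bytes are what the signature matching in step 2 would then receive.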

The table below is the text/html-specific subset + of the table used for + identifying a resource with an unknown MIME type + (excluding the pattern covering HTML comments which are dealt with separately). + + + + + + + + + + + + + + + + + + + + + +
+ Byte Pattern + + + Pattern Mask + + + Leading Bytes to Be Ignored + + + Note + + + +
+ 3C 21 44 4F 43 54 59 50 45 20 48 54 4D 4C TT + + + FF FF DF DF DF DF DF DF DF FF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<!DOCTYPE HTML" + followed by a tag-terminating byte. + + +
+ 3C 48 54 4D 4C TT + + + FF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<HTML" followed by a + tag-terminating byte. + + +
+ 3C 48 45 41 44 TT + + + FF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<HEAD" followed by a + tag-terminating byte. + + +
+ 3C 53 43 52 49 50 54 TT + + + FF DF DF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<SCRIPT" followed by + a tag-terminating byte. + + +
+ 3C 49 46 52 41 4D 45 TT + + + FF DF DF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<IFRAME" followed by + a tag-terminating byte. + + +
+ 3C 48 31 TT + + + FF DF FF FF + + + Whitespace bytes. + + + The case-insensitive string "<H1" followed by a + tag-terminating byte. + + +
+ 3C 44 49 56 TT + + + FF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<DIV" followed by a + tag-terminating byte. + + +
+ 3C 46 4F 4E 54 TT + + + FF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<FONT" followed by a + tag-terminating byte. + + +
+ 3C 54 41 42 4C 45 TT + + + FF DF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<TABLE" followed by + a tag-terminating byte. + + +
+ 3C 41 TT + + + FF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<A" followed by a + tag-terminating byte. + + +
+ 3C 53 54 59 4C 45 TT + + + FF DF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<STYLE" followed by + a tag-terminating byte. + + +
+ 3C 54 49 54 4C 45 TT + + + FF DF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<TITLE" followed by + a tag-terminating byte. + + +
+ 3C 42 TT + + + FF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<B" followed by a + tag-terminating byte. + + +
+ 3C 42 4F 44 59 TT + + + FF DF DF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<BODY" followed by a + tag-terminating byte. + + +
+ 3C 42 52 TT + + + FF DF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<BR" followed by a + tag-terminating byte. + + +
+ 3C 50 TT + + + FF DF FF + + + Whitespace bytes. + + + The case-insensitive string "<P" followed by a + tag-terminating byte. +
+ + +
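
The table above can be exercised with a masked comparison. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, TT is expanded to the two tag-terminating bytes 0x20 (SP) and 0x3E (">"), and the masks are derived from the tag names (0xDF for ASCII letters, 0xFF otherwise), which reproduces the masks listed in the table.

```python
WHITESPACE_BYTES = b"\t\n\x0c\r "  # HT, LF, FF, CR, SP
TAG_TERMINATING_BYTES = (0x20, 0x3E)  # SP and ">"

# Tag names from the table, in table order.
HTML_TAGS = [
    b"!DOCTYPE HTML", b"HTML", b"HEAD", b"SCRIPT", b"IFRAME", b"H1", b"DIV",
    b"FONT", b"TABLE", b"A", b"STYLE", b"TITLE", b"B", b"BODY", b"BR", b"P",
]


def pattern_matches(data: bytes, pattern: bytes, mask: bytes, ignored: bytes) -> bool:
    """Masked prefix comparison after skipping leading ignored bytes."""
    assert len(pattern) == len(mask)
    s = 0
    while s < len(data) and data[s] in ignored:
        s += 1
    if len(data) - s < len(pattern):
        return False
    return all((data[s + i] & mask[i]) == pattern[i] for i in range(len(pattern)))


def build_row(tag: bytes, terminator: int) -> tuple:
    """Build (pattern, mask) for "<" + tag followed by one terminating byte."""
    pattern = b"<" + tag
    # 0xDF clears the ASCII case bit, so letters compare case-insensitively.
    mask = bytes(0xDF if 0x41 <= b <= 0x5A else 0xFF for b in pattern)
    return pattern + bytes([terminator]), mask + b"\xff"


def sniff_html_signature(remaining: bytes) -> str:
    for tag in HTML_TAGS:
        for terminator in TAG_TERMINATING_BYTES:
            pattern, mask = build_row(tag, terminator)
            if pattern_matches(remaining, pattern, mask, WHITESPACE_BYTES):
                return "confirmed HTML"
    return "possibly not HTML"
```

For example, `<abbr>` does not match the `<A` row because the byte after `<A` is not a tag-terminating byte.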

CORB confirmation sniffing for XML

+ +
    +
  1. Let patternMatched be the result of the pattern matching algorithm + given the resource's resource header, the value in the first column of + row, the value in the second column of row, and the value in the third + column of row, where row is the single row of the table below. + +
  2. If patternMatched is true, return "confirmed XML". + Otherwise, return "possibly not XML". +
+ +

The table below is the text/xml-specific subset + of the table used for + identifying a resource with an unknown MIME type. + + + + + + +
+ Byte Pattern + + + Pattern Mask + + + Leading Bytes to Be Ignored + + + Note + +
+ 3C 3F 78 6D 6C + + + FF FF FF FF FF + + + Whitespace bytes. + + + The string "<?xml". +
+ + +
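
Because the only XML row uses an all-FF mask and ignores leading whitespace bytes, the check reduces to a case-sensitive prefix test after stripping whitespace. A minimal sketch, with a hypothetical function name:

```python
WHITESPACE_BYTES = b"\t\n\x0c\r "  # HT, LF, FF, CR, SP


def sniff_xml(header: bytes) -> str:
    """Return "confirmed XML" if the header starts with "<?xml"."""
    # Leading whitespace bytes are ignored; the all-FF mask means the
    # remaining comparison is exact (case-sensitive).
    stripped = header.lstrip(WHITESPACE_BYTES)
    if stripped.startswith(b"<?xml"):
        return "confirmed XML"
    return "possibly not XML"
```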

JSON security prefix sniffing

+ +
    +
  1. For each row of the table below, let patternMatched be the result of the pattern matching algorithm + given the resource's resource header, the value in the first column of + row, the value in the second column of row, and the value in the third + column of row. + +
  2. If patternMatched is true for any row, return "JSON security prefix is present". + Otherwise, return "no JSON security prefix". +
+ + + + + + + + + + + + +
+ Byte Pattern + + + Pattern Mask + + + Leading Bytes to Be Ignored + + + Note + +
+ 29 5D 7D 27 + + + FF FF FF FF + + + None. + + + + The string ")]}'". +

Parser breaker + built into angular.js (followed by a comma and a newline), + built into the Java Spring framework (followed by a comma and a space) + and observed on google.com (without a comma, followed by a newline). + +

+ 7B 7D 26 26 + + + FF FF FF FF + + + None. + + + The string "{}&&".

Parser breaker + used by Apache Struts. + +

+ 7B 7D 20 26 26 + + + FF FF FF FF FF + + + None. + + + The string "{} &&".

Parser breaker + used by Spring framework (historically). + +

+ 66 6F 72 28 3B 3B 29 3B + + + FF FF FF FF FF FF FF FF + + + None. + + + The string "for(;;);". +

Infinite loop + observed on facebook.com. + +

+ 66 6F 72 20 28 3B 3B 29 3B + + + FF FF FF FF FF FF FF FF FF + + + None. + + + The string "for (;;);". +

Infinite loop. + +

+ 77 68 69 6C 65 28 31 29 3B + + + FF FF FF FF FF FF FF FF FF + + + None. + + + The string "while(1);". +

Infinite loop. + +

+ 77 68 69 6C 65 20 28 31 29 3B + + + FF FF FF FF FF FF FF FF FF FF + + + None. + + + The string "while (1);". +

Infinite loop. + + +

+ +
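
Every row in the table above uses an all-FF mask and ignores no leading bytes, so the check is an exact prefix comparison. A minimal sketch, with hypothetical names:

```python
# Prefixes from the table, in table order.
JSON_SECURITY_PREFIXES = (
    b")]}'",
    b"{}&&",
    b"{} &&",
    b"for(;;);",
    b"for (;;);",
    b"while(1);",
    b"while (1);",
)


def sniff_json_security_prefix(header: bytes) -> str:
    """Return whether the header starts with a known JSON security prefix."""
    # No leading bytes are ignored, so even one leading space defeats the match.
    if header.startswith(JSON_SECURITY_PREFIXES):
        return "JSON security prefix is present"
    return "no JSON security prefix"
```

Only the prefix is inspected; whatever follows it (a comma, a newline, the JSON body) does not affect the result.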

Context-specific sniffing

From a4d8f1c6b9ee2539bafebe249ffed44ab466ebe8 Mon Sep 17 00:00:00 2001 From: Lukasz Anforowicz Date: Mon, 22 Oct 2018 15:25:41 -0700 Subject: [PATCH 2/3] =?UTF-8?q?Adding=20=C5=81ukasz=20Anforowicz=20to=20th?= =?UTF-8?q?e=20list=20of=20contributors?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- mimesniff.bs | 1 + 1 file changed, 1 insertion(+) diff --git a/mimesniff.bs b/mimesniff.bs index 43f74d9..435119c 100644 --- a/mimesniff.bs +++ b/mimesniff.bs @@ -3520,6 +3520,7 @@ user agents must use the following Jonathan Neal, Joshua Cranmer, Larry Masinter, + Łukasz Anforowicz, 罗泽轩, Mariko Kosaka, Mark Pilgrim, From f4a2d87981ff56048c40b73cedafaaa83fa512a7 Mon Sep 17 00:00:00 2001 From: Lukasz Anforowicz Date: Mon, 22 Oct 2018 15:40:22 -0700 Subject: [PATCH 3/3] Excluding text/css from JSON-security-prefix sniffing --- mimesniff.bs | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mimesniff.bs b/mimesniff.bs index 435119c..be6e228 100644 --- a/mimesniff.bs +++ b/mimesniff.bs @@ -2798,12 +2798,16 @@ user agents must use the following Abort these steps.

  • - Otherwise (if the no-sniff flag is not set and the computed MIME - type is not covered above) + If the computed MIME type is not "text/css", then the CORB confirmation sniffing result is "protected" if the JSON security prefix sniffing algorithm returns "JSON security prefix is present" (and is "allowed" otherwise). + +

    "text/css" must be excluded because + valid CSS may contain a JSON security prefix. See also + fetch/corb/style-css-with-json-parser-breaker.sub.html + in Web Platform Tests.