There are a number of escape sequences that the parser mistakenly accepts or miscompiles.
ECMAScript
Backreferences with leading zero digits (e.g., \01 for capture group 1) should be rejected. [ECMA-262 3rd ed., Section 15.10.2.11 "DecimalEscape"]
\00 and longer runs of zero digits should be rejected rather than interpreted as an escape for NUL. Only \0 (not followed by another decimal digit) is a valid escape sequence for NUL. [ECMA-262 3rd ed., Section 15.10.2.11 "DecimalEscape"]
When a custom traits implementation defines a new character class "z", [\z] matches the characters in this class and not the character z. (Meanwhile, \z without brackets matches the character z and not the characters in the class "z".) [ECMA-262 3rd ed., Sections 15.10.1 "Patterns" and 15.10.2.12 "CharacterClassEscape"]
[\b] should match U+0008 BACKSPACE, not b. [ECMA-262 3rd ed., Section 15.10.2.19 "ClassEscape"]
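To make the DecimalEscape rule concrete, here is a minimal sketch of how the digits after a backslash could be classified. It is not the parser's actual code: `DecimalEscape`, `DecimalEscapeKind`, and `classify_decimal_escape` are hypothetical names, and the nine-digit cap is an arbitrary assumption to avoid integer overflow.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical types for illustration only; not part of any library.
enum class DecimalEscapeKind { Nul, Backreference };

struct DecimalEscape {
    DecimalEscapeKind kind;
    unsigned group;      // capture group index; 0 when kind == Nul
    std::size_t length;  // number of digits consumed after the backslash
};

// Classifies the text that follows a backslash outside a character class.
// Returns std::nullopt when the input is not a DecimalEscape or when the
// escape should be rejected (leading zero followed by more digits).
inline std::optional<DecimalEscape> classify_decimal_escape(std::string_view rest) {
    std::size_t n = 0;
    while (n < rest.size() && rest[n] >= '0' && rest[n] <= '9') {
        ++n;
    }
    if (n == 0) {
        return std::nullopt;  // no digits: not a DecimalEscape at all
    }
    if (rest[0] == '0') {
        if (n == 1) {
            // \0 with no further decimal digit denotes NUL.
            return DecimalEscape{DecimalEscapeKind::Nul, 0, 1};
        }
        return std::nullopt;  // \00, \01, ... are rejected
    }
    if (n > 9) {
        return std::nullopt;  // assumed cap so that `group` cannot overflow
    }
    unsigned group = 0;
    for (std::size_t i = 0; i < n; ++i) {
        group = group * 10 + static_cast<unsigned>(rest[i] - '0');
    }
    return DecimalEscape{DecimalEscapeKind::Backreference, group, n};
}
```

A real parser would additionally have to check that a backreference does not exceed the number of capture groups in the pattern; the sketch leaves that out.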
awk
See Section "Regular expressions" in the awk specification.
Octal escape sequences are not parsed correctly in square-bracket character class definitions. (E.g., [\040] should match U+0020 SPACE.)
Similarly, [\"] and [\/] also match a backslash, even though they should match only " and /, respectively.
While the awk specification says that using unspecified escape sequences results in undefined behavior, I think we should reject them. (I believe we should handle this differently from ECMAScript mode, where unrecognized escape sequences just yield the escaped character.)
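For the awk cases, the following is a rough sketch of how an escape inside a bracket expression could be decoded, with unrecognized escapes rejected as proposed above. `DecodedEscape` and `decode_awk_escape` are hypothetical names, and the accepted set (octal, `\"`, `\/`, `\\`, and the C-style control escapes) is my reading of the awk specification, so it should be checked against the text before relying on it.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type for illustration only.
struct DecodedEscape {
    char value;          // the single character the escape denotes
    std::size_t length;  // characters consumed after the backslash
};

// Decodes the text that follows a backslash in awk mode.
// Returns std::nullopt for escapes this sketch chooses to reject.
inline std::optional<DecodedEscape> decode_awk_escape(std::string_view rest) {
    if (rest.empty()) {
        return std::nullopt;  // trailing backslash
    }
    const char c = rest[0];
    if (c >= '0' && c <= '7') {
        // Longest run of one to three octal digits, e.g. \040 -> U+0020.
        unsigned value = 0;
        std::size_t n = 0;
        while (n < 3 && n < rest.size() && rest[n] >= '0' && rest[n] <= '7') {
            value = value * 8 + static_cast<unsigned>(rest[n] - '0');
            ++n;
        }
        if (value > 0xFF) {
            return std::nullopt;  // assumption: must fit in one narrow char
        }
        return DecodedEscape{static_cast<char>(value), n};
    }
    switch (c) {
    case '"':
    case '/':
    case '\\':
        return DecodedEscape{c, 1};  // the escaped character only, no backslash
    case 'a': return DecodedEscape{'\a', 1};
    case 'b': return DecodedEscape{'\b', 1};
    case 'f': return DecodedEscape{'\f', 1};
    case 'n': return DecodedEscape{'\n', 1};
    case 'r': return DecodedEscape{'\r', 1};
    case 't': return DecodedEscape{'\t', 1};
    case 'v': return DecodedEscape{'\v', 1};
    default:
        return std::nullopt;  // unspecified escape: reject instead of guessing
    }
}
```

With this shape, `[\040]` contributes only a space to the class and `[\"]` contributes only a double quote, so neither would match a backslash.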