-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linting the spec #173
Comments
This would need to be amended for AOs that are only called upstream, but at least it would force us to enumerate such AOs. |
As an additional step here, I would like for us to define a bunch of "pattern matchers" of some sort that identify idiomatic spec phrasing, and use a special rendering of the spec that highlights/styles these idiomatic sections of text so that we can identify algorithm steps and whatnot that (unexpectedly) do not conform. We might even want to track/relate the idiomatic phrasing in the same way we do AO references. |
I'll point out that ecmaspeak-py does almost all of what @bakkot listed, and more. Not that that helps ecmarkup a whole lot, but you're welcome to adapt my code. |
In addition to the things @bakkot listed, ... analyze_spec.py checks that...
Section.py checks that...
emu_grammars.py checks that...
Pseudocode.py ...
static_type_analysis.py...
|
I think you mean included, right? |
If we decide they should be universally included, sure. |
@bakkot I thought that's what we decided in the editor call. We even have it on our editor update slides. |
I only remembered that we were vaguely in support of it, not that we necessarily intended to actually change it. But if we did, sure, sounds good. |
I'm not good at remembering those sorts of things without notes. We can confirm at the next call. edit: In a later editor call, we did in fact confirm that universal inclusion is desired. |
|
Feel free to file bugs against |
Nesting more than one AO invocation as parameters to another AO should be disallowed. See tc39/ecma262#1573 (comment) |
To confirm, it seems like they should be disallowed only because the current macro doesn't handle them, not because they're inherently something we don't want? |
The issue there isn't about the macro, it's the fact that there's no evaluation order for the arguments defined, and when they're abstract operations they can have side effects which must be specified to happen in a specific order. |
|
In algorithm steps, each alias should only be introduced with |
There are other ways we introduce aliases. Other than those
Also, we occasionally refer to aliases from other algorithms. For example, the Function constructor says Also also, some places we have algorithms (especially but not uniquely in Annex B, e.g. in the Note at the end of Number.prototype.toExponential) we're actually defining replacements to existing steps in to other algorithms, in which case it is not possible to locally determine which aliases are in scope. |
Also, the things which are currently Edit: done in #229. |
We should figure out if we want to use html entities for non-ASCII stuff, and enforce that decision. Currently we use both |
Since I'm on a Mac, it's easy to type the actual characters, so I'd prefer that; but if others editing the spec aren't, it might not be as easy. |
It's 2020, embrace the Unicode. |
Why not use ASCII quotes and convert automatically in build time? Here is how
Instead of enforcing consistency, how about adding tags for this, e.g. 1. <emu-if><cond>x is 1 and y is 1</cond><then>return 1</then></emu-if>
1. <emu-else>return 0</emu-else> and 1. <emu-if-multiline><cond>_exponent_ is *+∞*<sub>𝔽</sub></cond><then>
1. <emu-if><cond>abs(ℝ(_base_)) > 1</cond><then>return *+∞*<sub>𝔽</sub></then></emu-if>
1. <emu-if><cond>abs(ℝ(_base_)) is 1</cond><then>return *NaN*</then></emu-if>
1. <emu-if><cond>abs(ℝ(_base_)) < 1</cond><then>return *+0*<sub>𝔽</sub></then></emu-if>
</then></emu-if-multiline>
1. <emu-else-multiline>...</emu-else-multiline> This would allow enforcing a consistent style, would also allow authors to not have to remember where commas/semicolons go, and would also let you change the style across the entire document anytime (after all |
The ecmarkdown page says:
Adding tags as you suggest would make algorithms harder to write + edit (and read, in source). |
When |
I would like to start enforcing some basic sanity checks for the spec. Some of them will end up implemented in other projects, but I would like to use this issue to track possible lint rules everywhere.
The main goal of this is to reduce editorial churn and correctness issues, but I know that it would also be helpful to people who build tooling based on the spec, like https://github.com/jscert/jsexplain (so cc @brabalan in case you have any ideas).
Currently these are mostly enforced by editors noticing them or @jmdyck submitting after-the-fact PRs, which isn't sustainable.
Here's a few possibilities to get started. Please suggest more.
No trailing whitespacefoo ( a, b )
in algorithm headers andfoo(a, b)
in algorithm steps._
If anIf
has substeps, theIf
line ends with, then
If
has anElse
which has substeps, it is spelledElse
, notOtherwise
If
has anelse
on the same line, it is spelled; else
rather than, else
or; else,
or; otherwise
(we are inconsistent aboutelse
vsotherwise
here, and I am OK not enforcing it, but we should at least enforce the semicolon and lack of following space).Lines end in.
,:
,,
,do
, orthen
Any line not ending in.
is followed by an indented sequences of steps.If _foo_ is present, let _bar_ be _foo_; else let _bar_ be *undefined*.
(Editorial: Treat not present parameters asundefined
ecma262#1411) [edit: actually that PR only applied to functions, not AOs, so we can only do this if we first sort out the treatment of missing parameters in AO e.g. by banning optional parameters in AOs)Return.
, since that's implicit (as of Editorial: describe behavior for algorithms without a "Return" ecma262#2397).[lookahead != `let`]
or[+Await]
rather than[ lookahead != `let` ]
or[ +Await ]
, etc):
on the LHSGrammar flags are consistent: every?
-prefixed flag on the RHS appears as a parameter flag on the LHS, and every LHS flag is passed down at least one nonterminal in one production on the RHS, or is used to gate a production (see more about what grammar flags mean here)tags are lowercaseauto-formatter #367attribute values are quoted (or unquoted; I don't care as long as we're consistent)<emu-xref="foo>"
should be caught, since they meant<emu-xref href="foo">
)formatter: render HTML entities #481≥
is spelled≥
(etc)no unknownlint: some checks for legal tags/attributes #279emu
tagssec-
is only a prefix for an ID when attached to a clause (cf Editorial: Don't use "sec-" prefix for dfn ids ecma262#2103)consistent indentationemu-grammar
andemu-alg
tags are not adjacent with others of their kindno more than one blank line in a rowand maybe consistent rules about where blank lines go, at least in some cases: e.g., never between<emu-clause>
and<h1>
.{ [[key]]: value, [[key2]]: value2 }
the *this* value
, not*this* value
orthe *this* object
British vs American spelling for words where it's an issue, like "behaviour""one's complement" and "two's complement", not "1's complement" or "2's complement"emu-prodref
s to all productions (or, ideally, is automatically generated)|
, as in|UnicodeLeadSurrogate|
.opt
subscripts and grammar parameters are not included.*+0*<sub>𝔽</sub>
or*-0*<sub>𝔽</sub>
, never*0*<sub>𝔽</sub>
. ~~(add some lint rules for numbers #257)*+1*
for any string of digits except0
.Exactly one space between sentences.<p>
is not followed by a linebreak, and</p>
is not preceded by a linebreak (even with intervening whitespace).[Cc]lauses? \d
should be forbidden.if
does not have athen
:If foo, return false.
notIf foo, then return false.
no unusedlint for used-but-not-declared and declared-but-not-used variables in algorithms #483Let
bindings or captured variables in AOs: Editorial: remove unused capture in %TypedArray%.prototype.sort ecma262#2836If _x_ is *null*, return *null*
rather thanIf _x_ is *null*, return _x_
, per Editorial: Prefer returning static values after alias comparison ecma262#3122 - anywhere there's a comparison against a thing in*
s or~
s, or a literal number or +/-∞, and then the alias is returned, we should return the constant instead.It would also be nice to have a few more static-analysis-y checks, like
They are invoked with the right number of argumentsThey don't reference any values which are not definedlint for used-but-not-declared and declared-but-not-used variables in algorithms #483set
rather thanlet
,increase
,increment
,add
, etc.Their return value is treated as a completion record or not as a completion record, as appropriate (pending Sorting out completion records ecma262#1796)*
vs~
vs_
vs"
rules for referring to different kinds of valuesIf _x_ is *true*
, notIf _x_ is true
.Grammar productions have all and only the appropriate flagsAll syntax-directed operations correspond to actual productionsEdit: some of these are done in #199, #205, #207, #209, #210, and #239. I've struck them from the above list. I'm keeping this issue open to track remaining items.
The text was updated successfully, but these errors were encountered: