Draft
1 change: 1 addition & 0 deletions src/SUMMARY.md
@@ -8,6 +8,7 @@
- [Input format](input-format.md)
- [Keywords](keywords.md)
- [Identifiers](identifiers.md)
- [Frontmatter](frontmatter.md)
Contributor Author:
Not stable yet: rust-lang/rust#136889

Contributor Author:
I put this under "Lexical structure" because that is where the shebang is covered.

Contributor:
Maybe this could be sequenced just after "Input format" since it is tightly related to that? I realize the rest of these aren't in any particularly logical order, but I'm not quite seeing why it is placed in the middle here.

- [Comments](comments.md)
- [Whitespace](whitespace.md)
- [Tokens](tokens.md)
39 changes: 39 additions & 0 deletions src/frontmatter.md
@@ -0,0 +1,39 @@
r[frontmatter]
# Frontmatter

r[frontmatter.syntax]
```grammar,lexer
@root FRONTMATTER ->
FRONTMATTER_FENCE HORIZONTAL_WHITESPACE* INFOSTRING? HORIZONTAL_WHITESPACE* LF
(FRONTMATTER_LINE LF )*
FRONTMATTER_FENCE[^matched-fence] HORIZONTAL_WHITESPACE* LF
Contributor:
It's a bit unusual to have load-bearing footnotes in the grammar. Would it be possible to define this recursively so that isn't necessary? Or is there maybe some other way around it?

FRONTMATTER_FENCE -> `---` `-`*
Contributor:
Suggested change
FRONTMATTER_FENCE -> `---` `-`*
FRONTMATTER_FENCE -> `-`{3..}

INFOSTRING -> (XID_Start | `_`) ( XID_Continue | `-` | `.` )*
FRONTMATTER_LINE -> (~INVALID_FRONTMATTER_LINE_START (~INVALID_FRONTMATTER_LINE_CONTINUE)*)?
INVALID_FRONTMATTER_LINE_START -> (FRONTMATTER_FENCE[^escaped-fence] | LF)
INVALID_FRONTMATTER_LINE_CONTINUE -> LF
```

[^matched-fence]: The closing fence must have the same number of `-` as the opening fence
[^escaped-fence]: A `FRONTMATTER_FENCE` at the beginning of a `FRONTMATTER_LINE` is invalid only if it has at least as many `-` as the opening `FRONTMATTER_FENCE`

Frontmatter is an optional section for content intended for external tools without requiring these tools to have full knowledge of the Rust grammar.
Contributor:
We always use "intro" for the introduction of something.

Also, perhaps this intro could be moved to the top of the file?

Suggested change
Frontmatter is an optional section for content intended for external tools without requiring these tools to have full knowledge of the Rust grammar.
r[frontmatter.intro]
Frontmatter is an optional section for content intended for external tools without requiring these tools to have full knowledge of the Rust grammar.

Can you also include an example here?


r[frontmatter.document]
Frontmatter may only be preceded by a [shebang] and whitespace.

r[frontmatter.fence]
The delimiters are referred to as a "fence." The opening and closing fences must be at the start of a line. They must be a matching pair of three or more hyphens (`-`). A fence may be followed by horizontal whitespace.
Contributor:
We typically use emphasis for introducing terms.

Suggested change
The delimiters are referred to as a "fence." The opening and closing fences must be at the start of a line. They must be a matching pair of three or more hyphens (`-`). A fence may be followed by horizontal whitespace.
The delimiters are referred to as a *fence*. The opening and closing fences must be at the start of a line. They must be a matching pair of three or more hyphens (`-`). A fence may be followed by horizontal whitespace.


r[frontmatter.infostring]
Following the opening fence may be an infostring for identifying the intention of the contained content. An infostring may be followed by horizontal whitespace.
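
To make the `INFOSTRING` rule concrete, here is a minimal sketch of a validity check. It is an illustration only: the function name `is_ascii_infostring` is hypothetical, and `XID_Start` / `XID_Continue` are narrowed to their ASCII subset as an assumption, since the standard library has no Unicode XID tables. The real rule accepts a wider range of characters.

```rust
/// Returns true when `s` matches an ASCII-only approximation of `INFOSTRING`.
///
/// Assumption: `XID_Start` is narrowed to ASCII letters and `XID_Continue`
/// to ASCII letters, digits, and `_`. The actual rule is Unicode-aware.
fn is_ascii_infostring(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        // First character: (XID_Start | `_`), ASCII approximation.
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        // An empty string or any other first character is rejected.
        _ => return false,
    }
    // Remaining characters: (XID_Continue | `-` | `.`)*, ASCII approximation.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-' || c == '.')
}
```

Under these assumptions, `cargo` and `_x-1.2` are accepted, while `-cargo`, `1abc`, and `a b` are rejected.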

r[frontmatter.body]
The body of the frontmatter may contain any content except for a line starting with at least as many hyphens (`-`) as the fences.
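
The fence and body rules above can be sketched as a small splitter. This is not how the compiler implements frontmatter; it is a hedged illustration under stated assumptions: the input has already had any shebang and leading whitespace removed, line endings are normalized to LF, and the infostring is returned without validation. The names `Frontmatter`, `split_frontmatter`, `fence_len`, and `is_horizontal_ws` are hypothetical.

```rust
/// The pieces of a source file that starts with frontmatter (hypothetical type).
struct Frontmatter<'a> {
    infostring: &'a str,
    body: &'a str,
    rest: &'a str,
}

/// HORIZONTAL_WHITESPACE as proposed: tab or space.
fn is_horizontal_ws(c: char) -> bool {
    c == '\t' || c == ' '
}

/// Number of leading `-` characters on a line.
fn fence_len(line: &str) -> usize {
    line.bytes().take_while(|&b| b == b'-').count()
}

/// Splits `src` into infostring, body, and remaining source.
/// Assumes `src` begins directly with the opening fence and uses LF line endings.
fn split_frontmatter(src: &str) -> Option<Frontmatter<'_>> {
    let (open_line, mut rest) = src.split_once('\n')?;
    let dashes = fence_len(open_line);
    if dashes < 3 {
        return None; // a fence needs three or more hyphens
    }
    // The infostring is not validated here, only trimmed.
    let infostring = open_line[dashes..].trim_matches(is_horizontal_ws);
    let body_start = open_line.len() + 1;
    let mut offset = body_start;
    loop {
        // Every line, including the closing fence, must end with LF.
        let (line, tail) = rest.split_once('\n')?;
        let n = fence_len(line);
        if n >= dashes {
            // A line starting with this many hyphens cannot be body content,
            // so it must be the matching closing fence or the input is invalid.
            let closes = n == dashes && line[n..].chars().all(is_horizontal_ws);
            return if closes {
                Some(Frontmatter { infostring, body: &src[body_start..offset], rest: tail })
            } else {
                None
            };
        }
        offset += line.len() + 1;
        rest = tail;
    }
}
```

For example, given `"---cargo\n[dependencies]\nfoo = \"1\"\n---\nfn main() {}\n"`, this sketch yields the infostring `cargo`, the two manifest lines as the body, and `fn main() {}\n` as the remaining source. With a four-hyphen fence, a body line consisting of `---` is accepted because it has fewer hyphens than the fence.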

[shebang]: input-format.md#shebang-removal
6 changes: 6 additions & 0 deletions src/input-format.md
@@ -59,6 +59,11 @@ This prevents an [inner attribute] at the start of a source file being removed.
> [!NOTE]
> The standard library [`include!`] macro applies byte order mark removal, CRLF normalization, and shebang removal to the file it reads. The [`include_str!`] and [`include_bytes!`] macros do not.
Contributor:
What is the impact on these macros?


r[input.frontmatter]
## Frontmatter removal

After some whitespace, [frontmatter] may next appear in the input.
Contributor:
Can this be worded more explicitly how it relates to the items around it (particularly shebang)? This rule isn't quite standing on its own and it isn't quite clear how it fits.

For example, something like: "after the optional [shebang] and then optional [whitespace], [frontmatter] may appear next in the input".


r[input.tokenization]
## Tokenization

@@ -69,4 +74,5 @@ The resulting sequence of characters is then converted into tokens as described
[comments]: comments.md
[Crates and source files]: crates-and-source-files.md
[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix)
[frontmatter]: frontmatter.md
[whitespace]: whitespace.md
43 changes: 26 additions & 17 deletions src/whitespace.md
@@ -4,23 +4,32 @@ r[lex.whitespace]
r[whitespace.syntax]
```grammar,lexer
@root WHITESPACE ->
U+0009 // Horizontal tab, `'\t'`
| U+000A // Line feed, `'\n'`
| U+000B // Vertical tab
| U+000C // Form feed
| U+000D // Carriage return, `'\r'`
| U+0020 // Space, `' '`
| U+0085 // Next line
| U+200E // Left-to-right mark
| U+200F // Right-to-left mark
| U+2028 // Line separator
| U+2029 // Paragraph separator

TAB -> U+0009 // Horizontal tab, `'\t'`

LF -> U+000A // Line feed, `'\n'`

CR -> U+000D // Carriage return, `'\r'`
END_OF_LINE
| IGNORABLE_CODE_POINT
| HORIZONTAL_WHITESPACE

END_OF_LINE ->
U+000A // line feed, `'\n'`
Contributor:
Can you please keep the original comments and same comment style in these rules?

| U+000B // vertical tabulation
| U+000C // form feed
| U+000D // carriage return, `'\r'`
| U+0085 // next line
| U+2028 // LINE SEPARATOR
| U+2029 // PARAGRAPH SEPARATOR

IGNORABLE_CODE_POINT ->
U+200E // LEFT-TO-RIGHT MARK
| U+200F // RIGHT-TO-LEFT MARK

HORIZONTAL_WHITESPACE ->
U+0009 // horizontal tab, `'\t'`
| U+0020 // space, `' '`

TAB -> U+0009 // horizontal tab, `'\t'`

LF -> U+000A // line feed, `'\n'`

CR -> U+000D // carriage return, `'\r'`
```
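
The proposed split of `WHITESPACE` into three sub-rules can be read as a classification of code points. The sketch below mirrors the new `END_OF_LINE`, `IGNORABLE_CODE_POINT`, and `HORIZONTAL_WHITESPACE` rules from this diff; the `WhitespaceClass` enum and `classify_whitespace` function are hypothetical names used only for illustration.

```rust
/// The three whitespace classes named in the proposed grammar.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum WhitespaceClass {
    EndOfLine,
    IgnorableCodePoint,
    HorizontalWhitespace,
}

/// Classifies `c` per the proposed rules, or returns `None` when `c`
/// is not covered by the `WHITESPACE` rule at all.
fn classify_whitespace(c: char) -> Option<WhitespaceClass> {
    match c {
        // END_OF_LINE: line feed, vertical tab, form feed, carriage return,
        // next line, line separator, paragraph separator.
        '\u{000A}' | '\u{000B}' | '\u{000C}' | '\u{000D}' | '\u{0085}' | '\u{2028}'
        | '\u{2029}' => Some(WhitespaceClass::EndOfLine),
        // IGNORABLE_CODE_POINT: left-to-right and right-to-left marks.
        '\u{200E}' | '\u{200F}' => Some(WhitespaceClass::IgnorableCodePoint),
        // HORIZONTAL_WHITESPACE: horizontal tab and space.
        '\u{0009}' | '\u{0020}' => Some(WhitespaceClass::HorizontalWhitespace),
        _ => None,
    }
}
```

The three classes together cover the same eleven code points as the original flat `WHITESPACE` rule, so the refactoring changes the grouping but not the set of whitespace characters.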

r[lex.whitespace.intro]