diff --git a/src/input-format.md b/src/input-format.md
index 5d2a69275..cf35b2959 100644
--- a/src/input-format.md
+++ b/src/input-format.md
@@ -6,12 +6,6 @@ r[input.syntax]
 @root CHAR ->
 
 NUL -> U+0000
-
-TAB -> U+0009
-
-LF -> U+000A
-
-CR -> U+000D
 ```
 
 r[input.intro]
diff --git a/src/whitespace.md b/src/whitespace.md
index 45d58b3fa..b398d0c95 100644
--- a/src/whitespace.md
+++ b/src/whitespace.md
@@ -1,21 +1,31 @@
 r[lex.whitespace]
 # Whitespace
 
+r[whitespace.syntax]
+```grammar,lexer
+@root WHITESPACE ->
+      U+0009 // Horizontal tab, `'\t'`
+    | U+000A // Line feed, `'\n'`
+    | U+000B // Vertical tab
+    | U+000C // Form feed
+    | U+000D // Carriage return, `'\r'`
+    | U+0020 // Space, `' '`
+    | U+0085 // Next line
+    | U+200E // Left-to-right mark
+    | U+200F // Right-to-left mark
+    | U+2028 // Line separator
+    | U+2029 // Paragraph separator
+
+TAB -> U+0009 // Horizontal tab, `'\t'`
+
+LF -> U+000A // Line feed, `'\n'`
+
+CR -> U+000D // Carriage return, `'\r'`
+```
+
 r[lex.whitespace.intro]
 Whitespace is any non-empty string containing only characters that have the
-[`Pattern_White_Space`] Unicode property, namely:
-
-- `U+0009` (horizontal tab, `'\t'`)
-- `U+000A` (line feed, `'\n'`)
-- `U+000B` (vertical tab)
-- `U+000C` (form feed)
-- `U+000D` (carriage return, `'\r'`)
-- `U+0020` (space, `' '`)
-- `U+0085` (next line)
-- `U+200E` (left-to-right mark)
-- `U+200F` (right-to-left mark)
-- `U+2028` (line separator)
-- `U+2029` (paragraph separator)
+[`Pattern_White_Space`] Unicode property.
 
 r[lex.whitespace.token-sep]
 Rust is a "free-form" language, meaning that all forms of whitespace serve only