This repository was archived by the owner on Feb 18, 2025. It is now read-only.
Merged
also escape leading ascii letters
bakkot authored and ljharb committed Apr 9, 2024
commit 27eee05522e932bc1718af608f2a8ee70eab5420
8 changes: 5 additions & 3 deletions spec.emu
@@ -31,9 +31,11 @@ contributors:
  1. Let _escaped_ be the empty String.
  1. Let _cpList_ be StringToCodePoints(_S_).
  1. For each code point _c_ in _cpList_, do
-   1. If _escaped_ is the empty String and _c_ is matched by |DecimalDigit|, then
-     1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence.
-     1. Set _escaped_ to the string-concatenation of _escaped_, the code unit 0x005C (REVERSE SOLIDUS), *"x3"*, and the code unit whose numeric value is the numeric value of _c_.
+   1. If _escaped_ is the empty String, and _c_ is matched by |DecimalDigit| or |AsciiLetter|, then
+     1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after `\c`.
Member

Nit: this sentence is too long to not use a parenthetical.

Suggested change
- 1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after `\c`.
+ 1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text, which may be used after a `\0` character escape or a |DecimalEscape| such as `\1`, and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after `\c`.

Member Author

Those commas read as ungrammatical to me. In particular, the second comma cuts a clause in the middle: the pattern text "may be used after `\0` [...] and still match _S_". Open to other rephrasing here, but I don't like this particular suggestion.

Member

Also, TIL `\c0` is a valid escape. This also prevents `new RegExp("\\" + escape('n'))` from combining into a `\n` escape.

+     1. Let _hex_ be Number::toString(𝔽(_c_), 16).
+     1. Assert: The length of _hex_ is 2.
+     1. Set _escaped_ to the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), *"x"*, and _hex_.
    1. Else,
      1. Set _escaped_ to the string-concatenation of _escaped_ and EncodeForRegExpEscape(_c_).
  1. Return _escaped_.
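
For illustration only, the steps in this hunk can be sketched in JavaScript roughly as follows. This is a minimal sketch, not the proposal's implementation: `regExpEscapeSketch` and `encodeForRegExpEscapeSketch` are hypothetical names, and because EncodeForRegExpEscape is not shown in this hunk, the stand-in below simply assumes that it backslash-escapes syntax characters and passes everything else through. The three patterns at the end exercise the `\c` and `\0` contexts mentioned in the NOTE.

```javascript
// Hypothetical stand-in for EncodeForRegExpEscape, which this hunk does not show.
// Assumption: syntax characters and "/" get a backslash; everything else passes through.
function encodeForRegExpEscapeSketch(c) {
  return /[\^$\\.*+?()[\]{}|\/]/.test(c) ? "\\" + c : c;
}

// Sketch of the loop in the diff; the function name is illustrative only.
function regExpEscapeSketch(s) {
  let escaped = "";
  // for...of walks the string by code point, like StringToCodePoints.
  for (const c of s) {
    if (escaped === "" && /^[0-9A-Za-z]$/.test(c)) {
      // Leading digit or ASCII letter: emit \xHH. These code points all lie
      // in 0x30..0x7A, so the hex string always has length 2.
      escaped = "\\x" + c.codePointAt(0).toString(16);
    } else {
      escaped += encodeForRegExpEscapeSketch(c);
    }
  }
  return escaped;
}

// Without escaping, a leading "J" joins a preceding `\c` to form \cJ (line feed).
const glued = new RegExp("^\\c" + "J" + "$");
// With the leading letter escaped as \x4a, `\c` and the letter stay separate.
const separated = new RegExp("^\\c" + regExpEscapeSketch("J") + "$");

// Same idea for digits: "\0" followed by a raw "1" would read as the single escape `\01`.
const afterNul = new RegExp("^\\0" + regExpEscapeSketch("1") + "$");
```

Under these assumptions, `regExpEscapeSketch("n")` yields `\x6e` and `regExpEscapeSketch("1abc")` yields `\x31abc`; only the first code point is hex-escaped, so `regExpEscapeSketch(".1")` yields `\.1`. `glued` matches a line feed, while `separated` does not.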