Commit f460420

Copyediting chapter 9
1 parent 6092bef commit f460420

1 file changed: +12 -12 lines changed

09_regexp.md

Lines changed: 12 additions & 12 deletions
@@ -153,14 +153,14 @@ By a strange historical accident, `\s` (whitespace) does not have this problem a

 {{index "character category", [Unicode, property]}}

-It is possible to use `\p` in a regular expression to match all characters to which the Unicode standard assigns a given property. This allows us to match things like letters in a more cosmopolitan way. However, again due to compatibility with the original language standards, those are only recognized when you put a `u` character (for ((Unicode))) after the regular expression.
+It is possible to use `\p` in a regular expression to match all characters to which the Unicode standard assigns a given property. This allows us to match things like letters in a more cosmopolitan way. However, again due to compatibility with the original language standards, those are recognized only when you put a `u` character (for ((Unicode))) after the regular expression.

 {{table {cols: [1, 5]}}}

 | `\p{L}` | Any letter
 | `\p{N}` | Any numeric character
 | `\p{P}` | Any punctuation character
-| `\P{L}` | Any non-letter (uppercase P inverts)
+| `\P{L}` | Any nonletter (uppercase P inverts)
 | `\p{Script=Hangul}` | Any character from the given script (see [Chapter ?](higher_order#scripts))

 Using `\w` for text processing that may need to handle non-English text (or even English text with borrowed words like “cliché”) is a liability, since it won't treat characters like “é” as letters. Though they tend to be a bit more verbose, `\p` property groups are more robust.
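
Reviewer sketch for the hunk above: a minimal comparison of `\w` and `\p{L}` on the borrowed word the text mentions. The constant names are made up for this illustration, and the accented letter is written as the escape `\u00e9` on the assumption that it is stored as a single precomposed character.

```javascript
// Illustrative only: these names do not come from the chapter.
// "\u00e9" is a precomposed "é" (assumed to be one code point).
const borrowed = "clich\u00e9";

// \w covers only ASCII letters, digits, and the underscore.
const matchesWordClass = /^\w+$/.test(borrowed);

// \p{L} is read as a Unicode property only when the u flag is present.
const matchesLetterProperty = /^\p{L}+$/u.test(borrowed);

// Uppercase \P inverts the property: any character that is not a letter.
const hasNonLetter = /\P{L}/u.test(borrowed);

// Script properties follow the same syntax ("\ud55c" is a Hangul syllable).
const isHangul = /^\p{Script=Hangul}$/u.test("\ud55c");

console.log(matchesWordClass, matchesLetterProperty, hasNonLetter, isHangul);
// → false true false true
```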
@@ -251,7 +251,7 @@ The first and second `+` characters apply only to the second `o` in `boo` and `h

 {{index "case sensitivity", capitalization, ["regular expression", flags]}}

-The `i` at the end of the expression in the example makes this regular expression case-insensitive, allowing it to match the uppercase _B_ in the input string, even though the pattern is itself all lowercase.
+The `i` at the end of the expression in the example makes this regular expression case insensitive, allowing it to match the uppercase _B_ in the input string, even though the pattern is itself all lowercase.

 ## Matches and groups
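
Reviewer sketch for the `i` flag paragraph in the hunk above. The example it refers to is not part of this diff, so the pattern and input string below are assumptions modeled on the `boo` and `hoo` fragments in the hunk header.

```javascript
// Assumed pattern and input; only the behavior of the i flag is the point.
const input = "Boohoooohoohooo";

// Without the flag, the lowercase "b" in the pattern cannot match "B".
const caseSensitive = /boo+(hoo+)+/.test(input);

// With the flag, letter case is ignored during matching.
const caseInsensitive = /boo+(hoo+)+/i.test(input);

console.log(caseSensitive, caseInsensitive);
// → false true
```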

@@ -359,7 +359,7 @@ If you give the `Date` constructor a single argument, that argument is treated a

 {{index "getFullYear method", "getMonth method", "getDate method", "getHours method", "getMinutes method", "getSeconds method", "getYear method"}}

-Date objects provide methods such as `getFullYear`, `getMonth`, `getDate`, `getHours`, `getMinutes`, and `getSeconds` to extract their components. Besides `getFullYear` there's also `getYear`, which gives you the year minus 1900 (`98` or `119`) and is mostly useless.
+Date objects provide methods such as `getFullYear`, `getMonth`, `getDate`, `getHours`, `getMinutes`, and `getSeconds` to extract their components. Besides `getFullYear` there's also `getYear`, which gives you the year minus 1900 (such as `98` or `125`) and is mostly useless.

 {{index "capture group", "getDate method", [parentheses, "in regular expressions"]}}
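
Reviewer sketch for the `Date` paragraph in the hunk above, using an arbitrary sample date chosen for this illustration. It assumes a runtime that still ships the legacy `getYear` method, as browsers and Node.js do.

```javascript
// Arbitrary sample: 15 January 2025, 10:30:45 local time.
// Month numbers start at 0, so 0 means January.
const sample = new Date(2025, 0, 15, 10, 30, 45);

const parts = [
  sample.getFullYear(),
  sample.getMonth(),
  sample.getDate(),
  sample.getHours(),
  sample.getMinutes(),
  sample.getSeconds()
];
console.log(parts);
// → [2025, 0, 15, 10, 30, 45]

// getYear returns the full year minus 1900.
const legacyYear = sample.getYear();
console.log(legacyYear);
// → 125
```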

@@ -391,7 +391,7 @@ If we want to enforce that the match must span the whole string, we can add the

 {{index "word boundary", "word character"}}

-There is also a `\b` marker that matches _word boundaries_, positions that have a word character on one side, and a non-word character on the other. Unfortunately, these use the same simplistic concept of word characters as `\w`, and are therefore not very reliable.
+There is also a `\b` marker that matches _word boundaries_, positions that have a word character on one side, and a nonword character on the other. Unfortunately, these use the same simplistic concept of word characters as `\w` and are therefore not very reliable.

 Note that these boundary markers don't match any actual characters. They just enforce that a given condition holds at the place where it appears in the pattern.
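
Reviewer sketch for the `\b` paragraph in the hunk above. The sample strings are invented for this illustration; the last check shows the kind of unreliability the text warns about, assuming "é" is stored as the single character `\u00e9`.

```javascript
// Sample strings are illustrative, not from the chapter.
const wholeWord = /\bcat\b/;

// Boundaries exist on both sides of "cat" here.
const standalone = wholeWord.test("a cat sat");

// Inside "concatenate" the letters around "cat" are word characters.
const embedded = wholeWord.test("concatenate");

// "é" does not count as a word character, so \b reports a boundary
// in the middle of the word, between "f" and "é".
const falseBoundary = /caf\b/.test("caf\u00e9");

console.log(standalone, embedded, falseBoundary);
// → true false true
```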

@@ -406,7 +406,7 @@ console.log(/a(?! )/.exec("a b"));
 // → null
 ```

-The `e` in the first example is necessary to match, but is not part of the matched string. The `(?! )` notation expresses a _negative_ look-ahead. This only matches if the pattern in the parentheses _doesn't_ match, causing the second example to only match `a` characters that don't have a space after them.
+The `e` in the first example is necessary to match, but is not part of the matched string. The `(?! )` notation expresses a _negative_ look-ahead. This matches only if the pattern in the parentheses _doesn't_ match, causing the second example to match only `a` characters that don't have a space after them.

 ## Choice patterns
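
Reviewer sketch for the look-ahead paragraph in the hunk above, with patterns and strings invented for this illustration rather than taken from the chapter.

```javascript
// Positive look-ahead: " dollars" must follow, but is not part of the match.
const amount = /\d+(?= dollars)/.exec("It costs 42 dollars");
console.log(amount[0]);
// → 42

// Negative look-ahead: match a "q" only when no "u" comes right after it.
const qWithoutU = /q(?!u)/;
const endsInQ = qWithoutU.test("Iraq");
const startsWithQu = qWithoutU.test("quit");
console.log(endsInQ, startsWithQu);
// → true false
```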

@@ -538,7 +538,7 @@ console.log(stock.replace(/(\d+) (\p{L}+)/gu, minusOne));

 This code takes a string, finds all occurrences of a number followed by an alphanumeric word, and returns a string that has one less of every such quantity.

-The `(\d+)` group ends up as the `amount` argument to the function, and the `(\p{L}+)` group gets bound to `unit`. The function converts `amount` to a number—which always works since it matched `\d+` earlier—and makes some adjustments in case there is only one or zero left.
+The `(\d+)` group ends up as the `amount` argument to the function, and the `(\p{L}+)` group gets bound to `unit`. The function converts `amount` to a number—which always works, since it matched `\d+` earlier—and makes some adjustments in case there is only one or zero left.

 ## Greed
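
Reviewer sketch of how capture groups reach a `replace` callback, as the hunk above describes. The `doubleAmount` function and the sample string are hypothetical; only the regular expression is taken from the hunk header.

```javascript
// Hypothetical callback: receives the whole match first, then one
// argument per capture group, in order.
function doubleAmount(match, amount, unit) {
  // Number(amount) is safe because the group matched \d+.
  return `${Number(amount) * 2} ${unit}`;
}

const inventory = "3 apples and 10 pears";
const doubled = inventory.replace(/(\d+) (\p{L}+)/gu, doubleAmount);
console.log(doubled);
// → 6 apples and 20 pears
```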

@@ -597,7 +597,7 @@ console.log(regexp.test("Harry is a dodgy character."));

 {{index ["regular expression", flags], ["backslash character", "in regular expressions"]}}

-When creating the `\s` part of the string, we have to use two backslashes because we are writing them in a normal string, not a slash-enclosed regular expression. The second argument to the `RegExp` constructor contains the options for the regular expression—in this case, `"gi"` for global and case-insensitive.
+When creating the `\s` part of the string, we have to use two backslashes because we are writing them in a normal string, not a slash-enclosed regular expression. The second argument to the `RegExp` constructor contains the options for the regular expression—in this case, `"gi"` for global and case insensitive.

 But what if the name is `"dea+hl[]rd"` because our user is a ((nerd))y teenager? That would result in a nonsensical regular expression that won't actually match the user's name.
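
Reviewer sketch of the problem raised at the end of the hunk above and one possible way around it. The `escapeRegExp` helper and the sample sentence are assumptions for this illustration, not necessarily the approach the chapter takes.

```javascript
const name = "dea+hl[]rd";
const text = "This dea+hl[]rd guy is super annoying.";

// Naive version: "+" and "[]" are read as regexp syntax, and the
// empty set "[]" can never match anything.
const naive = new RegExp("(^|\\s)" + name + "($|\\s)", "gi");
const naiveFound = naive.test(text);

// Hypothetical helper: put a backslash before every character that
// has a special meaning in a regular expression.
function escapeRegExp(string) {
  return string.replace(/[\\^$.*+?()[\]{}|]/g, "\\$&");
}

const safe = new RegExp("(^|\\s)" + escapeRegExp(name) + "($|\\s)", "gi");
const safeFound = safe.test(text);

console.log(naiveFound, safeFound);
// → false true
```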

@@ -656,7 +656,7 @@ console.log(pattern.lastIndex);

 {{index "side effect", "lastIndex property"}}

-If the match was successful, the call to `exec` automatically updates the `lastIndex` property to point after the match. If no match was found, `lastIndex` is set back to zero, which is also the value it has in a newly constructed regular expression object.
+If the match was successful, the call to `exec` automatically updates the `lastIndex` property to point after the match. If no match was found, `lastIndex` is set back to 0, which is also the value it has in a newly constructed regular expression object.

 The difference between the global and the sticky options is that when sticky is enabled, the match will succeed only if it starts directly at `lastIndex`, whereas with global, it will search ahead for a position where a match can start.
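
Reviewer sketch of the `lastIndex` behavior described in the hunk above, contrasting the global and sticky options. The sample text and constant names are invented for this illustration.

```javascript
const text = "xx 12 yy 34";

// Global: exec searches ahead from lastIndex (0) until it finds digits.
const globalDigits = /\d+/g;
const globalMatch = globalDigits.exec(text);
const afterGlobal = globalDigits.lastIndex; // just past "12"

// Sticky: the match must start exactly at lastIndex, which holds "x".
const stickyDigits = /\d+/y;
const stickyMiss = stickyDigits.exec(text);
const afterMiss = stickyDigits.lastIndex; // reset after the failure

// Pointing lastIndex at the first digit lets the sticky match succeed.
stickyDigits.lastIndex = 3;
const stickyHit = stickyDigits.exec(text);

console.log(globalMatch[0], afterGlobal, stickyMiss, afterMiss, stickyHit[0]);
// → 12 5 null 0 12
```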

@@ -794,11 +794,11 @@ The pattern `if (match = string.match(...))` makes use of the fact that the valu

 {{index [parentheses, "in regular expressions"]}}

-If a line is not a section header or a property, the function checks whether it is a comment or an empty line using the expression `/^\s*(;|$)/` to match lines that either contain only space, or space followed by a semicolon (making the rest of the line a comment). When a line doesn't match any of the expected forms, the function throws an exception.
+If a line is not a section header or a property, the function checks whether it is a comment or an empty line using the expression `/^\s*(;|$)/` to match lines that either contain only whitespace, or whitespace followed by a semicolon (making the rest of the line a comment). When a line doesn't match any of the expected forms, the function throws an exception.
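
Reviewer sketch of the expression quoted in the hunk above, run against a few sample lines. The lines and constant names are invented for this illustration; only the regular expression comes from the text.

```javascript
// The expression from the paragraph: optional whitespace, then either
// a semicolon or the end of the line.
const commentOrBlank = /^\s*(;|$)/;

// Hypothetical input lines.
const samples = [
  "; a comment",
  "   ; indented comment",
  "",
  "   ",
  "name=value",
  "[section]"
];

const results = samples.map(line => commentOrBlank.test(line));
console.log(results);
// → [true, true, true, true, false, false]
```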

 ## Code units and characters

-Another design mistake that's been standardized in JavaScript regular expressions is that by default, operators like `.` or `?` work on code units, as discussed in [Chapter ?](higher_order#code_units), not actual characters. This means characters that are composed of two code units behave strangely.
+Another design mistake that's been standardized in JavaScript regular expressions is that by default, operators like `.` or `?` work on code units (as discussed in [Chapter ?](higher_order#code_units)), not actual characters. This means characters that are composed of two code units behave strangely.

 ```
 console.log(/🍎{3}/.test("🍎🍎🍎"));
@@ -858,7 +858,7 @@ Regular expressions are a sharp ((tool)) with an awkward handle. They simplify s

 {{index debugging, bug}}

-It is almost unavoidable that, in the course of working on these exercises, you will get confused and frustrated by some regular expression's inexplicable ((behavior)). Sometimes it helps to enter your expression into an online tool like [_debuggex.com_](https://www.debuggex.com/) to see whether its visualization corresponds to what you intended and to ((experiment)) with the way it responds to various input strings.
+It is almost unavoidable that, in the course of working on these exercises, you will get confused and frustrated by some regular expression's inexplicable ((behavior)). Sometimes it helps to enter your expression into an online tool like [_debuggex.com_](https://www.debuggex.com) to see whether its visualization corresponds to what you intended and to ((experiment)) with the way it responds to various input strings.

 ### Regexp golf
