Closed
Commits
94 commits
3652181
Fix assorted trivial capitalisation typos (#410)
ExplodingCabbage Dec 14, 2023
ff14775
Upgrade packages that Dependabot has open PRs about (#415)
ExplodingCabbage Dec 15, 2023
8ca7068
Bump karma from 5.1.1 to 6.3.16 (#357)
dependabot[bot] Dec 15, 2023
0e7a0cb
yarn eslint . --fix (#421)
ExplodingCabbage Dec 15, 2023
bbb2359
Update package.json version to 5.1.0 (#422)
ExplodingCabbage Dec 15, 2023
902e7da
Bump more dependencies to please Dependabot (#425)
ExplodingCabbage Dec 15, 2023
e67c2a0
Bump qs from 6.7.0 to 6.11.0 (#426)
dependabot[bot] Dec 15, 2023
a3759a1
Move demo link to the top of the README (#370)
waldyrious Dec 15, 2023
021c973
chore: update license file (#331)
eddiemonge Dec 16, 2023
87dedb6
readme: add links to section: change objects (#316)
milahu Dec 18, 2023
e0e960a
Document diffJson() options (#332)
cincodenada Dec 18, 2023
976d60d
Fix `exports` field in `package.json` (#351)
fisker Dec 18, 2023
1387be9
Remove index.html from master (#429)
ExplodingCabbage Dec 19, 2023
96f5998
Update comment in index.js to reflect JsDiff->Diff rename in 5.0.0 (#…
ExplodingCabbage Dec 19, 2023
b4d7491
Document in a comment in web_example.html that you need to run a buil…
ExplodingCabbage Dec 19, 2023
f2596ea
Fix a typo (#433)
ExplodingCabbage Dec 20, 2023
6183041
Default value of line delimiters when a patch is applied (#228) (#393)
Cinedin Dec 20, 2023
2e08d01
Fix bug that leads to worse time complexity and cripplingly slow perf…
ExplodingCabbage Dec 23, 2023
5897c8f
Update CONTRIBUTING.md to use yarn (#441)
ExplodingCabbage Dec 23, 2023
a19bded
Fix typo / grammar error in CONTRIBUTING.md (#442)
ExplodingCabbage Dec 23, 2023
14bfcb6
Update release-notes.md with content on npm that never got pushed to …
ExplodingCabbage Dec 23, 2023
8e51326
Write release notes for PRs already merged to master (#444)
ExplodingCabbage Dec 23, 2023
573b7af
Option to strip trailing CR (#344)
oBusk Dec 27, 2023
e35c347
Add release notes for @oBusk's PR #344 (#445)
ExplodingCabbage Dec 27, 2023
a4eac49
Stop treating stuff like vertical tabs as line breaks when dealing wi…
ExplodingCabbage Dec 27, 2023
a2dc5ec
Add test showing patch from bug #177 is handled correctly now (#447)
ExplodingCabbage Dec 27, 2023
bf5ec4a
Flip core algorithm so everything is no longer the mirror image of My…
ExplodingCabbage Dec 27, 2023
97c676d
Merge remote-tracking branch 'origin/master' into 6.0.0-staging
ExplodingCabbage Dec 27, 2023
fe261ae
Prefer to order deletions before insertions when the edit cost is the…
ExplodingCabbage Dec 27, 2023
b1b2035
Speed up algorithm by not considering diagonals that take us off the …
ExplodingCabbage Dec 29, 2023
8bd13d6
Consistently capitalize "jsdiff" in all-lowercase in docs (#449)
ExplodingCabbage Dec 29, 2023
56c6a8a
Expose `formatPatch` on `diff` object and document (#451)
ExplodingCabbage Dec 29, 2023
6a574cc
Merge branch 'master' into 6.0.0-staging
ExplodingCabbage Dec 29, 2023
8365367
Add function to reverse a patch (#450)
ExplodingCabbage Jan 2, 2024
3351c82
Merge remote-tracking branch 'origin/master' into 6.0.0-staging
ExplodingCabbage Jan 2, 2024
3a99253
Always set `added` and `removed` to `true` or `false`, rather than le…
ExplodingCabbage Jan 2, 2024
a98b974
Flesh out the README a bit and fix some errors and omissions (#458)
ExplodingCabbage Jan 4, 2024
c8c5132
Merge branch 'master' into 6.0.0-staging
ExplodingCabbage Jan 4, 2024
c6498e3
Document that applyPatch can return false (#459)
ExplodingCabbage Jan 4, 2024
7eacf2a
Merge branch 'master' into 6.0.0-staging
ExplodingCabbage Jan 5, 2024
b3aab68
Handle case where the user explicitly passes `maxEditLength: 0` the w…
ExplodingCabbage Jan 8, 2024
ea983ba
Fix more gaps in the docs (#466)
ExplodingCabbage Jan 8, 2024
12e092d
Merge remote-tracking branch 'origin/master' into 6.0.0-staging
ExplodingCabbage Jan 8, 2024
e6c45b0
Add a oneChangePerToken option (#460)
ExplodingCabbage Jan 8, 2024
1e79116
Fix order of arguments to .equals and comparator (#467)
ExplodingCabbage Jan 8, 2024
25a14af
Migrate to DABH's fork of colors (#469)
ExplodingCabbage Jan 8, 2024
ca8718c
Bump follow-redirects from 1.14.8 to 1.15.4 (#470)
dependabot[bot] Jan 9, 2024
1c7514c
Fix mistake in README (#471)
ExplodingCabbage Jan 10, 2024
707fccc
Add note to README about setting `context` to Infinity or MAX_SAFE_IN…
ExplodingCabbage Jan 10, 2024
1f1ec96
Replace broken link to Myers's paper in the README with a working one…
ExplodingCabbage Jan 11, 2024
533893d
Add `timeout` option (#478)
ExplodingCabbage Jan 26, 2024
b5d1cfa
Modify node_example.js to support showing added/deleted spaces (#479)
ExplodingCabbage Jan 26, 2024
06a669b
Merge branch 'master' into 6.0.0-staging
ExplodingCabbage Jan 29, 2024
4abb5f3
Support max edit length in patch creation functions (#480)
ExplodingCabbage Jan 29, 2024
dfc6fe4
Add examples to docs of creating and applying patches (importantly in…
ExplodingCabbage Jan 29, 2024
a2f726a
Add myself to the list of maintainers (#482)
ExplodingCabbage Feb 12, 2024
370a9df
5.2.0 release (#483)
ExplodingCabbage Feb 12, 2024
ad635b1
Add a reminder to the releasing docs to update the gh-pages site afte…
ExplodingCabbage Feb 13, 2024
b9f56d3
Merge branch 'master' into 6.0.0-staging
ExplodingCabbage Feb 13, 2024
fc2e36d
Merge pull request #446 from kpdecker/6.0.0-staging
ExplodingCabbage Feb 13, 2024
5f9cd41
Sort out behaviour of newlineIsToken and ignoreWhitespace (#486)
ExplodingCabbage Feb 15, 2024
e83674b
Remove failing test (#487)
ExplodingCabbage Feb 15, 2024
a3e4812
Add some more exhaustive tests based on @Mingun's work (#488)
ExplodingCabbage Feb 15, 2024
13d9749
Fix the weird function signature of async callbacks (#490)
ExplodingCabbage Feb 15, 2024
bf45b03
Fix race conditions involving this.options being overwritten during e…
ExplodingCabbage Feb 15, 2024
a73b771
Add further assertion to test, as suggested by Mingun (#491)
ExplodingCabbage Feb 16, 2024
f38e47d
Support `Object.create(null)` in JSON diffing (#493)
danbeam Feb 19, 2024
3da78c2
Simplify tokenization logic in diffWords (#494)
ExplodingCabbage Feb 19, 2024
7a73dc1
Bump ip from 1.1.5 to 1.1.9 (#495)
dependabot[bot] Feb 21, 2024
f925d4c
Fix trivial typo ("threat"->"treat" in test name) (#498)
ExplodingCabbage Mar 5, 2024
045c346
Add test of how diffWordsWithSpace handles Windows-style newlines (#499)
ExplodingCabbage Mar 6, 2024
490f5ab
Make diffChars diff Unicode code points instead of UTF-16 code units …
ExplodingCabbage Mar 8, 2024
59161e0
Fix diffWords handling of whitespace (#497)
ExplodingCabbage Mar 11, 2024
f4f11df
Bump follow-redirects from 1.15.4 to 1.15.6 (#502)
dependabot[bot] Mar 18, 2024
c9bc8e3
Run: (#503)
ExplodingCabbage Mar 19, 2024
16a060e
Purge inactive/broken Travis and Sauce Labs integrations (#504)
ExplodingCabbage Mar 19, 2024
84b5c9e
Upgrade some more dev dependencies (#505)
ExplodingCabbage Mar 20, 2024
b90a3eb
Bump Mocha one major version (#507)
ExplodingCabbage Mar 20, 2024
64f587c
Always enable "strict mode" in parsePatch (#508)
ExplodingCabbage Mar 20, 2024
53339e2
Remove unused Grunt `version` task; flesh out docs on how to do a rel…
ExplodingCabbage Mar 20, 2024
eb73eb8
Remove style.css from master branch (it's part of the gh-pages site, …
ExplodingCabbage Mar 20, 2024
046b5d3
Bump webpack-dev-middleware from 7.0.0 to 7.1.1 (#511)
dependabot[bot] Mar 22, 2024
4ebc4bf
Bump express from 4.18.3 to 4.19.2 (#512)
dependabot[bot] Apr 28, 2024
c8a9cc5
Remove linedelimiters, improve handling of Windows vs Unix line endin…
ExplodingCabbage Jun 7, 2024
9bb34dc
Support `callback` in patch functions, not just diffFoo functions (#521)
ExplodingCabbage Jun 7, 2024
0126325
Add tests of existing broken parsePatch behaviour, to fix before next…
ExplodingCabbage Jun 14, 2024
5ecab06
Bump braces from 3.0.2 to 3.0.3 (#526)
dependabot[bot] Jun 18, 2024
323e8bb
Fix parse patch bug (#529)
ExplodingCabbage Jun 24, 2024
896c982
Fix release notes typo
ExplodingCabbage Jun 24, 2024
0e7c20c
Add ignoreNewlineAtEof (#530)
ExplodingCabbage Jun 24, 2024
353d117
Rewrite applyPatch (#533)
ExplodingCabbage Jul 26, 2024
939bb45
Fix handling of EOF in createPatch (#535)
ExplodingCabbage Jul 29, 2024
244df82
Fix more logic around newlines at EOF - this time stuff I recently br…
ExplodingCabbage Jul 29, 2024
4f0430a
Add Intl.Segmenter support (#539)
ExplodingCabbage Aug 1, 2024
91 changes: 64 additions & 27 deletions README.md
@@ -13,88 +13,103 @@ Based on the algorithm proposed in
npm install diff --save
```

## API
## Usage

* `Diff.diffChars(oldStr, newStr[, options])` - diffs two blocks of text, comparing character by character.
Broadly, jsdiff's diff functions all take an old text and a new text and perform three steps:

Returns a list of [change objects](#change-objects).
1. Split both texts into arrays of "tokens". What constitutes a token varies; in `diffChars`, each character is a token, while in `diffLines`, each line is a token.

Options
* `ignoreCase`: `true` to ignore casing difference. Defaults to `false`.
2. Find the smallest set of single-token *insertions* and *deletions* needed to transform the first array of tokens into the second.

This step depends upon having some notion of a token from the old array being "equal" to one from the new array, and this notion of equality affects the results. Usually two tokens are equal if `===` considers them equal, but some of the diff functions use an alternative notion of equality or have options to configure it. For instance, by default `diffChars("Foo", "FOOD")` will require two deletions (`o`, `o`) and three insertions (`O`, `O`, `D`), but `diffChars("Foo", "FOOD", {ignoreCase: true})` will require just one insertion (of a `D`), since `ignoreCase` causes `o` and `O` to be considered equal.

3. Return an array representing the transformation computed in the previous step as a series of [change objects](#change-objects). The array is ordered from the start of the input to the end, and each change object represents *inserting* one or more tokens, *deleting* one or more tokens, or *keeping* one or more tokens.
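The three steps above can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: `sketchDiff`, `chars`, `strict`, and `ignoreCase` are hypothetical names, and the sketch uses a textbook longest-common-subsequence table in place of the Myers algorithm that jsdiff is based on, so it shows the shape of the process rather than the library's implementation.

```javascript
// Hypothetical helper (not part of jsdiff): diff two texts using a textbook
// longest-common-subsequence (LCS) table instead of Myers's algorithm.
function sketchDiff(oldText, newText, tokenize, equals) {
  // Step 1: split both texts into arrays of tokens.
  const a = tokenize(oldText);
  const b = tokenize(newText);

  // Step 2: lcs[i][j] is the length of the LCS of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = equals(a[i], b[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Step 3: walk the table and emit change objects, merging neighbouring
  // tokens of the same kind into a single change object.
  const changes = [];
  const push = (value, added, removed) => {
    const last = changes[changes.length - 1];
    if (last && last.added === added && last.removed === removed) {
      last.value += value;
      last.count++;
    } else {
      changes.push({ value, added, removed, count: 1 });
    }
  };
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && equals(a[i], b[j])) {
      push(b[j], false, false); // kept token; value taken from the new text
      i++;
      j++;
    } else if (i < a.length && (j === b.length || lcs[i + 1][j] >= lcs[i][j + 1])) {
      push(a[i], false, true); // deletion
      i++;
    } else {
      push(b[j], true, false); // insertion
      j++;
    }
  }
  return changes;
}

// Character tokens, with two notions of equality.
const chars = (text) => Array.from(text);
const strict = (left, right) => left === right;
const ignoreCase = (left, right) => left.toLowerCase() === right.toLowerCase();

// Keeps "F", deletes "oo" (two tokens), inserts "OOD" (three tokens).
console.log(sketchDiff('Foo', 'FOOD', chars, strict));
// Keeps "FOO" and inserts "D".
console.log(sketchDiff('Foo', 'FOOD', chars, ignoreCase));
```

Swapping in a different `tokenize` or `equals` function changes the result without touching the core loop, which is the same separation of concerns the steps above describe.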

### API

* `Diff.diffWords(oldStr, newStr[, options])` - diffs two blocks of text, comparing word by word, ignoring whitespace.
* `Diff.diffChars(oldStr, newStr[, options])` - diffs two blocks of text, treating each character as a token.

Returns a list of [change objects](#change-objects).

Options
* `ignoreCase`: Same as in `diffChars`.
* `ignoreCase`: If `true`, the uppercase and lowercase forms of a character are considered equal. Defaults to `false`.

* `Diff.diffWordsWithSpace(oldStr, newStr[, options])` - diffs two blocks of text, comparing word by word, treating whitespace as significant.
* `Diff.diffWords(oldStr, newStr[, options])` - diffs two blocks of text, treating each word and each word separator (punctuation, newline, or run of whitespace) as a token.

(Whitespace-only tokens are automatically treated as equal to each other, so changes like changing a space to a newline or a run of multiple spaces will be ignored.)

Returns a list of [change objects](#change-objects).

* `Diff.diffLines(oldStr, newStr[, options])` - diffs two blocks of text, comparing line by line.
Options
* `ignoreCase`: Same as in `diffChars`. Defaults to `false`.

* `Diff.diffWordsWithSpace(oldStr, newStr[, options])` - same as `diffWords`, except whitespace-only tokens are not automatically considered equal, so e.g. changing a space to a tab is considered a change.

* `Diff.diffLines(oldStr, newStr[, options])` - diffs two blocks of text, treating each line as a token.

Options
* `ignoreWhitespace`: `true` to ignore leading and trailing whitespace. This is the same as `diffTrimmedLines`
* `stripTrailingCr`: `true` to remove all trailing CR (`\r`) characters before perfoming the diff.
* `ignoreWhitespace`: `true` to strip all leading and trailing whitespace characters from each line before performing the diff. Defaults to `false`.
* `stripTrailingCr`: `true` to remove all trailing CR (`\r`) characters before performing the diff. Defaults to `false`.
This helps to get a useful diff when diffing UNIX text files against Windows text files.
* `newlineIsToken`: `true` to treat newline characters as separate tokens. This allows for changes to the newline structure to occur independently of the line content and to be treated as such. In general this is the more human friendly form of `diffLines` and `diffLines` is better suited for patches and other computer friendly output.
* `newlineIsToken`: `true` to treat the newline character at the end of each line as its own token. This allows for changes to the newline structure to occur independently of the line content and to be treated as such. In general this is the more human friendly form of `diffLines`; the default behavior with this option turned off is better suited for patches and other computer friendly output. Defaults to `false`.

Returns a list of [change objects](#change-objects).

* `Diff.diffTrimmedLines(oldStr, newStr[, options])` - diffs two blocks of text, comparing line by line, ignoring leading and trailing whitespace.
* `Diff.diffTrimmedLines(oldStr, newStr[, options])` - diffs two blocks of text, comparing line by line, after stripping leading and trailing whitespace. Equivalent to calling `diffLines` with `ignoreWhitespace: true`.

Options
* `stripTrailingCr`: Same as in `diffLines`. Defaults to `false`.
* `newlineIsToken`: Same as in `diffLines`. Defaults to `false`.

Returns a list of [change objects](#change-objects).

* `Diff.diffSentences(oldStr, newStr[, options])` - diffs two blocks of text, comparing sentence by sentence.
* `Diff.diffSentences(oldStr, newStr[, options])` - diffs two blocks of text, treating each sentence as a token.

Returns a list of [change objects](#change-objects).

* `Diff.diffCss(oldStr, newStr[, options])` - diffs two blocks of text, comparing CSS tokens.

Returns a list of [change objects](#change-objects).

* `Diff.diffJson(oldObj, newObj[, options])` - diffs two JSON objects, comparing the fields defined on each. The order of fields, etc does not matter in this comparison.
* `Diff.diffJson(oldObj, newObj[, options])` - diffs two JSON-serializable objects by first serializing them to prettily-formatted JSON and then treating each line of the JSON as a token. Object properties are ordered alphabetically in the serialized JSON, so the order of properties in the objects being compared doesn't affect the result.

Returns a list of [change objects](#change-objects).

Options
* `stringifyReplacer`: A custom replacer function. Operates similarly to the `replacer` parameter to [`JSON.stringify()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#the_replacer_parameter), but must be a function.
* `undefinedReplacement`: A value to replace `undefined` with. Ignored if a `stringifyReplacer` is provided.

* `Diff.diffArrays(oldArr, newArr[, options])` - diffs two arrays, comparing each item for strict equality (===).
* `Diff.diffArrays(oldArr, newArr[, options])` - diffs two arrays of tokens, comparing each item for strict equality (===).

Options
* `comparator`: `function(left, right)` for custom equality checks

Returns a list of [change objects](#change-objects).

* `Diff.createTwoFilesPatch(oldFileName, newFileName, oldStr, newStr, oldHeader, newHeader)` - creates a unified diff patch.
* `Diff.createTwoFilesPatch(oldFileName, newFileName, oldStr, newStr[, oldHeader[, newHeader[, options]]])` - creates a unified diff patch by first computing a diff with `diffLines` and then serializing it to unified diff format.

Parameters:
* `oldFileName` : String to be output in the filename section of the patch for the removals
* `newFileName` : String to be output in the filename section of the patch for the additions
* `oldStr` : Original string value
* `newStr` : New string value
* `oldHeader` : Additional information to include in the old file header
* `newHeader` : Additional information to include in the new file header
* `oldHeader` : Optional additional information to include in the old file header. Default: `undefined`.
* `newHeader` : Optional additional information to include in the new file header. Default: `undefined`.
* `options` : An object with options.
- `context` describes how many lines of context should be included.
- `ignoreWhitespace`: `true` to ignore leading and trailing whitespace.
- `newlineIsToken`: `true` to treat newline characters as separate tokens. This allows for changes to the newline structure to occur independently of the line content and to be treated as such. In general this is the more human friendly form of `diffLines` and `diffLines` is better suited for patches and other computer friendly output.
- `stripTrailingCr`: `true` to remove all trailing CR (`\r`) characters before perfoming the diff.
This helps to get a useful diff when diffing UNIX text files against Windows text files.
- `ignoreWhitespace`: Same as in `diffLines`. Defaults to `false`.
- `stripTrailingCr`: Same as in `diffLines`. Defaults to `false`.
- `newlineIsToken`: Same as in `diffLines`. Defaults to `false`.

* `Diff.createPatch(fileName, oldStr, newStr, oldHeader, newHeader)` - creates a unified diff patch.
* `Diff.createPatch(fileName, oldStr, newStr[, oldHeader[, newHeader]])` - creates a unified diff patch.

Just like Diff.createTwoFilesPatch, but with oldFileName being equal to newFileName.

* `Diff.formatPatch(patch)` - creates a unified diff patch.

`patch` may be either a single structured patch object (as returned by `structuredPatch`) or an array of them (as returned by `parsePatch`).

* `Diff.structuredPatch(oldFileName, newFileName, oldStr, newStr, oldHeader, newHeader, options)` - returns an object with an array of hunk objects.
* `Diff.structuredPatch(oldFileName, newFileName, oldStr, newStr[, oldHeader[, newHeader[, options]]])` - returns an object with an array of hunk objects.

This method is similar to createTwoFilesPatch, but returns a data structure
suitable for further processing. Parameters are the same as createTwoFilesPatch. The data structure returned may look like this:
@@ -121,6 +136,8 @@ npm install diff --save

* `Diff.applyPatches(patch, options)` - applies one or more patches.

`patch` may be either an array of structured patch objects, or a string representing a patch in unified diff format (which may patch one or more files).

This method will iterate over the contents of the patch and apply to data provided through callbacks. The general flow for each patch index is:

- `options.loadFile(index, callback)` is called. The caller should then load the contents of the file and then pass that to the `callback(err, data)` callback. Passing an `err` will terminate further patch execution.
@@ -136,17 +153,37 @@ npm install diff --save

`patch` may be either a single structured patch object (as returned by `structuredPatch`) or an array of them (as returned by `parsePatch`).

* `convertChangesToXML(changes)` - converts a list of changes to a serialized XML format
* `Diff.convertChangesToXML(changes)` - converts a list of change objects to a serialized XML format

* `Diff.convertChangesToDMP(changes)` - converts a list of change objects to the format returned by Google's [diff-match-patch](https://github.com/google/diff-match-patch) library

All methods above which accept the optional `callback` method will run in sync mode when that parameter is omitted and in async mode when supplied. This allows for larger diffs without blocking the event loop. This may be passed either directly as the final parameter or as the `callback` field in the `options` object.
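The calling convention described in the paragraph above can be sketched as follows. `withOptionalCallback` and `countChars` are hypothetical names used only for illustration, and the sketch defers the whole computation in a single step, which is a simplification; it does not claim to reproduce how jsdiff schedules its work.

```javascript
// Hypothetical wrapper illustrating the calling convention only; `compute`
// stands in for whatever work a real diff function would do.
function withOptionalCallback(compute, options) {
  let callback;
  if (typeof options === 'function') {
    // The callback was passed directly as the final parameter.
    callback = options;
    options = {};
  } else {
    options = options || {};
    callback = options.callback;
  }

  if (!callback) {
    // Sync mode: compute and return the result immediately.
    return compute(options);
  }

  // Async mode: run on a later turn of the event loop and report the result
  // through the callback instead of the return value.
  setTimeout(() => callback(compute(options)), 0);
  return undefined;
}

const countChars = () => 'abc'.length;

// Sync: the result is the return value.
console.log(withOptionalCallback(countChars));
// Async: callback passed directly as the final parameter.
withOptionalCallback(countChars, (result) => console.log(result));
// Async: callback passed as the `callback` field of the options object.
withOptionalCallback(countChars, { callback: (result) => console.log(result) });
```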

### Defining custom diffing behaviors

If you need behavior a little different to what any of the text diffing functions above offer, you can roll your own by customizing both the tokenization behavior used and the notion of equality used to determine if two tokens are equal.

The simplest way to customize tokenization behavior is to simply tokenize the texts you want to diff yourself, with your own code, then pass the arrays of tokens to `diffArrays`. For instance, if you wanted a semantically-aware diff of some code, you could try tokenizing it using a parser specific to the programming language the code is in, then passing the arrays of tokens to `diffArrays`.

To customize the notion of token equality used, use the `comparator` option to `diffArrays`.

For even more customization of the diffing behavior, you can create a `new Diff.Diff()` object, overwrite its `castInput`, `tokenize`, `removeEmpty`, `equals`, and `join` properties with your own functions, then call its `diff(oldString, newString[, options])` method. The methods you can overwrite are used as follows:

* `castInput(value)`: used to transform the `oldString` and `newString` before any other steps in the diffing algorithm happen. For instance, `diffJson` uses `castInput` to serialize the objects being diffed to JSON. Defaults to a no-op.
* `tokenize(value)`: used to convert each of `oldString` and `newString` (after they've gone through `castInput`) to an array of tokens. Defaults to returning `value.split('')` (returning an array of individual characters).
* `removeEmpty(array)`: called on the arrays of tokens returned by `tokenize` and can be used to modify them. Defaults to stripping out falsey tokens, such as empty strings. `diffArrays` overrides this to simply return the `array`, which means that falsey values like empty strings can be handled like any other token by `diffArrays`.
* `equals(left, right)`: called to determine if two tokens (one from the old string, one from the new string) should be considered equal. Defaults to comparing them with `===`.
* `join(tokens)`: gets called with an array of consecutive tokens that have either all been added, all been removed, or are all common. Needs to join them into a single value that can be used as the `value` property of the [change object](#change-objects) for these tokens. Defaults to simply returning `tokens.join('')`.
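To make the order in which these five hooks are used concrete, here is a minimal stand-alone sketch. `runWithHooks`, `defaultHooks`, and `wordHooks` are hypothetical names, the driver is not jsdiff's `Diff` class, and it uses a simple LCS table for brevity; only the hook names and their described defaults are taken from the list above.

```javascript
// Hypothetical mini-driver, NOT jsdiff's Diff class: it only mirrors the
// order in which the five hooks are described as being used.
function runWithHooks(hooks, oldInput, newInput) {
  const prepare = (input) => hooks.removeEmpty(hooks.tokenize(hooks.castInput(input)));
  const a = prepare(oldInput);
  const b = prepare(newInput);

  // Plain LCS table (an assumption made for brevity).
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = hooks.equals(a[i], b[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Group consecutive tokens of the same kind, then join() each group.
  const groups = [];
  const add = (token, added, removed) => {
    const last = groups[groups.length - 1];
    if (last && last.added === added && last.removed === removed) {
      last.tokens.push(token);
    } else {
      groups.push({ tokens: [token], added, removed });
    }
  };
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && hooks.equals(a[i], b[j])) {
      add(b[j], false, false); // common tokens take their value from the new input
      i++;
      j++;
    } else if (i < a.length && (j === b.length || lcs[i + 1][j] >= lcs[i][j + 1])) {
      add(a[i], false, true);
      i++;
    } else {
      add(b[j], true, false);
      j++;
    }
  }
  return groups.map((g) => ({
    value: hooks.join(g.tokens),
    added: g.added,
    removed: g.removed,
    count: g.tokens.length
  }));
}

// Defaults as described in the list above.
const defaultHooks = {
  castInput: (value) => value,
  tokenize: (value) => value.split(''),
  removeEmpty: (array) => array.filter(Boolean),
  equals: (left, right) => left === right,
  join: (tokens) => tokens.join('')
};

// Example override: words and whitespace runs as tokens, with all
// whitespace-only tokens treated as equal to each other.
const isSpace = (token) => /^\s+$/.test(token);
const wordHooks = {
  ...defaultHooks,
  tokenize: (value) => value.split(/(\s+)/),
  equals: (left, right) => left === right || (isSpace(left) && isSpace(right))
};

console.log(runWithHooks(wordHooks, 'the quick fox', 'the  slow fox'));
```

In this run the doubled space is not reported as a change, because the overridden `equals` treats any two whitespace-only tokens as equal; the common change object carries the whitespace from the new input.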

### Change Objects
Many of the methods above return change objects. These objects consist of the following fields:

* `value`: Text content
* `value`: The concatenated content of all the tokens represented by this change object - i.e. generally the text that is either added, deleted, or common, as a single string. In cases where tokens are considered common but are non-identical (e.g. because an option like `ignoreCase` or a custom `comparator` was used), the value from the *new* string will be provided here.
* `added`: true if the value was inserted into the new string, otherwise false
* `removed`: true if the value was removed from the old string, otherwise false
* `count`: How many tokens (e.g. chars for `diffChars`, lines for `diffLines`) the value in the change object consists of

(Change objects where `added` and `removed` are both false represent content that is common to the old and new strings.)
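As a small illustration of how these fields can be consumed, the sketch below works on a hand-written array of change objects (illustrative data shaped as described above, not captured from a real jsdiff run) and rebuilds both texts from it. The variable names are arbitrary.

```javascript
// Hand-written change objects for "Foo" -> "FOOD" (illustrative data only).
const changes = [
  { value: 'F', added: false, removed: false, count: 1 },
  { value: 'oo', added: false, removed: true, count: 2 },
  { value: 'OOD', added: true, removed: false, count: 3 }
];

// Everything that was not inserted belongs to the old text...
const oldText = changes.filter((c) => !c.added).map((c) => c.value).join('');
// ...and everything that was not removed belongs to the new text.
const newText = changes.filter((c) => !c.removed).map((c) => c.value).join('');

// Render an inline summary: [-removed-], {+added+}, common text as-is.
const inline = changes
  .map((c) => (c.added ? `{+${c.value}+}` : c.removed ? `[-${c.value}-]` : c.value))
  .join('');

console.log(oldText); // Foo
console.log(newText); // FOOD
console.log(inline);  // F[-oo-]{+OOD+}
```

Note that rebuilding the old text this way is only exact when common tokens are identical in both inputs; as stated above, when an option like `ignoreCase` makes non-identical tokens count as common, `value` holds the text from the new string.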

## Examples
