[fix] str.split can not handle surrogate pair, replaced with Array.from #395

luoxzhg · 2023-03-02T07:17:23Z

example

> Array.from("\n𝐀")
[ '\n', '𝐀' ]

> "\n𝐀".split('')
[ '\n', '\ud835', '\udc00' ]
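A minimal sketch of the same difference as a standalone script, in case it helps reviewers reproduce it. The helper names `splitByCodeUnit` and `splitByCodePoint` are hypothetical and are not part of jsdiff; they only wrap the two built-in calls shown above.

```javascript
// Hypothetical helpers for illustration only; not jsdiff APIs.
const text = "\n\u{1D400}"; // line feed followed by U+1D400 (𝐀), which lies outside the BMP

// String.prototype.split('') cuts between UTF-16 code units,
// so a surrogate pair is broken into two lone surrogates.
function splitByCodeUnit(str) {
  return str.split('');
}

// Array.from uses the string iterator, which walks Unicode code points,
// so a surrogate pair stays together as one element.
function splitByCodePoint(str) {
  return Array.from(str);
}

const units = splitByCodeUnit(text);
const points = splitByCodePoint(text);

console.log(units.length);  // 3
console.log(points.length); // 2

// Joining either array reproduces the original string, so the results
// differ only in where the token boundaries fall.
console.log(points.join('') === text); // true
```

Note that code points are still not user-perceived characters: a base letter followed by a combining mark would be two elements under either approach.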

ExplodingCabbage · 2023-12-15T16:28:52Z

Hmm, interesting. I fundamentally agree with the proposed change; operating on Unicode code points instead of UTF-16 code units is the correct default behaviour (and frankly no code should operate on UTF-16 code units unless it's a library explicitly for dealing with UTF-16). However, I don't want to merge any change that modifies the results that jsdiff emits without adding unit tests, adding release notes, and doing a major version number bump. I also want to carefully audit the code line by line to make sure there's nowhere else where we're similarly treating strings as sequences of UTF-16 code units instead of Unicode code points.

I want to churn through some of the more straightforward-to-handle issues and PRs before doing the above - but I do intend to return to this PR in due course!

ExplodingCabbage · 2024-03-08T16:47:53Z

Adding docs and stuff over at #500

[fix] str.split can not handle surrogate pair, replaced with Array.from

c79548e

ExplodingCabbage added the breaking-change label Dec 18, 2023

ExplodingCabbage mentioned this pull request Jan 10, 2024

Does jsdiff work with Chinese (or other non-English script languages)? #377

Closed

ExplodingCabbage mentioned this pull request Mar 8, 2024

Make diffChars diff Unicode code points instead of UTF-16 code units #500

Merged

ExplodingCabbage closed this Mar 8, 2024
