Add JSON_UNESCAPED_LINE_TERMINATORS
sirreal committed May 9, 2024
commit 3e301171b2b54cb8cf137ba9e760cae9c0eddb33
@@ -169,7 +169,7 @@ public function print_client_interactivity_data() {
if ( ! empty( $interactivity_data ) ) {
$json_encode_flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES;
if ( 'UTF-8' === get_option( 'blog_charset' ) ) {
Member

Sadly, this check is insufficient: there are common case variants, and the hyphen may not be present. The common example is below, but this pattern unnerves me enough that I'm currently preparing a patch to WordPress to add a new semantic check.

$charset = get_option( 'blog_charset' );
if ( in_array( $charset, array( 'utf8', 'utf-8', 'UTF8' ), true ) ) {
	$charset = 'UTF-8';
}
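
A minimal sketch of what a case-insensitive, hyphen-tolerant check could look like, for illustration only. The function name `sketch_is_utf8_label()` is hypothetical and not part of WordPress or of this PR; it assumes that only the labels `utf-8` and `utf8`, in any letter case, should count as UTF-8.

```php
<?php
/**
 * Hypothetical helper (not a WordPress API): reports whether a charset
 * label names UTF-8, ignoring letter case and an optional hyphen.
 *
 * @param mixed $label Charset label, e.g. the value of the blog_charset option.
 * @return bool True for "utf-8"/"utf8" in any letter case, false otherwise.
 */
function sketch_is_utf8_label( $label ) {
	// Anything that is not a string cannot be a charset label.
	if ( ! is_string( $label ) ) {
		return false;
	}

	// Lowercase once, then compare against the two accepted spellings.
	$normalized = strtolower( trim( $label ) );

	return 'utf-8' === $normalized || 'utf8' === $normalized;
}
```

With a helper of this shape, the strict `'UTF-8' === get_option( 'blog_charset' )` comparison would no longer depend on how the option happens to be spelled.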

Member Author

Thanks! It looks like the _canonical_charset() function (intended to normalize to "UTF-8") is applied as a filter by default, so maybe this is sufficient as is? Your proposed patch would make that filter more robust without any changes here:

add_filter( 'option_blog_charset', '_canonical_charset' );

If the encoding is UTF-8 and we fail to detect it, it will do little harm. The only issue is that valid Unicode would be escaped to its \u1234 form.
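
To make the trade-off concrete, here is a small standalone sketch of how the flags affect `json_encode()` output. The sample data is made up for illustration, and `JSON_UNESCAPED_LINE_TERMINATORS` assumes PHP 7.1 or later.

```php
<?php
// Made-up sample data: one accented character and one U+2028 LINE SEPARATOR.
$data = array(
	'greeting' => "caf\u{00E9}",
	'sep'      => "a\u{2028}b",
);

$base_flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES;

// Without JSON_UNESCAPED_UNICODE, non-ASCII characters become \uXXXX escapes.
$escaped = json_encode( $data, $base_flags );

// With JSON_UNESCAPED_UNICODE, "é" is emitted as-is, but U+2028 and U+2029
// are still escaped unless JSON_UNESCAPED_LINE_TERMINATORS is also set.
$unescaped = json_encode( $data, $base_flags | JSON_UNESCAPED_UNICODE );

// With both flags, the line separator is emitted raw as well.
$raw = json_encode(
	$data,
	$base_flags | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS
);

// All three strings decode back to the same data; only the byte size differs.
$same = json_decode( $escaped, true ) === json_decode( $raw, true );
```

The escaped form is larger but decodes to identical values, which is why falling back to it when the charset is not recognized as UTF-8 costs little.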

- $json_encode_flags |= JSON_UNESCAPED_UNICODE;
+ $json_encode_flags |= JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS;
}

wp_print_inline_script_tag(