sirreal (Member) commented Jan 20, 2025

Fixes #62797 by correctly escaping JSON included inside JavaScript in a script tag.

wp_add_inline_script correctly escapes closing script tags (</script>), which prevents script contents from breaking out of the script. However, it does not handle cases where some content enters the HTML script data double escaped state. In that state the real closing tag no longer closes the script as expected, leaving a broken document in which the script tag remains open.
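
To make the failure mode concrete, here is a minimal sketch of how the script data states decide where a script element ends. It is a simplified model written for illustration, not a full HTML tokenizer. The function name `findScriptEnd` and the sample payloads are hypothetical and are not the exact reproduction from #62797.

```javascript
// Simplified model of the HTML tokenizer's script data states.
// `text` is everything that follows an opening <script> tag.
// Returns the index where the closing tag begins, or -1 if the script never closes.
function findScriptEnd(text) {
  // Case-insensitive match of `needle` at position `i`.
  const at = (needle, i) =>
    text.slice(i, i + needle.length).toLowerCase() === needle;
  // A tag name only counts when followed by whitespace, "/" or ">".
  const endsTagName = (ch) => ch !== undefined && " \t\n\f\r/>".includes(ch);

  let state = "unescaped";
  let i = 0;
  while (i < text.length) {
    if (at("-->", i)) {
      // "-->" always returns to the plain script data state.
      state = "unescaped";
      i += 3;
    } else if (at("<!--", i)) {
      // "<!--" enters the escaped state unless already escaped.
      if (state === "unescaped") state = "escaped";
      i += 2; // keep the dashes so "<!-->" can still match "-->"
    } else if (at("</script", i) && endsTagName(text[i + 8])) {
      // In the double escaped state the closer only steps back to escaped.
      if (state !== "double-escaped") return i;
      state = "escaped";
      i += 8;
    } else if (state === "escaped" && at("<script", i) && endsTagName(text[i + 7])) {
      // An opening script tag inside the escaped state double escapes.
      state = "double-escaped";
      i += 7;
    } else {
      i += 1;
    }
  }
  return -1;
}

// Hypothetical payloads: the same data with raw and with escaped "<" and ">".
const closes = 'console.log( "ok" );</script>';
const broken = 'wp.preload( {"content":"<!-- <script>"} );</script>';
const fixed = 'wp.preload( {"content":"\\u003C!-- \\u003Cscript\\u003E"} );</script>';

console.log(findScriptEnd(closes) === closes.indexOf("</script>"));
console.log(findScriptEnd(broken)); // the intended closer is swallowed
console.log(findScriptEnd(fixed) === fixed.indexOf("</script>"));
```

In the `broken` payload the `<!--` and `<script>` inside the JSON string move the parser into the double escaped state, so the intended `</script>` only steps back to the escaped state and the element stays open.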

This can be tested by applying the patch and following the reproduction steps in #62797.

Trac ticket: https://core.trac.wordpress.org/ticket/62797


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

sirreal marked this pull request as ready for review January 20, 2025 10:56

github-actions bot commented Jan 20, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, bernhard-reiter, dmsnell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

sirreal force-pushed the fix/62797-api-fetch-preload-json-encode branch from 4089bdc to 7defd55 May 30, 2025 08:42
sirreal marked this pull request as draft May 30, 2025 11:02

sirreal (Member, Author) commented Jun 6, 2025

I plan to add tests to this using assertEqualMarkup when it lands in #8882.

sirreal (Member, Author) commented Jun 6, 2025

I'm concerned this pattern is very common. I wonder if there should be some dedicated functions.

Here are some examples of this exact pattern in the site editor that are susceptible to the same problem:

wp_add_inline_script(
	'wp-edit-site',
	sprintf(
		'wp.domReady( function() {
			wp.editSite.initializeEditor( "site-editor", %s );
		} );',
		wp_json_encode( $editor_settings )
	)
);

// Preload server-registered block schemas.
wp_add_inline_script(
	'wp-blocks',
	'wp.blocks.unstable__bootstrapServerSideBlockDefinitions(' . wp_json_encode( get_block_editor_server_block_settings() ) . ');'
);

// Preload server-registered block bindings sources.
$registered_sources = get_all_registered_block_bindings_sources();
if ( ! empty( $registered_sources ) ) {
	$filtered_sources = array();
	foreach ( $registered_sources as $source ) {
		$filtered_sources[] = array(
			'name'        => $source->name,
			'label'       => $source->label,
			'usesContext' => $source->uses_context,
		);
	}
	$script = sprintf( 'for ( const source of %s ) { wp.blocks.registerBlockBindingsSource( source ); }', wp_json_encode( $filtered_sources ) );
	wp_add_inline_script(
		'wp-blocks',
		$script
	);
}

wp_add_inline_script(
	'wp-blocks',
	sprintf( 'wp.blocks.setCategories( %s );', wp_json_encode( isset( $editor_settings['blockCategories'] ) ? $editor_settings['blockCategories'] : array() ) ),
	'after'
);

dmsnell (Member) commented Jun 6, 2025

> I'm concerned this pattern is very common. I wonder if there should be some dedicated functions.

What’s your concern? that it will contain </script> tag closers?

sirreal (Member, Author) commented Jun 9, 2025

> What’s your concern? that it will contain </script> tag closers?

That's the problem here that may produce a broken page. I'm unaware of others, but wp_add_inline_script + wp_json_encode seem common to use together but risky with the default arguments.

sirreal force-pushed the fix/62797-api-fetch-preload-json-encode branch 3 times, most recently from 858012e to 83f2c7e June 11, 2025 13:39

sirreal (Member, Author) commented Jun 11, 2025

I've pushed a test leveraging the new assertEqualHTML from #8882 with this fix reverted. That test will fail on CI, then I'll push the fix again.

The failure appears as Error: Paused at incomplete token. because the <script> tag remains unclosed.

sirreal (Member, Author) commented Jun 11, 2025

Example failures on CI can be observed here:

1) Tests_Blocks_Editor::test_preload_closes_script_tag
Error: Paused at incomplete token.

(I'm updating the test name after the CI run, but the result is the same).

sirreal requested review from dmsnell and ockham June 11, 2025 13:52
sirreal marked this pull request as ready for review June 11, 2025 13:52

dmsnell (Member) commented Jun 17, 2025

> That's the problem here that may produce a broken page. I'm unaware of others, but wp_add_inline_script + wp_json_encode seem common to use together but risky with the default arguments.

Could we perhaps create a new Issue or document where we can start enumerating the situations in which this can go bad? perhaps a new test suite with no new code, in a PR.

I find that it’s hard for me to grasp unless it’s a particularly good day and I have my full concentration.

There are differences, aren’t there, between printing these three situations?

<script type="application/json">
	{ "some": "json" }
</script>

<script>
	const data = { "some": "json" };
</script>

<script>
	callMeMaybe( { "some": "json" } );
</script>

I thought these were distinct cases with different edges but I can’t remember how.

sirreal force-pushed the fix/62797-api-fetch-preload-json-encode branch 2 times, most recently from 4322bb0 to 85bccee July 22, 2025 16:38

sirreal (Member, Author) commented Jul 23, 2025

> Could we perhaps create a new Issue or document where we can start enumerating the situations in which this can go bad? perhaps a new test suite with no new code, in a PR.

Will you describe the goal of that? Is this to find a more general fix or to explore alternatives?

> There are differences, aren’t there, between printing these three situations?

I don't think there are differences today.

I did a deep dive and there used to be a difference. It was resolved by the "JavaScript is a superset of JSON" proposal, which landed in ES2019. Platform support seems ubiquitous.

Note that this is only relevant when using JSON_UNESCAPED_UNICODE (and JSON_UNESCAPED_LINE_TERMINATORS in PHP>=7.1), otherwise the relevant characters are always escaped.

More details and demo

In short, there were situations where serializing JSON (to read as JSON) and serializing JSON (inside of a JavaScript context) needed to be treated slightly differently.

In browsers without support for the ES2019 "JavaScript is a superset of JSON" change (I tried Chrome 65, the latest Chrome without support), strings containing U+2028 or U+2029 are a problem: JSON allows those characters unescaped in a string, but JavaScript did not.

This demo shows the differences (only in older browsers, e.g. Chrome <= 65).

This code will error with SyntaxError: Invalid or unexpected token because of the unescaped U+2028 and U+2029 that were invalid in JavaScript strings prior to es2019:

<body>
<script defer>
const x = <?php echo json_encode(
        [ 'test' => "x\u{2028}\u{2029}y\n" ],
        JSON_UNESCAPED_SLASHES | JSON_HEX_TAG | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS
); ?>;
// Print each character's code point in hex.
document.body.textContent = [...x.test].map(c => c.codePointAt(0).toString(16) ).join(' ');
</script>

This, however, is fine because the characters are not evaluated as JavaScript. They're read as JSON text, where they've always been allowed:

<body>
<script type="application/json" id="x">
<?php echo json_encode(
        [ 'test' => "x\u{2028}\u{2029}y\n" ],
        JSON_UNESCAPED_SLASHES | JSON_HEX_TAG | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS
); ?>
</script>
<script defer>
const x = JSON.parse( document.getElementById( 'x' ).textContent );
// Print each character's code point in hex.
document.body.textContent = [...x.test].map(c => c.codePointAt(0).toString(16) ).join(' ');
</script>

Again, in today's JavaScript all of these cases are fine. And this only applies when certain JSON flags are applied that are not used here.
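
The same comparison can be sketched without PHP or a browser. This is a minimal check, assuming an ES2019-capable engine such as a current Node.js release. The variable names are illustrative, and `new Function` is used only to force the text to be parsed as JavaScript source.

```javascript
// JSON text containing raw (unescaped) U+2028 and U+2029 inside a string,
// as json_encode would emit with JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS.
const json = '{"test":"x\u2028\u2029y"}';

// Read as JSON: these characters have always been allowed in strings.
const parsed = JSON.parse(json);

// Read as JavaScript source: valid since ES2019, a SyntaxError before that.
let evaluated = null;
try {
  evaluated = new Function("return " + json + ";")();
} catch (error) {
  // Engines without "JavaScript is a superset of JSON" support land here.
}

// Code points of the decoded string, in hex.
const hex = [...parsed.test].map((c) => c.codePointAt(0).toString(16));
console.log(hex.join(" "));
console.log(evaluated !== null && evaluated.test === parsed.test);
```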

In all the situations you list, we have the same requirements for the JSON:

  • JSON must be valid for interpretation in JavaScript.
  • JSON must not close the script tag.
  • JSON must not modify the script escaping state (it must avoid entering the script data double escaped state) so that the </script> close tag closes the script as intended.

I believe the JSON_HEX_TAG | JSON_UNESCAPED_SLASHES flags proposed here cover all of these. In HTML parsing, when a <script> tag is opened, the parser transitions into the script data state. The only ways to exit that state are a U+003C LESS-THAN SIGN (<) or end-of-file. Escaping < (and >, although this is not strictly necessary) with JSON_HEX_TAG should therefore be sufficient to prevent the script tag from being closed and to prevent a transition into any of the script data escaped states.

From the JavaScript/JSON perspective, the emitted JSON should be valid. The question of line terminators is not a concern today, and the relevant flags are not proposed in this PR.
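
As a rough JavaScript analogue of those requirements, the sketch below encodes a value so that no `<` or `>` survives in the output. The helper name `jsonForScript` is hypothetical and is not part of this PR. It assumes the value is JSON-serializable, and it relies on the fact that `<` and `>` can only occur inside string values in JSON text, so a global replacement cannot damage the structure.

```javascript
// Encode a value as JSON that is safe to print inside a <script> element.
// Mirrors the effect of PHP's JSON_HEX_TAG: "<" becomes \u003C and ">" becomes \u003E.
// JSON.stringify already leaves "/" unescaped, similar to JSON_UNESCAPED_SLASHES.
function jsonForScript(value) {
  return JSON.stringify(value)
    .replace(/</g, "\\u003C")
    .replace(/>/g, "\\u003E");
}

// Hypothetical data that would otherwise disturb the script data states.
const data = { content: '<!-- <script>alert("hi")</script> -->' };
const encoded = jsonForScript(data);

console.log(encoded);
// The escapes are ordinary JSON string escapes, so the data round-trips unchanged.
console.log(JSON.parse(encoded).content === data.content);
```

Because the parser never sees a `<`, it cannot leave the script data state from inside the data. The same output works in all three situations listed earlier: JSON script tags, assignments, and function call arguments.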

dmsnell (Member) commented Jul 24, 2025

This is really helpful, @sirreal — thank you for verifying with the older browser.

sirreal (Member, Author) commented Aug 13, 2025

There's one additional wrinkle I considered: XHTML. None of the XHTML-related behavior changes in this PR.


XHTML is likely not used today and it would be good to remove support (https://core.trac.wordpress.org/ticket/59883).

The good news is that themes that don't declare HTML5 script support (potentially "supporting XHTML") should work correctly with the encoding as proposed thanks to the CDATA wrappers used.

Also note that this does not change with this PR as the escaping of & remains unchanged. Consider the string <>& (where the characters < and & appear to be invalid):

Invalid XHTML (< and & both problematic):

<?php header('Content-Type: application/xhtml+xml; charset=UTF-8'); ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
	<body>
		<script>
			alert('Invalid: <>&');
		</script>
	</body>
</html>

The following would be valid because < and & are hex escaped. Producing this from JSON encoding would require JSON_HEX_AMP. This is not required for HTML5, only for XHTML support without CDATA:

<script>
	alert('escaped: \x3C>\x26');
</script>
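
One detail worth noting about the escape forms: the snippet above uses JavaScript's `\x3C` hex escapes, while PHP's JSON_HEX_TAG and JSON_HEX_AMP emit the `\u003C` form. Both decode to the same characters in JavaScript, but only the `\u` form is valid JSON. A small check, with illustrative variable names:

```javascript
// JavaScript string literals accept both escape forms.
const jsHex = "\x3C>\x26";
const jsUnicode = "\u003C\u003E\u0026";

// JSON accepts the \uXXXX form, which is what json_encode emits.
const viaJson = JSON.parse('"\\u003C\\u003E\\u0026"');

// JSON has no \x escape, so this text is rejected by JSON.parse.
let jsonAcceptsHexEscape = true;
try {
  JSON.parse('"\\x3C"');
} catch (error) {
  jsonAcceptsHexEscape = false;
}

console.log(jsHex === jsUnicode && jsUnicode === viaJson);
console.log(jsonAcceptsHexEscape);
```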

WordPress will wrap inline script tag contents with CDATA if HTML5 support is not declared. That looks like this (with the proposed <> hex escape). This is also valid XHTML and the escaping is correct:

<script>
	/* <![CDATA[ */
	alert('CDATA escaped: \u003C\u003E&');
	/* ]]> */
</script>

sirreal (Member, Author) commented Aug 13, 2025

I'd like to move ahead with this PR. I believe it's the correct behavior, and the same change should apply to many other places throughout Core where data is JSON encoded for use in script tags.

This test provides an example where certain HTML embedded inside of JSON may prevent the script tag from closing.
The test for removing redundant leading slashes is likely redundant itself.
It checks that something that does not require escaping is not escaped.
sirreal added 11 commits August 19, 2025 17:52
This test is correctly checking that a leading forward "/" slash is
prepended to rest routes.
These flags should be safe to use in all contexts.
This reverts commit d630c7d5610ab5ba91e0c304776add3d8daf40ec.
This reverts commit 97edc8bf6a09d526cd715ce6e7c54fdbc8001e59.
Use data that would fail with any dangerous tags
Do not depend on unescaped slashes flag.
sirreal requested a review from dmsnell August 19, 2025 15:59
sirreal force-pushed the fix/62797-api-fetch-preload-json-encode branch from 85bccee to 1b41f43 August 19, 2025 15:59
dmsnell (Member) left a comment

Thanks for the updates and extra verification @sirreal.

sirreal removed the request for review from ockham August 19, 2025 18:41
pento pushed a commit that referenced this pull request Aug 19, 2025
Adds the appropriate JSON flags to `wp_json_encode()` to safely encode data for use in script tags.

Developed in #8145.

Props jonsurrell, bernhard-reiter, dmsnell, artpi, ankitkumarshah, abcd95, dilipbheda, sainathpoojary, shanemuir.
Fixes #62797.


git-svn-id: https://develop.svn.wordpress.org/trunk@60648 602fd350-edb4-49c9-b593-d223f7449a82
github-actions bot commented

A commit was made that fixes the Trac ticket referenced in the description of this pull request.

SVN changeset: 60648
GitHub commit: be2b79e

This PR will be closed, but please confirm the accuracy of this and reopen if there is more work to be done.

github-actions bot closed this Aug 19, 2025
sirreal deleted the fix/62797-api-fetch-preload-json-encode branch August 19, 2025 18:51
markjaquith pushed a commit to markjaquith/WordPress that referenced this pull request Aug 19, 2025
Adds the appropriate JSON flags to `wp_json_encode()` to safely encode data for use in script tags.

Developed in WordPress/wordpress-develop#8145.

Props jonsurrell, bernhard-reiter, dmsnell, artpi, ankitkumarshah, abcd95, dilipbheda, sainathpoojary, shanemuir.
Fixes #62797.

Built from https://develop.svn.wordpress.org/trunk@60648


git-svn-id: http://core.svn.wordpress.org/trunk@59984 1a063a9b-81f0-0310-95a4-ce76da25c4cd
github-actions bot pushed a commit to platformsh/wordpress-performance that referenced this pull request Aug 19, 2025
Adds the appropriate JSON flags to `wp_json_encode()` to safely encode data for use in script tags.

Developed in WordPress/wordpress-develop#8145.

Props jonsurrell, bernhard-reiter, dmsnell, artpi, ankitkumarshah, abcd95, dilipbheda, sainathpoojary, shanemuir.
Fixes #62797.

Built from https://develop.svn.wordpress.org/trunk@60648


git-svn-id: https://core.svn.wordpress.org/trunk@59984 1a063a9b-81f0-0310-95a4-ce76da25c4cd
jonnynews pushed a commit to spacedmonkey/wordpress-develop that referenced this pull request Aug 22, 2025
Adds the appropriate JSON flags to `wp_json_encode()` to safely encode data for use in script tags.

Developed in WordPress#8145.

Props jonsurrell, bernhard-reiter, dmsnell, artpi, ankitkumarshah, abcd95, dilipbheda, sainathpoojary, shanemuir.
Fixes #62797.


git-svn-id: https://develop.svn.wordpress.org/trunk@60648 602fd350-edb4-49c9-b593-d223f7449a82