Contributor

@ockham ockham commented Sep 25, 2025

What?

If the caption attribute is empty and bound to a block bindings source, and if that source returns an empty value, remove the <figcaption> element from the block markup upon render.

Why?

In #71483, we started adding the <figcaption> even if the caption attribute was empty, as long as it was bound to a block bindings source. This allows the <figcaption> to be populated with the value from the block bindings source during render.

However, we don't want the <figcaption> to be rendered if the block bindings source returns an empty value. (An empty <figcaption> can create unwanted whitespace below an image, as reported by @justintadlock, for example.)

How?

The HTML API does not yet offer functionality to remove the "outer HTML" of a given tag (i.e. the tag opener followed by all tokens up until and including the matching tag closer). However, we can furnish this functionality by keeping a separate $output buffer to copy each serialized token we encounter as we traverse the markup. (h/t @dmsnell for this tip)
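
The output-buffer idea can be sketched in plain PHP, outside of the HTML API. This is only an illustration under stated assumptions: the markup is assumed to be already split into serialized tokens (the token arrays below are hand-written, whereas real code would obtain tokens from the processor), the function name `example_copy_without_empty_figcaption()` is hypothetical, and only a bare `<figcaption>` opener immediately followed by its closer is treated as empty.

```php
<?php
/**
 * Copies tokens into an output buffer, skipping an empty <figcaption> element.
 *
 * Hypothetical helper for illustration; it is not part of the HTML API.
 *
 * @param string[] $tokens Serialized tokens in document order.
 * @return string The rebuilt markup.
 */
function example_copy_without_empty_figcaption( array $tokens ): string {
	$output = '';
	$count  = count( $tokens );

	for ( $i = 0; $i < $count; $i++ ) {
		$is_empty_figcaption = '<figcaption>' === $tokens[ $i ]
			&& isset( $tokens[ $i + 1 ] )
			&& '</figcaption>' === $tokens[ $i + 1 ];

		if ( $is_empty_figcaption ) {
			// Skip the opener here; the loop increment then skips the closer.
			$i++;
			continue;
		}

		// Every other token is copied to the buffer unchanged.
		$output .= $tokens[ $i ];
	}

	return $output;
}

$empty_caption = array( '<figure>', '<img src="a.png" alt=""/>', '<figcaption>', '</figcaption>', '</figure>' );
$with_caption  = array( '<figure>', '<img src="a.png" alt=""/>', '<figcaption>', 'Hello', '</figcaption>', '</figure>' );

echo example_copy_without_empty_figcaption( $empty_caption ), "\n";
echo example_copy_without_empty_figcaption( $with_caption ), "\n";
```

A caption that contains text is copied through untouched, because its opener is not directly followed by the closer.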

Testing Instructions

  • Create a new post and insert an image block.
  • Select an image for the block to display.
  • Connect the block’s caption attribute to the Post Data source, and select the Publish Date field there.
  • Publish and view the post. It should display the publish date as the image caption. In the page source, verify that it’s a <figcaption> element.
  • Edit the post in the Code Editor. Change the block bindings source to a field that doesn’t exist, e.g. title.
  • Save the post, and view it again.
  • Verify that no caption is shown.
  • In the page source, verify that there’s no empty <figcaption>.

Screencast

[Screencast: figcaption-removal]

@ockham ockham self-assigned this Sep 25, 2025
@ockham ockham added the [Type] Enhancement, [Block] Image, and [Feature] Block bindings labels Sep 25, 2025

github-actions bot commented Sep 25, 2025

Flaky tests detected in 213be74.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/18155751268

@ockham ockham marked this pull request as ready for review September 25, 2025 15:15

github-actions bot commented Sep 25, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: ockham <[email protected]>
Co-authored-by: justintadlock <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: gziolo <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Contributor

@justintadlock justintadlock left a comment

Everything looks good and works for me. 👍

Contributor Author

ockham commented Sep 25, 2025

Thank you very much!

It looks like e2e tests are failing reproducibly, and the test failures are related to the Image block, so I probably didn't get the logic quite right. I'll look into it on Monday!

}
};

$p = $internal_processor_class::create_fragment( $content );
Member

this looks like existing code, but it would be nice to rename $p here to $processor for consistency with other code in Core as example code.

Contributor Author

I'll tackle this in a follow-up.

@ockham ockham force-pushed the update/image-block-remove-empty-figcaption branch from 5485aa9 to 714f480 Compare September 29, 2025 10:02
Contributor Author

ockham commented Sep 30, 2025

I looked into the failing e2e tests. Apparently, when used with pattern overrides, the wp-image- classnames get the wrong image ID appended -- the one from the original image, rather than from the override:

  2) [chromium] › test/e2e/specs/editor/various/pattern-overrides.spec.js:1228:2 › Pattern Overrides › image block classname contains the correct media id and has no data-id attribute when used as a standalone image 

    Error: expect(locator).toHaveAttribute(expected) failed

    Locator: getByAltText('Overridden Image')
    Expected string: "wp-image-246"
    Received string: "wp-image-245"
    Timeout: 5000ms

    Call log:
      - Expect "toHaveAttribute" with timeout 5000ms
      - waiting for getByAltText('Overridden Image')
        9 × locator resolved to <img decoding="async" class="wp-image-245" alt="Overridden Image" src="http://localhost:8889/wp-content/uploads/2025/09/1024x768_e2e_test_image_size.jpeg"/>
          - unexpected value "wp-image-245"


      1284 | 		const imageBlock = page.getByAltText( imageAlt );
      1285 | 		await expect( imageBlock ).not.toHaveAttribute( 'data-id' );
    > 1286 | 		await expect( imageBlock ).toHaveAttribute(
           | 		                           ^
      1287 | 			'class',
      1288 | 			`wp-image-${ overrideImageId }`
      1289 | 		);
        at /home/runner/work/gutenberg/gutenberg/test/e2e/specs/editor/various/pattern-overrides.spec.js:1286:30

I think I'll add a unit test to cover this -- it'd be good to have lower-level coverage rather than rely on the e2e test for this.

Member

gziolo commented Sep 30, 2025

> Apparently, when used with pattern overrides, the wp-image- classnames get the wrong image ID appended -- the one from the original image, rather than from the override

That's an interesting one. It looks like when the url attribute changes with block bindings, then this class should get removed. id isn't even marked as bindable at the moment.

Contributor Author

ockham commented Sep 30, 2025

I added unit test coverage that reproduces the e2e test in 4899d29. This test passes on trunk but currently fails on this branch, as it should.

> That's an interesting one. It looks like when the url attribute changes with block bindings, then this class should get removed.

AFAICS, the Image block has logic to update the class name to use the ID from the override:

// Ensure the `wp-image-id` classname on the image block supports block bindings.
if ( $has_id_binding ) {
	// If there's a mismatch with the 'wp-image-' class and the actual id, the id was
	// probably overridden by block bindings. Update it to the correct value.
	// See https://github.com/WordPress/gutenberg/issues/62886 for why this is needed.
	$id = $attributes['id'];
	$image_classnames = $p->get_attribute( 'class' );
	$class_with_binding_value = "wp-image-$id";
	if ( is_string( $image_classnames ) && ! str_contains( $image_classnames, $class_with_binding_value ) ) {
		$image_classnames = preg_replace( '/wp-image-(\d+)/', $class_with_binding_value, $image_classnames );
		$p->set_attribute( 'class', $image_classnames );
	}
}
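
The class name update can also be tried in isolation. The sketch below is a standalone approximation, not the block's render code: `example_sync_image_classname()` is a hypothetical helper, the class string and ID are passed in directly instead of being read from the processor and the block attributes, and `str_contains()` assumes PHP 8 (or the WordPress polyfill).

```php
<?php
/**
 * Replaces a stale `wp-image-{id}` class with the one matching the given ID.
 *
 * Hypothetical helper for illustration only.
 *
 * @param string $classnames Value of the image's class attribute.
 * @param int    $id         Attachment ID after bindings have been applied.
 * @return string Updated class attribute value.
 */
function example_sync_image_classname( string $classnames, int $id ): string {
	$expected = "wp-image-$id";

	// Nothing to do if the expected class is already present.
	if ( str_contains( $classnames, $expected ) ) {
		return $classnames;
	}

	// Swap any `wp-image-<digits>` class for the expected one.
	return preg_replace( '/wp-image-(\d+)/', $expected, $classnames );
}

echo example_sync_image_classname( 'wp-image-245 size-full', 246 ), "\n";
```

With the values from the failing e2e test, a class of `wp-image-245` and an override ID of 246 yield `wp-image-246`, which is what the test expects.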

> id isn't even marked as bindable at the moment.

Hmm, it's included here though 🤔

Member

gziolo commented Sep 30, 2025

> id isn't even marked as bindable at the moment.
>
> Hmm, it's included here though 🤔

I wasn't aware of that. The complexity comes from the fact that you can mix uploaded images and references through URL.

<!-- wp:image {"sizeSlug":"large"} -->
<figure class="wp-block-image size-large"><img src="https://www.online-image-editor.com/styles/2019/images/devices.png" alt=""/></figure>
<!-- /wp:image -->

vs

<!-- wp:image {"id":5,"sizeSlug":"full","linkDestination":"none"} -->
<figure class="wp-block-image size-full"><img src="http://localhost:8889/wp-content/uploads/2025/09/image.png" alt="" class="wp-image-5"/></figure>
<!-- /wp:image -->

By the way, it's the same image after clicking Upload to the Media Library. I'm not sure what options Pattern Overrides offer, but for regular bindings, you should be able to replace id and/or url separately.

Member

@gziolo gziolo left a comment

This looks good to me. It would be great to get confirmation from @dmsnell about HTML API integration aspects.


- $p = new WP_HTML_Tag_Processor( $content );
+ $p = new class( $content ) extends WP_HTML_Tag_Processor {
+ 	// phpcs:ignore Gutenberg.NamingConventions.ValidBlockLibraryFunctionName.FunctionNameInvalid, Gutenberg.Commenting.SinceTag.MissingMethodSinceTag
Member

can we not eliminate these ignore statements?

Contributor Author

I can add a @since tag (even though I'm not sure it makes that much sense for a method of an anonymous class) to get rid of MissingMethodSinceTag.

The FunctionNameInvalid is trickier.

The function name 'span_of_empty_element()' is invalid. In this file, PHP function names
must either match one of the allowed prefixes exactly or begin with one of them, followed by
an underscore. The allowed prefixes are: 'block_core_image', 'render_block_core_image',
'register_block_core_image'.

Of all these, I guess only block_core_image is a somewhat reasonable prefix 🤷‍♂️

Member

this is wild. the linting rule would prevent overloading the base class methods, wouldn’t it?

- $p = new WP_HTML_Tag_Processor( $content );
+ $p = new class( $content ) extends WP_HTML_Tag_Processor {
+ 	// phpcs:ignore Gutenberg.NamingConventions.ValidBlockLibraryFunctionName.FunctionNameInvalid, Gutenberg.Commenting.SinceTag.MissingMethodSinceTag
+ 	public function span_of_empty_element() {
Member

reading the code it seems like the goal of this function is to return the textual span across which an empty element appears, or false if there are other elements inside it.

there’s a question in here whether HTML comments should invalidate an empty element or not and whether void tags count. to that end, I feel like it’s okay to leave this specialized.

we can also lean on next_token() since we expect an empty element. if there’s no closing tag this iteration will halt immediately whereas next_tag() will scan the entire document until the end.

/**
 * Returns span of input for an empty FIGCAPTION, if currently matched on a
 * FIGCAPTION opening tag and if the element is properly closed and empty.
 */
public function extract_empty_figcaption_element() {
	$this->set_bookmark( 'here' );
	$opener = $this->bookmarks['here'];

	// Allow comments within the definition of “empty.”
	while ( $this->next_token() && '#comment' === $this->get_token_name() ) {
		continue;
	}

	if ( 'FIGCAPTION' !== $this->get_tag() || ! $this->is_tag_closer() ) {
		return false;
	}

	$this->set_bookmark( 'here' );
	$closer = $this->bookmarks['here'];

	return new WP_HTML_Span( $opener->start, $closer->start + $closer->length - $opener->start );
}

Being an anonymous class within a specific function helps with the guards that otherwise would be loose.
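
The span arithmetic in the return statement of the suggestion above can be checked with plain strings. In this sketch, the opener and closer offsets are located with `strpos()` as a stand-in for the processor's bookmarks; the variable names and the sample markup are made up for illustration.

```php
<?php
$html = '<figure><img src="a.png" alt=""/><figcaption></figcaption></figure>';

// Stand-ins for the two bookmarks: byte offset of each tag, and the closer's byte length.
$opener_start  = strpos( $html, '<figcaption>' );
$closer_start  = strpos( $html, '</figcaption>' );
$closer_length = strlen( '</figcaption>' );

// Same formula as in the suggestion: from the start of the opener to the end of the closer.
$span_start  = $opener_start;
$span_length = $closer_start + $closer_length - $opener_start;

// The span covers exactly the element's outer HTML.
echo substr( $html, $span_start, $span_length ), "\n";

// Cutting the span out leaves the rest of the markup intact.
echo substr( $html, 0, $span_start ) . substr( $html, $span_start + $span_length ), "\n";
```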

Contributor Author

> reading the code it seems like the goal of this function is to return the textual span across which an empty element appears, or false if there are other elements inside it.

That's exactly right. I had to tweak the code in response to a regression that Grzegorz found, where the code accidentally removed <figcaption>s with elements inside of them.

> there’s a question in here whether HTML comments should invalidate an empty element or not and whether void tags count.

Yeah, I was debating that for a moment myself. I had an HTML Processor based version first and was considering covering those edge cases. Eventually, I succumbed to the temptation to use the Tag Processor -- based on the assumption that we control the saved markup and can probably rule out "pathological" cases. In the process, I removed all calls to next_token(), since TBH I thought it wasn't available from the Tag Processor 🙈

> to that end, I feel like it’s okay to leave this specialized.

I've updated the code based on your suggestion: 7adf0b6. Thank you!


- return $p->get_updated_html();
+ $output = $p->get_updated_html();
+ if ( ! empty( $figcaption_span ) ) {
Member

this is 100% fine and safe. still, I find if ( isset( $figcaption_span ) ) clearer from a semantic point of view. empty() makes me think of content whereas isset() makes me think of assignment.

well, ! empty() does have a marginal and insignificant performance implication that it performs an additional negation.

Contributor Author

I need to cover the case that $figcaption_span is set but false (which the anonymous class method returns if the <figcaption> is not empty).

I had if ( isset( $figcaption_span ) && false !== $figcaption_span ) (or maybe if ( isset( $figcaption_span ) && $figcaption_span )) but remembered that that's equivalent to ! empty().

(I've found the latter to be quite widespread across the WordPress codebase in cases like these, and even though it took me some getting used to at first, I'm now accustomed enough to this little idiosyncrasy that I find myself using it 😅 )
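
The equivalence mentioned above can be checked with a few lines of PHP. `example_checks()` is a hypothetical helper, a `null` argument stands in for an unassigned variable (a function parameter is always defined), and a `stdClass` instance stands in for the span object.

```php
<?php
/**
 * Evaluates the three candidate conditions for a given value.
 *
 * Hypothetical helper for illustration only.
 *
 * @param mixed $value Value to test; null models an unassigned variable.
 * @return bool[] Result of each condition, keyed by name.
 */
function example_checks( $value ): array {
	return array(
		'not_empty'        => ! empty( $value ),
		'isset_and_truthy' => isset( $value ) && $value,
		'isset_only'       => isset( $value ),
	);
}

// false: assigned, but no span was found. Only isset() alone is true.
var_dump( example_checks( false ) );

// null: models the variable never having been assigned. All three are false.
var_dump( example_checks( null ) );

// An object: models a span having been returned. All three are true.
var_dump( example_checks( new stdClass() ) );
```

The first case is the one that matters here: `isset()` alone would treat a `false` return value as a span to remove, while `! empty()` and `isset() && $value` both reject it.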

Member

if it returned null instead of false it wouldn’t have the awkward mixed types 🤷‍♂️

no big deal.

@ockham ockham force-pushed the update/image-block-remove-empty-figcaption branch from 7adf0b6 to 262995b Compare October 2, 2025 09:36
@ockham ockham merged commit c0e009e into trunk Oct 2, 2025
69 checks passed
@ockham ockham deleted the update/image-block-remove-empty-figcaption branch October 2, 2025 11:52
@github-actions github-actions bot added this to the Gutenberg 21.9 milestone Oct 2, 2025
@ockham ockham added the Backport to Gutenberg RC label Oct 2, 2025
@ockham ockham modified the milestones: Gutenberg 21.9, Gutenberg 21.8 Oct 2, 2025
Contributor Author

ockham commented Oct 2, 2025

I'd like to include this in GB 21.8. I'll cherry-pick it to the corresponding branch.

ockham added a commit that referenced this pull request Oct 2, 2025
Remove the `<figcaption>` element from the block markup upon render if the `caption` attribute is empty.

Since `caption` is a sourced attribute, this requires checking if the `<figcaption>` has any inner elements. If it doesn't, it's still possible that the `caption` attribute is connected to a block bindings source, so we also need to rule out that it received a non-empty value from that source. If neither criterion is met, we remove the `<figcaption>`.

Co-authored-by: ockham <[email protected]>
Co-authored-by: justintadlock <[email protected]>
Co-authored-by: dmsnell <[email protected]>
Co-authored-by: gziolo <[email protected]>
Contributor Author

ockham commented Oct 2, 2025

> I'd like to include this in GB 21.8. I'll cherry-pick it to the corresponding branch.

Done in 5884fa4.

@ockham ockham removed the Backport to Gutenberg RC label Oct 2, 2025

Labels

[Block] Image, [Feature] Block bindings, [Type] Enhancement
