Markdown: Add source:markdown element to RSS feeds #46968

Open
jeherve wants to merge 28 commits into trunk from add/markdown-rss-source-element

@jeherve jeherve commented Feb 5, 2026

Warning

This is on hold for now, pending more general discussion and potential change of direction.
pgle0O-1sG-p2

Fixes CM-533

Proposed changes:

This adds a <source:markdown> element to RSS2 feed items for posts written with either the legacy Markdown module or the Markdown block (jetpack/markdown). The element provides the raw Markdown source alongside the rendered HTML, following the convention described at source.scripting.com/markdown.

A source XML namespace declaration (xmlns:source="https://source.scripting.com/") is added to RSS2 feeds via the rss2_ns hook.

Legacy Markdown module (modules/markdown/)

For posts written with the legacy Markdown module, the raw Markdown is read from post_content_filtered. For all other posts, the rendered post_content (with the_content filters applied) is used as a fallback, so the element is always present for RSS readers to consume.
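The selection rule described above can be sketched in plain PHP. This is a minimal illustration, not the PR's code: `example_pick_source()` is a hypothetical name, the post is modeled as an array rather than a WP_Post object, and `$render` is an assumed stand-in for running the_content filters.

```php
<?php
// Hypothetical helper: choose what goes inside <source:markdown>.
// $post is a plain array standing in for a WP_Post object (assumption).
function example_pick_source( array $post, bool $is_markdown, callable $render ): string {
    // Legacy Markdown post: prefer the raw source saved at write time.
    if ( $is_markdown && '' !== $post['post_content_filtered'] ) {
        return $post['post_content_filtered'];
    }
    // Any other post: fall back to rendered HTML so the element is still present.
    if ( '' !== $post['post_content'] ) {
        return $render( $post['post_content'] );
    }
    return '';
}

// Stand-in renderer (assumption): strips block comments and trims whitespace.
$render = static function ( string $html ): string {
    return trim( preg_replace( '/<!--.*?-->/s', '', $html ) );
};

$md_post = array(
    'post_content'          => '<h1>Hello</h1><p>This is <strong>bold</strong>.</p>',
    'post_content_filtered' => "# Hello\n\nThis is **bold**.",
);
$block_post = array(
    'post_content'          => "<!-- wp:paragraph -->\n<p>Hi</p>\n<!-- /wp:paragraph -->",
    'post_content_filtered' => '',
);

echo example_pick_source( $md_post, true, $render ), "\n";
echo example_pick_source( $block_post, false, $render ), "\n";
```

Passing the Markdown flag separately mirrors the idea that the post meta, not the mere presence of post_content_filtered, decides which branch is taken.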

Markdown block (extensions/blocks/markdown/)

For posts containing jetpack/markdown blocks, the function uses WP_Block_Processor (WP 6.9+) to scan the post content and extract raw Markdown from each block's source attribute. Non-Markdown blocks are rendered through the_content filters, producing a hybrid document where Markdown blocks contribute raw source and everything else contributes rendered HTML.

On WordPress versions older than 6.9 (where WP_Block_Processor is unavailable), the block-level RSS output gracefully degrades — no source:markdown element is emitted for block-based posts.

Architecture

  • New file _inc/lib/markdown/rss.php with four functions:
    • jetpack_markdown_rss_namespace() — outputs the xmlns:source namespace declaration once (static dedup guard).
    • jetpack_markdown_rss_post_has_markdown_block() — shared helper using WP_Block_Processor to detect jetpack/markdown blocks.
    • jetpack_markdown_block_rss_output_source_markdown() — block path: placeholder substitution + content rendering + raw MD splice.
    • jetpack_markdown_rss_output_source_markdown() — legacy path: raw MD from post_content_filtered, or rendered fallback.
  • modules/markdown.php — loads rss.php (via require_once) and hooks the legacy function.
  • extensions/blocks/markdown/markdown.php — loads rss.php (via require_once) and hooks the block function, guarded by class_exists( 'WP_Block_Processor' ).
  • Mutual exclusion: the legacy function bails if jetpack/markdown blocks are detected, deferring to the block function.
  • rss.php is decoupled from the WPCom_Markdown class (uses the string literal '_wpcom_is_markdown' instead of the class constant) so it works from both entry points.
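The static dedup guard mentioned for jetpack_markdown_rss_namespace() could look roughly like the sketch below. `example_rss_namespace()` is a hypothetical stand-in: it returns the attribute instead of echoing it so the behavior is easy to check, and it makes no claim about how the PR's function is actually written.

```php
<?php
// Hypothetical stand-in for the namespace callback. It returns the
// attribute on the first call and an empty string on later calls.
function example_rss_namespace(): string {
    static $done = false; // Persists across calls within one request.
    if ( $done ) {
        return '';
    }
    $done = true;
    return 'xmlns:source="https://source.scripting.com/"' . "\n";
}

// Simulate two entry points (module and block) both firing the callback.
$output = example_rss_namespace() . example_rss_namespace();
echo $output;
```

With a guard like this, registering the callback from both entry points still yields a single namespace declaration per feed render.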

Alternative

For a standalone implementation that works outside the Jetpack plugin, see a8cteam51/team51-markdown-rss.

Other information:

  • Have you written new tests for your changes, if applicable?
  • Have you checked the E2E test CI results, and verified that your changes do not break them?
  • Have you tested your changes on WordPress.com, if applicable (if so, you'll see a generated comment below with a script to run)?

Does this pull request change what data or activity we track or use?

No. This only adds output to existing RSS feeds; no new data is collected or stored.

Testing instructions:

Automated tests

  1. Run the Markdown test suite:
    jetpack docker phpunit jetpack -- --testsuite=markdown
    
  2. All 16 tests should pass, including the new Markdown_RSS_Test tests.

Manual testing — Legacy Markdown module

  1. Ensure the Jetpack plugin is active with the Markdown module enabled (Jetpack > Settings > Writing > Compose using Markdown syntax).
  2. Create a new post using the Classic Editor or with a Classic block, and write some Markdown content (e.g., # Hello\n\nThis is **bold**.). Publish it.
  3. Create a second post with regular block content (no Markdown). Publish it.
  4. Visit your site's RSS2 feed at /feed.
  5. Verify the XML output:
    • The <rss> element should include xmlns:source="https://source.scripting.com/".
    • The Markdown post's <item> should contain a <source:markdown> element with the raw Markdown inside a CDATA section.
    • The non-Markdown post's <item> should contain a <source:markdown> element with the rendered HTML content (no Gutenberg block comments).

Manual testing — Markdown block

  1. Create a new post using the Block Editor. Add a Markdown block (jetpack/markdown) and write some Markdown content. Optionally add other blocks (paragraphs, headings) around it. Publish it.
  2. Visit your site's RSS2 feed at /feed.
  3. Verify the XML output:
    • The post's <item> should contain a <source:markdown> element.
    • Inside the CDATA section, the Markdown block's content should appear as raw Markdown, while other blocks should appear as rendered HTML.

Edge cases

  • Create a post with content containing ]]> and verify it appears escaped as ]]&gt; in the CDATA output.
  • Create a post that uses both the legacy Markdown module AND a Markdown block — the block function should handle it (legacy function should bail).
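The `]]>` edge case above comes down to one replacement before wrapping the content. A minimal sketch, assuming a hypothetical `example_wrap_cdata()` helper (the PR's own function names and structure may differ):

```php
<?php
// Hypothetical helper: wrap content in a CDATA section, escaping any
// "]]>" sequence that would otherwise close the section early.
function example_wrap_cdata( string $content ): string {
    $safe = str_replace( ']]>', ']]&gt;', $content );
    return '<source:markdown><![CDATA[' . $safe . ']]></source:markdown>';
}

$xml = example_wrap_cdata( 'Before ]]> after' );
echo $xml, "\n";
```

After escaping, the only `]]>` left in the element is the one that closes the CDATA section.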

- Use get_post() directly instead of get_the_ID() + get_post($id)
- Use WPCom_Markdown::IS_MD_META constant instead of hardcoded string
- Add assertNotEmpty on CDATA extraction in test to prevent false passes
- Replace printf with echo concatenation to prevent garbled output
  when Markdown content contains %s, %d, or other format specifiers
- Add test for printf format specifier preservation
- Strengthen no-meta test by providing non-empty post_content_filtered
  so it truly validates the meta check independently

Move the source namespace xmlns declaration from an inline anonymous
function in markdown.php to a named jetpack_markdown_rss_namespace()
function in rss.php alongside the existing RSS output function.

RSS readers expect the element to be consistently present. Instead of
only emitting source:markdown for Markdown posts, always include it:
use raw Markdown from post_content_filtered when available, otherwise
fall back to the rendered post_content.

Apply the_content filters when falling back to post_content so
Gutenberg block markup is rendered into clean HTML instead of
serving raw block comments to RSS readers.
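
The printf-to-echo change in the commit notes above guards against format specifiers in user content. The sketch below shows one way such garbling can arise, assuming the content ends up inside the format string; the original printf call is not shown in this conversation, so this is an illustration rather than a reproduction.

```php
<?php
$markdown = 'Escape a percent sign as %% in printf.';

// Assumed failure mode: the content becomes part of the format string,
// so "%%" is interpreted as an escaped percent sign and collapses to "%".
$garbled = sprintf( '<![CDATA[' . $markdown . ']]>' );

// Plain concatenation never interprets the content.
$intact = '<![CDATA[' . $markdown . ']]>';

echo $garbled, "\n";
echo $intact, "\n";
```

Specifiers such as %s or %d in the same position would additionally demand arguments that were never supplied, which is why concatenation is the safer choice for arbitrary Markdown.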
@jeherve added the [Status] Needs Review and Enhancement labels Feb 5, 2026
@jeherve self-assigned this Feb 5, 2026
Copilot AI review requested due to automatic review settings February 5, 2026 10:38

github-actions bot commented Feb 5, 2026

Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.

  • To test on WoA, go to the Plugins menu on a WoA dev site. Click on the "Upload" button and follow the upgrade flow to be able to upload, install, and activate the Jetpack Beta plugin. Once the plugin is active, go to Jetpack > Jetpack Beta, select your plugin (Jetpack), and enable the add/markdown-rss-source-element branch.
  • To test on Simple, run the following command on your sandbox:
bin/jetpack-downloader test jetpack add/markdown-rss-source-element

Interested in more tips and information?

  • In your local development environment, use the jetpack rsync command to sync your changes to a WoA dev blog.
  • Read more about our development workflow here: PCYsg-eg0-p2
  • Figure out when your changes will be shipped to customers here: PCYsg-eg5-p2

@github-actions bot added the [Plugin] Jetpack and [Tests] Includes Tests labels Feb 5, 2026

github-actions bot commented Feb 5, 2026

Thank you for your PR!

When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:

  • ✅ Include a description of your PR changes.
  • ✅ Add a "[Status]" label (In Progress, Needs Review, ...).
  • ✅ Add testing instructions.
  • ✅ Specify whether this PR includes any changes to data or privacy.
  • ✅ Add changelog entries to affected projects

This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖


Follow this PR Review Process:

  1. Ensure all required checks appearing at the bottom of this PR are passing.
  2. Make sure to test your changes on all platforms that it applies to. You're responsible for the quality of the code you ship.
  3. You can use GitHub's Reviewers functionality to request a review.
  4. When it's reviewed and merged, you will be pinged in Slack to deploy the changes to WordPress.com simple once the build is done.

If you have questions about anything, reach out in #jetpack-developers for guidance!


Jetpack plugin:

The Jetpack plugin has different release cadences depending on the platform:

  • WordPress.com Simple releases happen as soon as you deploy your changes after merging this PR (PCYsg-Jjm-p2).
  • WoA releases happen weekly.
  • Releases to self-hosted sites happen monthly:
    • Scheduled release: March 3, 2026
    • Code freeze: March 3, 2026

If you have any questions about the release process, please ask in the #jetpack-releases channel on Slack.

Copilot AI left a comment

Pull request overview

This pull request adds a <source:markdown> element to RSS feeds when the Jetpack Markdown module is active, following the convention described at source.scripting.com. For posts written with the legacy Markdown module, the raw Markdown source is extracted from post_content_filtered. For all other posts, the rendered HTML content (with the_content filters applied and Gutenberg block comments stripped) is used as a fallback, ensuring the element is always present for RSS readers.

Changes:

  • New RSS library file providing namespace declaration and source:markdown output functions
  • Module initialization updated to register RSS hooks when Markdown is active
  • Comprehensive PHPUnit test suite covering edge cases and escaping scenarios

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File | Description
_inc/lib/markdown/rss.php | New library file implementing the RSS namespace declaration and source:markdown output logic with proper CDATA escaping
modules/markdown.php | Module initialization updated to require the RSS library and register hooks for RSS feeds (rss2_ns, rss_item, rss1_item, rss2_item)
tests/php/modules/markdown/Markdown_RSS_Test.php | Comprehensive test suite covering Markdown posts, non-Markdown fallback, empty content, CDATA escaping, printf specifiers, and namespace declaration
changelog/add-markdown-rss-source-element | Changelog entry following project conventions, describing the enhancement from a user perspective

jp-launch-control bot commented Feb 5, 2026

Code Coverage Summary

Coverage changed in 4 files.

File Coverage Δ% Δ Uncovered
projects/plugins/jetpack/extensions/blocks/markdown/markdown.php 0/8 (0.00%) 0.00% 4 💔
projects/plugins/jetpack/modules/markdown.php 0/11 (0.00%) 0.00% 4 💔
projects/plugins/jetpack/class.jetpack-twitter-cards.php 0/121 (0.00%) 0.00% -1 💚
projects/plugins/jetpack/modules/sitemaps/sitemaps.php 82/220 (37.27%) 2.73% -6 💚

1 file is newly checked for coverage.

File Coverage
projects/plugins/jetpack/_inc/lib/markdown/rss.php 58/64 (90.63%) 💚

Coverage check overridden by the "I don't care about code coverage for this PR" label.

Replace WPCom_Markdown::IS_MD_META with the string literal
'_wpcom_is_markdown' so rss.php works when loaded from the
Markdown block entry point where the legacy class is absent.
Remove the now-unnecessary easy-markdown.php require from tests.
Copilot AI review requested due to automatic review settings February 5, 2026 16:27
Complements the existing bail-guard test by verifying that the
block function produces correct output for a post that has both
_wpcom_is_markdown meta and jetpack/markdown blocks.

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

The duplicate call to jetpack_markdown_rss_namespace() is
intentional — it verifies the static deduplication guard.
Copilot AI review requested due to automatic review settings February 5, 2026 17:22

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

@jeherve added the "I don't care about code coverage for this PR" label Feb 5, 2026

jeherve commented Feb 6, 2026

@anomiex How would I go about solving those Phan issues? Should I create a stub for the class, since it's only present in WordPress 6.9? I thought the class_exists and the Phan comments would be enough, but it doesn't seem to solve the issues.

Thank you!

anomiex commented Feb 9, 2026

Unfortunately Phan doesn't realize that class_exists means the class exists, so suppression comments are needed too. Even if it did, the logic here is complex enough that it might not realize it for all the stuff in jetpack_markdown_block_rss_output_source_markdown().

In this case, because the suppression comments are only needed for the run with WP 6.8 stubs, you have to get even more complicated and suppress the unused suppression error for the WP 6.9 run too, with something like

// @phan-suppress-next-line PhanUndeclaredMethod @phan-suppress-current-line UnusedSuppression -- We checked that the class exists earlier. @todo Remove this suppression when we drop WP <6.9.

Should I create a stub for the class, since it's only present in WordPress 6.9?

You could, but that might mask the problem if code anywhere else starts trying to use the class without checking it exists first.

WP_Block_Processor is only available in WP 6.9+, so Phan flags it as
undeclared when running against WP 6.8 stubs. Add dual suppression
comments that also suppress the resulting UnusedSuppression warning
on the WP 6.9 run.
Copilot AI review requested due to automatic review settings February 11, 2026 18:14

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Comment on lines +132 to +134
// Render all non-Markdown content through the standard pipeline.
$rendered = apply_filters( 'the_content', $modified_content );

Copilot AI Feb 11, 2026

Since this output is specifically for RSS2 feeds, consider applying the feed-specific filter after rendering (i.e., run the_content_feed for 'rss2' similar to how core prepares feed content). That helps keep output consistent with what WordPress generates in feeds and avoids surprises from feed-only transforms.

Copilot uses AI. Check for mistakes.
    $content = $post->post_content_filtered;
} elseif ( ! empty( $post->post_content ) ) {
    // Apply the_content filters to render Gutenberg blocks and shortcodes into clean HTML.
    $content = apply_filters( 'the_content', $post->post_content );

Copilot AI Feb 11, 2026

For the rendered-HTML fallback in feeds, it may be better to mirror core feed preparation by applying the_content_feed for 'rss2' after the_content (see e.g. modules/sitemaps/sitemap-builder.php where content is prepared like get_the_content_feed()). This keeps the custom element aligned with feed output expectations.

Suggested change
- $content = apply_filters( 'the_content', $post->post_content );
+ $content = apply_filters( 'the_content', $post->post_content );
+ // Mirror core feed preparation by applying the_content_feed for RSS2 feeds.
+ $content = apply_filters( 'the_content_feed', $content, 'rss2' );

Comment on lines +332 to +334
$this->assertLessThan( $pos_para, $pos_md1, 'First markdown should appear before the paragraph.' );
$this->assertLessThan( $pos_md2, $pos_para, 'Paragraph should appear before the second markdown.' );

Copilot AI Feb 11, 2026

In this mixed-blocks test, the order assertions are inverted: assertLessThan( $pos_para, $pos_md1 ) currently asserts the paragraph appears before the first markdown, but the message says the opposite. Same for the paragraph vs second markdown assertion. This will either fail or validate the wrong ordering—swap the arguments so the assertions match the intended order (md1 < paragraph < md2).
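
For reference when weighing this comment: PHPUnit documents assertLessThan( $expected, $actual ) as passing when $actual is less than $expected. The sketch below mirrors that argument order with a hypothetical `example_is_less_than()` helper and a made-up feed body; under that reading, the two assertions as quoted check md1 < paragraph < md2, which matches their messages.

```php
<?php
// Hypothetical helper mirroring PHPUnit's documented argument order:
// assertLessThan( $expected, $actual ) passes when $actual < $expected.
function example_is_less_than( int $expected, int $actual ): bool {
    return $actual < $expected;
}

// Made-up output: Markdown, then a rendered paragraph, then more Markdown.
$output = "# First\n<p class=\"wp-block-paragraph\">Middle</p>\n## Second";

$pos_md1  = strpos( $output, '# First' );
$pos_para = strpos( $output, '<p' );
$pos_md2  = strpos( $output, '## Second' );

// Same argument order as the two assertions quoted above.
var_dump( example_is_less_than( $pos_para, $pos_md1 ) ); // md1 before paragraph
var_dump( example_is_less_than( $pos_md2, $pos_para ) ); // paragraph before md2
```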

Comment on lines +73 to +85
if ( ! jetpack_markdown_rss_post_has_markdown_block( $post->post_content ) ) {
    return;
}

// First pass: find Markdown blocks, extract sources, record byte offsets.
// @phan-suppress-next-line PhanUndeclaredClassMethod @phan-suppress-current-line UnusedSuppression -- We checked that the class exists above. @todo Remove when we drop WP <6.9.
$processor = new WP_Block_Processor( $post->post_content );
$sources   = array();
$regions   = array(); // Each entry: array( 'start' => int, 'end' => int ).
$index     = 0;

// @phan-suppress-next-line PhanUndeclaredClassMethod @phan-suppress-current-line UnusedSuppression -- We checked that the class exists above. @todo Remove when we drop WP <6.9.
while ( $processor->next_block( 'jetpack/markdown' ) ) {

Copilot AI Feb 11, 2026

jetpack_markdown_block_rss_output_source_markdown() scans for Markdown blocks twice: first via jetpack_markdown_rss_post_has_markdown_block() (which instantiates/scans a WP_Block_Processor), then again by creating a new processor and walking all markdown blocks. You can avoid the extra pass by dropping the initial helper check and just running the extraction loop; if no sources are found, return early.

Comment on lines +117 to +137
// Build modified content with placeholders replacing Markdown blocks.
$modified_content = '';
$cursor           = 0;

foreach ( $regions as $i => $region ) {
    // Append content before this block.
    $modified_content .= substr( $post->post_content, $cursor, $region['start'] - $cursor );
    // Insert placeholder.
    $modified_content .= '%%JETPACK_MARKDOWN_' . $i . '%%';
    $cursor = $region['end'];
}

// Append any remaining content after the last block.
$modified_content .= substr( $post->post_content, $cursor );

// Render all non-Markdown content through the standard pipeline.
$rendered = apply_filters( 'the_content', $modified_content );

// Substitute placeholders with raw Markdown sources.
foreach ( $sources as $i => $source ) {
    $rendered = str_replace( '%%JETPACK_MARKDOWN_' . $i . '%%', $source, $rendered );

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The placeholder approach (%%JETPACK_MARKDOWN_%d%% + apply_filters( 'the_content', ... ) + str_replace) can result in the raw Markdown source being wrapped/altered by the_content filters (notably wpautop), since the placeholder is plain text in the content stream. After replacement, the Markdown can end up inside generated HTML (e.g., <p>...</p>), which means it’s no longer truly “raw Markdown” in the output. Consider building the output by concatenating rendered non-markdown segments/blocks with raw markdown segments (e.g., render non-markdown blocks via block rendering, and insert markdown sources directly) instead of relying on placeholders that go through the_content.

Suggested change
- // Build modified content with placeholders replacing Markdown blocks.
- $modified_content = '';
- $cursor = 0;
- foreach ( $regions as $i => $region ) {
-     // Append content before this block.
-     $modified_content .= substr( $post->post_content, $cursor, $region['start'] - $cursor );
-     // Insert placeholder.
-     $modified_content .= '%%JETPACK_MARKDOWN_' . $i . '%%';
-     $cursor = $region['end'];
- }
- // Append any remaining content after the last block.
- $modified_content .= substr( $post->post_content, $cursor );
- // Render all non-Markdown content through the standard pipeline.
- $rendered = apply_filters( 'the_content', $modified_content );
- // Substitute placeholders with raw Markdown sources.
- foreach ( $sources as $i => $source ) {
-     $rendered = str_replace( '%%JETPACK_MARKDOWN_' . $i . '%%', $source, $rendered );
+ // Build content by concatenating rendered non-Markdown segments with raw Markdown sources.
+ $rendered = '';
+ $cursor = 0;
+ foreach ( $regions as $i => $region ) {
+     // Render and append content before this Markdown block.
+     $before = substr( $post->post_content, $cursor, $region['start'] - $cursor );
+     if ( '' !== $before ) {
+         $rendered .= apply_filters( 'the_content', $before );
+     }
+     // Append the raw Markdown source for this block without running it through the_content filters.
+     if ( isset( $sources[ $i ] ) ) {
+         $rendered .= $sources[ $i ];
+     }
+     $cursor = $region['end'];
+ }
+ // Render and append any remaining non-Markdown content after the last block.
+ $after = substr( $post->post_content, $cursor );
+ if ( '' !== $after ) {
+     $rendered .= apply_filters( 'the_content', $after );

// Add source:markdown element to RSS feeds for posts containing the Markdown block.
require_once JETPACK__PLUGIN_DIR . '_inc/lib/markdown/rss.php';
if ( class_exists( 'WP_Block_Processor' ) ) {
    add_action( 'rss2_ns', 'jetpack_markdown_rss_namespace' );

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will register jetpack_markdown_rss_namespace on rss2_ns even if it was already added by the Markdown module, causing the callback to run twice per feed render (the static guard prevents duplicate output but still adds overhead). Mirror the has_action( 'rss2_ns', ... ) guard used in modules/markdown.php here as well.

Suggested change
- add_action( 'rss2_ns', 'jetpack_markdown_rss_namespace' );
+ if ( ! has_action( 'rss2_ns', 'jetpack_markdown_rss_namespace' ) ) {
+     add_action( 'rss2_ns', 'jetpack_markdown_rss_namespace' );
+ }

Match WordPress core's get_the_content_feed() two-step sequence:
first the_content, then the_content_feed. This ensures feed-specific
transforms (e.g. relative-to-absolute URL conversion) are applied to
the rendered HTML in the source:markdown element.

The extraction loop already handles the "no blocks" case via
an early return on empty $sources, making the prior
jetpack_markdown_rss_post_has_markdown_block() check a wasted
processor instantiation + scan.
Copilot AI review requested due to automatic review settings February 11, 2026 19:44

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +117 to +122
foreach ( $regions as $i => $region ) {
    // Append content before this block.
    $modified_content .= substr( $post->post_content, $cursor, $region['start'] - $cursor );
    // Insert placeholder.
    $modified_content .= '%%JETPACK_MARKDOWN_' . $i . '%%';
    $cursor = $region['end'];

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The placeholder token format (e.g. %%JETPACK_MARKDOWN_0%%) can collide with real post content. If a post contains the same string, str_replace() will replace user content unexpectedly. Consider using a per-request unique placeholder prefix (e.g. UUID/random bytes) and doing a single pass replacement (e.g. strtr) to avoid accidental collisions.
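
The suggestion above (a per-request random prefix plus a single-pass strtr) might be sketched as follows. `example_splice_sources()` and the token format are hypothetical, and `$render` is an assumed stand-in for the the_content pipeline; the region offsets follow the 'start'/'end' shape used in the quoted code.

```php
<?php
// Hypothetical sketch: swap Markdown regions for collision-resistant
// tokens, render the rest, then splice the raw sources back in one pass.
function example_splice_sources( string $content, array $regions, array $sources, callable $render ): string {
    // Random per-request prefix, so post content cannot predict the token.
    $prefix   = 'MDSRC' . bin2hex( random_bytes( 8 ) ) . 'X';
    $modified = '';
    $cursor   = 0;
    $map      = array();

    foreach ( $regions as $i => $region ) {
        $modified     .= substr( $content, $cursor, $region['start'] - $cursor );
        $token         = $prefix . $i . 'X'; // Trailing X keeps token 1 distinct from 10.
        $modified     .= $token;
        $map[ $token ] = $sources[ $i ];
        $cursor        = $region['end'];
    }
    $modified .= substr( $content, $cursor );

    // strtr() with an array replaces all tokens in a single pass and never
    // rescans text it has already inserted.
    return strtr( $render( $modified ), $map );
}

// A post that happens to contain the old fixed placeholder text.
$content  = '<p>Intro</p>[MD]<p>%%JETPACK_MARKDOWN_0%%</p>';
$start    = strpos( $content, '[MD]' );
$regions  = array( array( 'start' => $start, 'end' => $start + 4 ) );
$sources  = array( '# Title' );
$identity = static function ( string $s ): string {
    return $s;
};

$result = example_splice_sources( $content, $regions, $sources, $identity );
echo $result, "\n";
```

Because replacements come from a single strtr() map, Markdown source that itself contains token-like text is not substituted a second time. This sketch does not address the separate wpautop wrapping concern raised earlier in the review.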

Comment on lines +161 to +167
// If the post contains Markdown blocks, let the block function handle it.
if (
    class_exists( 'WP_Block_Processor' )
    && jetpack_markdown_rss_post_has_markdown_block( $post->post_content )
) {
    return;
}

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On WP versions where WP_Block_Processor is unavailable, the legacy rss2_item handler will not bail for posts containing jetpack/markdown blocks (because the block-detection helper always returns false). If the Markdown module is active, those block-based posts will still get a <source:markdown> element containing rendered HTML, which conflicts with the PR description’s stated behavior (“no element is emitted for block-based posts” on <6.9). If the intent is truly to emit nothing for block posts on <6.9, consider falling back to has_block( 'jetpack/markdown', $post->post_content ) for detection when WP_Block_Processor is missing, and bail in that case too (or update the PR description to match the actual fallback behavior).

Use regex assertions to account for class attributes (e.g.
wp-block-paragraph) that the block renderer adds to <p> tags
when apply_filters('the_content') runs.
Labels

[Block] Markdown, [Feature] Markdown, I don't care about code coverage for this PR, [Plugin] Jetpack, [Pri] Low, [Status] Proposal, [Tests] Includes Tests

4 participants