Pasting: Respect plain-text pastes #3609

mcsf · 2017-11-22T11:50:55Z

Partly fixes #3062. This change only affects shortcodes that are handled by registered blocks via transforms.

Description

When pasting plain-text-only clipboard data, use that plain-text data in the paste processing. This overrides the default behavior whereby an HTML-encoded counterpart of the plain text is used for processing.

Previously, pasting the following text as a plain-text attachment:

[gallery ids="1,2,3"]

Would be processed as:

[gallery ids=&quot;1,2,3&quot;]

Thus affecting the correct parsing of shortcodes. With this change, parsing of shortcodes works when pasting a plain-text object — like the one above — but also when pasting rich-text objects. For instance, copying a shortcode snippet from a rich-text page, even if not wrapped with a <pre>, will be handled properly and yield a valid Gallery block:

<meta charset='utf-8'><span style="…">[gallery ids="10,11,9"]</span>

How Has This Been Tested?

Either manually or using the Classic Editor, compose a gallery shortcode comprised of a few images, e.g. [gallery ids="1,2,3"].
Copy that code from a source that will provide it as a plain-text-only clipboard object, e.g. a terminal or a basic HTML input.
Prior to application of this diff, verify that pasting that data into an empty Paragraph block results in a Gallery block that's missing at least one image.¹
After application of diff, verify that the same experience results in a Gallery block comprised of all the expected images.
Grab the same shortcode but from an HTML-formatted source, e.g. http://txti.es/kph06. Repeat the experiment and verify that HTML pastes still work.

¹: This is because the Gallery shortcode transform received the named ids attribute as '"1,2,3", which gets split as [ '"1', '2', '3"' ], then mapped over parseInt(_, 10), which evals to [ NaN, 2, 3 ].

Screenshots (jpeg or gifs if applicable):

Checklist:

My code is tested.
My code follows the WordPress code style.
My code follows has proper inline documentation.

codecov · 2017-11-22T11:50:58Z

Codecov Report

Merging #3609 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #3609      +/-   ##
==========================================
- Coverage   37.47%   37.45%   -0.02%     
==========================================
  Files         277      277              
  Lines        6704     6706       +2     
  Branches     1220     1222       +2     
==========================================
  Hits         2512     2512              
  Misses       3535     3535              
- Partials      657      659       +2

Impacted Files	Coverage Δ
blocks/editable/index.js	`10.16% <0%> (-0.07%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 139dbcd...086dd25. Read the comment docs.

mtias · 2017-11-22T12:27:39Z

Nice debugging here.

ellatrix · 2017-11-22T18:40:29Z

I'm trying to reproduce this issue, but am failing, though I see the HTML entities encoded. The reason seems to be that, if HTML entities are encoded, it is seen as a plain text paste, and passed through the markdown converter, which decodes all these. By the time the shortcode arrives at the shortcode converter, HTML entities will be decoded. I'm not sure then in which circumstances this would not be the case?

ellatrix · 2017-11-22T18:41:16Z

(Just trying to understand everything that is going on here.)

ellatrix · 2017-11-22T18:50:05Z

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check? Things seem to work correctly with that check removed. This check is added though to ensure just one string without line breaks is pasted inline, not parsed as blocks. The markdown converter would add P tags, then later it would convert to a paragraph block.

ellatrix · 2017-11-22T18:56:03Z

There are some other related inline content issues where we're just checking too early if content is inline. Content that is later stripped (e.g. spans) is not caught as inline this early, so it won't be pasted inline. This makes me wonder if we should rethink this whole inline pasting thing to check at a later point.

mcsf · 2017-11-22T19:00:48Z

Thanks for having a look, you know much more about this than I do.

Oh, okay, this is without any other content. So it seems the source of this bug is the plainText.indexOf( '\n\n' ) !== -1 check?

That makes a lot of sense. Do you see that as entirely supplanting the current fix, or should they be complements?

ellatrix · 2017-11-22T19:23:35Z

Do you see that as entirely supplanting the current fix, or should they be complements?

Yes, it could be alternative fix, as long as it's maybe clearer commented all the things the markdown converter is doing, in this case decoding HTML entities for plain text paste.

ephox-mogran · 2017-11-23T00:19:33Z

OK .. so I've looked into this and have some questions / comments:

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

Looking at the pasting process, the general path seems to be:

Firstly, store any plain text data on the clipboard as this.pastedPlainText
If there is selected text and selected text is not a link, and there is a link on the clipboard, add a link to the current text with the clipboard link and stop any other pasting

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

Identify the mode of pasting. There are three possible modes

a) INLINE: skips initial block parsing, allows for markdown conversion of plain text and always returns a string for TinyMCE to paste
b) AUTO: will parse as blocks if block delimiters detected, otherwise acts as INLINE if the HTML only has inline elements
c) BLOCKS: will parse as blocks if block delimiters detected, allows for markdown conversion of plain text and will always returns blocks

So for the markdown conversion of plain text, it will only take place if there is at least a double line break as @iseulde said. My understanding of why that check is there is to ensure that wrapping the output in paragraph tags isn't going to stop it being inline. Someone asked showdown to support not wrapping with an outer root node here:

showdownjs/showdown#189

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

const processed = converter.makeHtml( plainText );
return plainText.indexOf( '\n\n' ) !== -1 ? processed : stripOuterPTag( processed );

mcsf · 2017-11-23T14:43:55Z

Thanks for the feedback, @ephox-mogran.

Q. If I copy and paste this *italic* text into the editor from a terminal should it convert it to italics? This solution won't, because it doesn't run through the showdown converter (since it does not contain two new line characters). However, I'm not sure if that's desired.

It could be beneficial to handle those pastes as Markdown too. Your suggestion:

And it looks like the answer is to use extensions. You could also just postprocess the HTML if you detected that there weren't linebreaks in the original (and remove the check in the if)

could be the way to go.

Note, this is an interesting user experience. Where did you get the idea for this kind of behaviour? Is there another product that does it or is it just innovation? Have you thought about ways of making people aware of it?

I don't believe I was around when it got in Gutenberg, but it's been a feature of P2 for as long as I can remember.

mcsf · 2017-11-24T11:05:51Z

Actually, a problem with allowing Markdown parsing for single-line pastes (which includes our shortcode examples) is that a shortcode like [su_spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler] gets picked up and rendered as [su<em>spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler]. Therefore, I didn't push that change to the branch, but you can grab it as a gist.

For pastes of a single shortcode, it's trivial to work around this by testing the paste string against wp.shortcode. However, when pasting a fragment of a document, which could as likely be Markdown-formatted and shortcode-infused, which should be favored? In a way, Markdown and WP shortcodes may be fundamentally incompatible, as bracket notation is an important and differing aspect of each. Dealing with this isn't computationally infeasible, but likely impractical for what we want to achieve at this stage of Gutenberg.

I still believe the current fix to be useful. Do you both agree?

ellatrix · 2017-11-24T14:16:17Z

Hm, I think they could be compatible, but the markdown would need to run after splitting the shortcodes? In other words, shortcodes need to be done above all else. I’ll attempt to have a closer look. As previously commented, I think there are a few other things that need to be delayed. Apologies that it takes so long for me to look into.

…

On 24 Nov 2017, at 12:05, Miguel Fonseca ***@***.***> wrote: Actually, a problem with allowing Markdown parsing for single-line pastes (which includes our shortcode examples) is that a shortcode like [su_spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler] gets picked up and rendered as [su<em>spoiler open="yes" style="fancy" icon="chevron"]Hidden content[/su_spoiler]. Therefore, I didn't push that change to the branch, but you can grab it as a gist. For pastes of a single shortcode, it's trivial to work around this by testing the paste string against wp.shortcode. However, when pasting a fragment of a document, which could as likely be Markdown-formatted and shortcode-infused, which should be favored? In a way, Markdown and WP shortcodes may be fundamentally incompatible, as bracket notation is an important and differing aspect of each. Dealing with this isn't computationally infeasible, but likely impractical for what we want to achieve at this stage of Gutenberg. I still believe the current fix to be useful. Do you both agree? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

When pasting plain-text-only clipboard data, use that plain-text data in the paste processing. This overrides the default behavior whereby an HTML-encoded counterpart of the plain text is used for processing. Previously, pasting the following text as a plain-text attachment: [gallery ids="1,2,3"] Would be processed as: [gallery ids="1,2,3"] Thus affecting the correct parsing of shortcodes. With this change, parsing of shortcodes works when pasting a plain-text object — like the one above — but also when pasting rich-text objects. For instance, copying a shortcode snippet from a rich-text page, even if not wrapped with a `<pre>`, will be handled properly and yield a valid Gallery block: <meta charset='utf-8'><span style="…">[gallery ids="10,11,9"]</span>

mcsf · 2017-11-28T16:01:12Z

Let's get this one and #3610 in so that more eyeballs can make all the bugs shallow.

mcsf added the [Feature] Paste label Nov 22, 2017

mcsf requested review from ellatrix, ephox-mogran and youknowriad November 22, 2017 11:50

mcsf mentioned this pull request Nov 22, 2017

Pasting: Convert unknown shortcodes to Shortcode block #3610

Merged

3 tasks

mcsf force-pushed the fix/shortcode-pasting-plain-text branch from 174e6b9 to 086dd25 Compare November 28, 2017 15:45

mcsf merged commit d5d7064 into master Nov 28, 2017

mtias deleted the fix/shortcode-pasting-plain-text branch November 28, 2017 18:23

mcsf mentioned this pull request Feb 7, 2018

Blocks: Shortcode: Support multi-line shortcodes #4942

Closed

3 tasks

Pasting: Respect plain-text pastes #3609

Pasting: Respect plain-text pastes #3609

Uh oh!

Conversation

mcsf commented Nov 22, 2017

Description

How Has This Been Tested?

Screenshots (jpeg or gifs if applicable):

Checklist:

Uh oh!

codecov bot commented Nov 22, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

mtias commented Nov 22, 2017

Uh oh!

ellatrix commented Nov 22, 2017

Uh oh!

ellatrix commented Nov 22, 2017

Uh oh!

ellatrix commented Nov 22, 2017

Uh oh!

ellatrix commented Nov 22, 2017

Uh oh!

mcsf commented Nov 22, 2017

Uh oh!

ellatrix commented Nov 22, 2017

Uh oh!

ephox-mogran commented Nov 23, 2017

Uh oh!

mcsf commented Nov 23, 2017

Uh oh!

mcsf commented Nov 24, 2017

Uh oh!

ellatrix commented Nov 24, 2017 via email

Uh oh!

mcsf commented Nov 28, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov bot commented Nov 22, 2017 •

edited

Loading