-
Notifications
You must be signed in to change notification settings - Fork 847
Don't export wp_navigation posts by non-current users #43910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't export wp_navigation posts by non-current users #43910
Conversation
|
Are you an Automattician? Please test your changes on all WordPress.com environments to help mitigate accidental explosions.
Interested in more tips and information?
|
|
Thank you for your PR! When contributing to Jetpack, we have a few suggestions that can help us test and review your patch:
This comment will be updated as you work on your PR and make changes. If you think that some of those checks are not needed for your PR, please explain why you think so. Thanks for cooperation 🤖 Follow this PR Review Process:
If you have questions about anything, reach out in #jetpack-developers for guidance! |
...ackages/jetpack-mu-wpcom/src/features/wpcom-navigation-export-filter/class-export-filter.php
Outdated
Show resolved
Hide resolved
Code Coverage SummaryCoverage changed in 25 files. Only the first 5 are listed here.
1 file is newly checked for coverage.
|
|
The code "works" now. It means that I don't think this can be a problem on WoA sites. I'm going to port this change to wpcom so it limits the impact a tad. |
Except WoA sites will not necessarily have been WoA sites during the period when these posts were being created. If we want to eliminate that problem then the solution needs to work on both WoA and Simple sites. |
This code doesn't work, and even if it did it's a hack that is probably overly broad. In the past a fallback navigation was created when one didn't exist. This used to be created in `render_block_core_navigation` (see WordPress/gutenberg#47684) but no longer happens. This has lead to a number of blogs having this fallback navigation created when an a12 visited their site. This means that the a12s details - wpcom username, email address etc are in the generated export file. It feels unhelpful to have this in the export file, partly because it might need explaining if the legal time share an export as part of a legal request, and partly because a12s data is shared to anyone exporting an affected blog. This change tries to remove the information when generating the export file, but is somewhat hampered by a lack of relevent hooks and also by a way of identifying current and former a12s. Instead it chooses a relevant SQL generation, hackily identifies whether the query is the relevant part of the export, and then modifies it to add some SQL excluding wp_navigation post types that were created by users that are not currently part of the blog. See also https://linear.app/a8c/issue/DOTCOM-13020
Rather than using a subquery, rely on get_users to get the 'allowed' users.
79df023 to
bd2971b
Compare
To alleviate that concern, would it be possible to change the author of the |
That's a great suggestion, seems better than just omitting it. We discussed the more focused targetting previously, the concern there is that we don't enforce that a12 wpcom accounts use an automattic email. Including the personal email is arguably less good than including an a8c email. |
To mitigate your concern, could we continue to export the Oh, sorry, I didn't continue reading the discussion haha. Miguel basically suggested the same thing. |
| \Automattic\Jetpack\Jetpack_Mu_Wpcom\Holiday_Snow::init(); | ||
|
|
||
| // Initialize WPCOM Navigation Export Filter | ||
| if ( class_exists( '\Automattic\Jetpack\Jetpack_Mu_Wpcom\Wpcom_Navigation_Export_Filter\Export_Filter' ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class is defined in Jetpack; this caller is also in Jetpack. I'm failing to understand under what circumstances the class won't exist while having Jetpack loaded.
I also think that instantiating an object and not storing its reference indicates that something isn't right here. Calling a static method also isn't the best solution since static methods aren't mockable. Maybe we could change the __construct to an init method or something similar? Since most of this file uses static ::init, I would also consider it valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class is defined in Jetpack; this caller is also in Jetpack. I'm failing to understand under what circumstances the class won't exist while having Jetpack loaded.
I'm not a PHP ninja, but if the require_once on line 287 succeeds, I don't see how this could fail.
...ackages/jetpack-mu-wpcom/src/features/wpcom-navigation-export-filter/class-export-filter.php
Show resolved
Hide resolved
| * | ||
| * @return void | ||
| */ | ||
| public function stop_export_filtering(): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need the is_exporting state. From what I can see, the code that runs in prod (not test) never changes this state after initialization, and the state is always going to be is_exporting = true. Am I missing something here? Aren't we always running the transformation for all queries once the class is instantiated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely don't need the class var. We don't really need the method I guess. It kinda feels we should have it for symmetry, and in case we start exporting elsewhere but yagni, i'll remove it.
| AND {$wpdb->posts}.post_author > 0 | ||
| AND {$wpdb->posts}.post_author NOT IN ({$current_users}) | ||
| )"; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This here is a personal preference, so feel free to ignore this comment completely if you want.
I think it's a bit simpler to follow if we merge both conditions, something like this resonates with me:
// Fetch current users (IDs)
$users = get_users( [ 'fields' => 'ID' ] );
// Build a comma-list of ints (or an empty string if no users)
$current_users = $users ? implode( ',', array_map( 'intval', $users ) ) : '';
// Now one assignment, tacking on the “NOT IN” bit only when $current_users isn’t empty
$additional_where = "
AND NOT (
{$wpdb->posts}.post_type = 'wp_navigation'
AND {$wpdb->posts}.post_author > 0" .
( $current_users ? " AND {$wpdb->posts}.post_author NOT IN ({$current_users_list})" : '' ) . "
)";
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested the changes and they work as described. If we change to including the removed posts under an existing user, I'll re-review the changes.
A constraint here is that there are no good places to hook, so the modifications have to be in SQL land. The SQL query to do that is more complicated than the current approach, so I'm more worried about its fragility. I'll finish it off and share it on this PR in a while. |
Can we do that outside of the export action? It seems to me that the DB is polluted to some extent due to a bug, and some posts have a wrong author. So rather than waiting for the export to fix that, can we run a script to change the author or do it during a normal hook like Something like the logic that exempts some sites from paying for GS styles if they used it before becoming a paid feature:
|
I remember that originally we discussed and discarded the idea of running a script over "every" site to fix the db, though I can't remember specifically why. Depolluting the DB seems like a fine idea though. We could do it on the |
alshakero
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just drive by comments.
| require_once __DIR__ . '/features/wpcom-post-list/wpcom-post-types-tracking.php'; | ||
| require_once __DIR__ . '/features/wpcom-widgets/wpcom-widgets.php'; | ||
| require_once __DIR__ . '/features/wpcom-wpadmin-page-view/wpcom-wpadmin-page-view.php'; | ||
| require_once __DIR__ . '/features/wpcom-navigation-export-filter/class-export-filter.php'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may create a few hundred fatals during the first deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not with a jetpack sun/moon deployment it shouldn't
avoid mid-deploy fatals due to files being removed before in-use files referencing them are updated, but sun/moon achieves the same end by only updating code when it’s not currently in use in the first place.
PCYsg-Jjm-p2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah good point!
| \Automattic\Jetpack\Jetpack_Mu_Wpcom\Holiday_Snow::init(); | ||
|
|
||
| // Initialize WPCOM Navigation Export Filter | ||
| if ( class_exists( '\Automattic\Jetpack\Jetpack_Mu_Wpcom\Wpcom_Navigation_Export_Filter\Export_Filter' ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class is defined in Jetpack; this caller is also in Jetpack. I'm failing to understand under what circumstances the class won't exist while having Jetpack loaded.
I'm not a PHP ninja, but if the require_once on line 287 succeeds, I don't see how this could fail.
| /** | ||
| * Test that export filter hooks are properly registered. | ||
| */ | ||
| public function test_export_filter_hooks_are_registered(): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it reasonable to include wp-admin/includes/export.php as a text file and check if $post_ids = $wpdb->get_col( "SELECT ID FROM {$wpdb->posts} $join WHERE $where" ); string occurs in it?
We have no guarantee this query will remain the same and we'll never know if Core changes it.
|
Marking as abandoned for now - I'll investigate an alternative solution. |
Fixes https://linear.app/a8c/issue/DOTCOM-13020
Proposed changes:
Warning
This code is overly broad and a hack. Maybe that's ok. 😞
In the past a fallback navigation was created when one didn't exist. This used to be created in
render_block_core_navigation(see WordPress/gutenberg#47684) but no longer happens.This has lead to a number of blogs having this fallback navigation created when an a12 visited their site. This means that the a12s details including - wpcom username, email address, etc are in the generated export file.
It feels unhelpful to have this in the export file, partly because it might need explaining if the legal team share an export as part of their operating procedures and fail to remove the non-material data, and partly because a12s data is shared to anyone exporting an affected blog.
This change tries to remove the information when generating the export file, but is somewhat hampered by a lack of relevant hooks and also by a way of identifying current and former a12s. Instead it chooses a relevant SQL generation, hackily identifies whether the query is the relevant part of the export, and then modifies it to add some SQL excluding wp_navigation post types that were created by users that are not currently part of the blog.
Other information:
Jetpack product discussion
Does this pull request change what data or activity we track or use?
Testing instructions:
You should see that the wp_navigation CPT (and it's author) don't show up in the export, while the About page and it's author do.