Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async indexing #98

Draft
wants to merge 21 commits into
base: release/0.6
Choose a base branch
from
Draft

Async indexing #98

wants to merge 21 commits into from

Conversation

mboynes
Copy link
Contributor

@mboynes mboynes commented Mar 8, 2018

This PR replaces synchronous indexing with async indexing. It is a breaking change.

Closes #39, Refs #38, #42, #43

Update 2/2/2024

This is a stable branch in good shape. It was paused due to the Gutenberg issue. I lost touch with that issue, but it's been a long time and perhaps there's a workaround at this point.

I have another note about using DISTINCT post_id that should be verified and applied.

Otherwise, this should get a fresh set of eyes and let's ship it as isolated as possible in its own release (ideally 0.6).

# Conflicts:
#	lib/class-sp-post.php
#	lib/class-sp-sync-manager.php
# Conflicts:
#	lib/class-sp-post.php
#	lib/class-sp-sync-manager.php
@kjbenk
Copy link
Contributor

kjbenk commented Mar 13, 2018

phpcs errors are being fixed in this PR #99 Once that is merged in we can update this PR

@mboynes
Copy link
Contributor Author

mboynes commented Jun 14, 2019

This is affected, and at least somewhat blocked, by WordPress/gutenberg#12903. If this PR is merged as-is, there will potentially be a very small window for a race condition where,

  1. Gutenberg makes first save request, triggering save_post
  2. SearchPress sets the sync flag on the post and schedules the async job
  3. Cron fires, starting the async job (not all post data has been saved at this point)
  4. Gutenberg makes its second save request with post meta
  5. SearchPress sets the sync flag on the post again, though it's already there, and schedules the cron job
  6. Cron job in (3) completes and removes the sync flag from the post. Post in ES is now missing data.

There are ways we could work around this, e.g. by setting the value on the sync flag to 1 or 2 depending on if the Gutenberg request is a first save or a second, then when removing the flag, use the 3rd param in delete_post_meta to ensure that a second request would be treated separately if it came to it. Pretty hacky, but I think it would work.

# Conflicts:
#	.travis.yml
#	lib/class-sp-post.php
#	lib/class-sp-sync-manager.php
#	tests/test-api.php
#	tests/test-faceting.php
#	tests/test-general.php
#	tests/test-indexing.php
#	tests/test-integration.php
#	tests/test-mapping-postmeta.php
#	tests/test-mapping.php
#	tests/test-searching.php
}
$to_delete[] = absint( $post_id );

update_option( 'sp_delete', array_unique( $to_delete ), 'no' );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential overwrite race condition here. Need to figure out a better way to store this data if we're to proceed with this route. Maybe deletes should be synchronous, though?

@srtfisher srtfisher changed the base branch from master to release/0.5 December 1, 2020 18:53
@srtfisher
Copy link
Member

Added in sp_should_index_async to allow indexing to be opted out of.

$sync_meta = SP_Sync_Meta();

// phpcs:disable WordPress.DB.DirectDatabaseQuery
$post_ids = $wpdb->get_col( "SELECT SQL_CALC_FOUND_ROWS `post_id` FROM {$wpdb->postmeta} WHERE `meta_key`='_sp_index' LIMIT 500" ); // WPCS: cache ok.
Copy link
Contributor Author

@mboynes mboynes Dec 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be DISTINCT post_id in retrospect.

mslinnea and others added 2 commits December 16, 2020 09:55
@mboynes mboynes deleted the branch release/0.6 February 2, 2024 21:29
@mboynes mboynes closed this Feb 2, 2024
@mboynes mboynes reopened this Feb 2, 2024
@mboynes mboynes changed the base branch from release/0.5 to release/0.6 February 2, 2024 21:41
@mboynes mboynes marked this pull request as draft February 2, 2024 21:42
# Conflicts:
#	lib/class-sp-sync-manager.php
#	lib/class-sp-sync-meta.php
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants