Skip to content

Conversation

@ramonjd
Copy link
Member

@ramonjd ramonjd commented Oct 15, 2025

@ramonjd ramonjd force-pushed the try/add-id-for-deterministic-ordering-to-prevent-duplicate-records branch from 8f83b7e to 85339ff Compare October 15, 2025 02:30
@ramonjd ramonjd self-assigned this Oct 15, 2025
@ramonjd ramonjd changed the title Modifies the resolve_pattern_blocks function to include metadata fo… WP_Query: force deterministic ordering Oct 15, 2025
$orderby = '';
} else {
$orderby_array = array();
/*
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterwilsoncc When you get a spare moment (can be after 6.9 or whenever you have head space) could you sanity check this approach for me?

The TL;DR is:

When multiple posts have identical values for the primary sort field (like post_date, post_title, menu_order), the database doesn't guarantee consistent ordering across pagination.

This causes inconsistent pagination results, mainly in the form of dupes.

The solution here (and in all the other attempts from 6 years ago) has been to automatically add ID as a secondary sort field when ordering by fields that can have duplicate values. This ensures records with identical primary sort values always appear in the same order.

@github-actions
Copy link

github-actions bot commented Oct 15, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props ramonopoly, peterwilsoncc, azaozz.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions
Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

Comment on lines +3314 to +3324
$new_request = $this->request;
// Split SQL into parts.
$parts = explode( 'ORDER BY', $new_request );
if ( count( $parts ) === 2 ) {
// Replace only in the SELECT part, preserve ORDER BY.
$select_part = str_replace( $fields, "{$wpdb->posts}.*", $parts[0] );
$new_request = $select_part . 'ORDER BY' . $parts[1];
} else {
// No ORDER BY clause, safe to replace.
$new_request = str_replace( $fields, "{$wpdb->posts}.*", $new_request );
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current str_replace is too greedy in that it will replace strings in the ORDER BY clause as well.

@ramonjd ramonjd force-pushed the try/add-id-for-deterministic-ordering-to-prevent-duplicate-records branch from 9c77315 to 03ff908 Compare October 22, 2025 06:30
* When multiple posts have the same value for a field, add ID as secondary sort to guarantee consistent ordering.
* Note: this is to circumvent a bug that is currently being tracked in https://core.trac.wordpress.org/ticket/44349.
*/
$fields_requiring_deterministic_orderby = array(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta data (both values and keys) can be identical too. That will be a little messy because orderby also accepts the array keys in $meta_query.

The tickets are very long, so I'm wondering if it was considered to add the ID to the ordering if it wasn't included in the parameter already.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I'm wondering if it was considered to add the ID to the ordering if it wasn't included in the parameter already.

Thanks, do you mean just include ID by default?

The whitelist is a bit janky and will add maintenance burden. I just did it as a very naive, first version to get a discussion going.

A previous, much older patch used a black list 'ID', 'rand', 'relevance'.

Not sure it would address the maintenance burden side of things, but it'd be more permissive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it would address the maintenance burden side of things

Yea, blacklist is better imho too. It will address the maintenance burden better considering that new "columns" are added to the query once per several blue moons :)

…D as a secondary sort field. This change addresses potential duplicate records across pages when multiple posts share the same value for a field. A list of fields requiring deterministic ordering has been introduced to improve query consistency
…nding ID as a secondary sort field. Update related unit tests to reflect changes in expected SQL output for various orderby scenarios
…and ensure ID is consistently used as a tie-breaker. Update unit tests to reflect changes in expected SQL output for various orderby cases.
…secondary sort field in various scenarios. Update unit tests to reflect changes in expected SQL output for orderby cases, enhancing determinism in query results.
…or consistent cache key generation. This change ensures deterministic results when 'date' is specified as the orderby value, improving query consistency.
@ramonjd ramonjd force-pushed the try/add-id-for-deterministic-ordering-to-prevent-duplicate-records branch from a204b0d to d7b7c74 Compare December 5, 2025 01:26
…nt post creation. Updated post_date to use str_pad for zero-padding single-digit days.
@ramonjd
Copy link
Member Author

ramonjd commented Dec 5, 2025

I'm going to try to revive this PR with the aim to get it in for 7.0. I'll respond to @peterwilsoncc's feedback first.

I'm a bit short on time so if any folks have time to test this approach or help with alternatives, that'd be awesome!

…post creation. Introduced separate arrays for posts with identical dates, titles, and menu orders to enhance test clarity and maintainability. Updated queries to reference these shared fixtures, ensuring consistent results across pagination and ordering scenarios.
…r deterministic ordering. Updated comments for clarity and introduced a new test for metadata ordering to ensure no duplicates across paginated results.
Introduced new tests to verify deterministic ordering when posts are ordered by search relevance. Created shared fixtures for posts with identical content to ensure consistent relevance scores, preventing duplicates across paginated results. Updated the test suite to include scenarios for both explicit and empty orderby parameters, ensuring robust coverage of search-related ordering behavior.
…Included 'post__in', 'post_name__in', 'post_parent__in', and 'include' to improve query flexibility and support additional ordering scenarios.
// See get_pages(): when sort_column is 'none', the get_pages() function should not generate any ORDER BY clause.
// Should it rather be handled in the get_pages() function?
// src/wp-includes/post.php L6496
} elseif ( 'none' === $query_vars['orderby'] || ( is_array( $query_vars['orderby'] ) && array_key_exists( 'none', $query_vars['orderby'] ) ) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick :) Thought I saw somewhere that isset( array['key'] ) is tiny bit faster/preferable to is_array() followed by array_key_exists(). Maybe the ( is_array( $query_vars['orderby'] ) && array_key_exists( 'none', $query_vars['orderby'] ) ) could be isset( $query_vars['orderby']['none'] )?

sort( $args['post_status'] );
}

// Add a default orderby value of date to ensure same cache key generation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nitpick :) This comment still seems okay imho. No need to delete it, just edit it?

@ramonjd
Copy link
Member Author

ramonjd commented Dec 11, 2025

I was mulling over this today, and given the experience in https://make.wordpress.org/core/2025/12/10/adjacent-post-navigation-changes-in-wordpress-6-9-and-compatibility-issues/, I was wondering if the scope here could be narrowed further to only paginated queries.

I'm not sure there's a remedy for filters receiving different values with this PR.

Filter Risk Level Affected Queries Common Use Cases
posts_orderby HIGH All paginated queries General ORDER BY modifications
posts_clauses HIGH All paginated queries Complex query modifications
posts_orderby_request MEDIUM All paginated queries Caching plugins
posts_clauses_request MEDIUM All paginated queries Caching plugins
posts_search_orderby LOW Search queries only Search relevance ordering

Possibly deterministic ordering could be opt-in to begin with to give folks time to update, or just make it opt-out from the beginning.

@peterwilsoncc
Copy link
Contributor

I was wondering if the scope here could be narrowed further to only paginated queries.

I don't think that's possible as we can't know if a query will be paginated before making the query (eg the blogs home page is likely paginated but doesn't have a page number/offset).

There's quite a few plugins with large numbers of installs using those filters, so yeah, let's take a look at these https://wpdirectory.net/search/01KC7TKAN2FGBHSK1NC321HF63

Now there is native caching in WP Core for WP_Query, I think caching plugins should be fine but we'll need to validate it. Some hosts may still be using their own.

@ramonjd
Copy link
Member Author

ramonjd commented Dec 11, 2025

Thanks, Peter.

Another thought bubble I had was pursuing a fix indirectly. Rather than forcing deterministic ordering, make it an option to sort by ID AND whatever else you want.

Wondering if this trac ticket is relevant.

Edit:

an option to sort by ID AND whatever else you want

LOL you can do that anyway. Me dumb.

And that wasn't the trac ticket I was looking for. I think there's one flying about. Maybe this one Should REST API support multiple orderby values?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants