-
Notifications
You must be signed in to change notification settings - Fork 3.2k
WP_Query: force deterministic ordering #10262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
WP_Query: force deterministic ordering #10262
Conversation
8f83b7e to
85339ff
Compare
| $orderby = ''; | ||
| } else { | ||
| $orderby_array = array(); | ||
| /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peterwilsoncc When you get a spare moment (can be after 6.9 or whenever you have head space) could you sanity check this approach for me?
The TL;DR is:
When multiple posts have identical values for the primary sort field (like post_date, post_title, menu_order), the database doesn't guarantee consistent ordering across pagination.
This causes inconsistent pagination results, mainly in the form of dupes.
The solution here (and in all the other attempts from 6 years ago) has been to automatically add ID as a secondary sort field when ordering by fields that can have duplicate values. This ensures records with identical primary sort values always appear in the same order.
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
| $new_request = $this->request; | ||
| // Split SQL into parts. | ||
| $parts = explode( 'ORDER BY', $new_request ); | ||
| if ( count( $parts ) === 2 ) { | ||
| // Replace only in the SELECT part, preserve ORDER BY. | ||
| $select_part = str_replace( $fields, "{$wpdb->posts}.*", $parts[0] ); | ||
| $new_request = $select_part . 'ORDER BY' . $parts[1]; | ||
| } else { | ||
| // No ORDER BY clause, safe to replace. | ||
| $new_request = str_replace( $fields, "{$wpdb->posts}.*", $new_request ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current str_replace is too greedy in that it will replace strings in the ORDER BY clause as well.
9c77315 to
03ff908
Compare
src/wp-includes/class-wp-query.php
Outdated
| * When multiple posts have the same value for a field, add ID as secondary sort to guarantee consistent ordering. | ||
| * Note: this is to circumvent a bug that is currently being tracked in https://core.trac.wordpress.org/ticket/44349. | ||
| */ | ||
| $fields_requiring_deterministic_orderby = array( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta data (both values and keys) can be identical too. That will be a little messy because orderby also accepts the array keys in $meta_query.
The tickets are very long, so I'm wondering if it was considered to add the ID to the ordering if it wasn't included in the parameter already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I'm wondering if it was considered to add the ID to the ordering if it wasn't included in the parameter already.
Thanks, do you mean just include ID by default?
The whitelist is a bit janky and will add maintenance burden. I just did it as a very naive, first version to get a discussion going.
A previous, much older patch used a black list 'ID', 'rand', 'relevance'.
Not sure it would address the maintenance burden side of things, but it'd be more permissive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it would address the maintenance burden side of things
Yea, blacklist is better imho too. It will address the maintenance burden better considering that new "columns" are added to the query once per several blue moons :)
…D as a secondary sort field. This change addresses potential duplicate records across pages when multiple posts share the same value for a field. A list of fields requiring deterministic ordering has been introduced to improve query consistency
…nding ID as a secondary sort field. Update related unit tests to reflect changes in expected SQL output for various orderby scenarios
…unit tests for orderby scenarios.
…and ensure ID is consistently used as a tie-breaker. Update unit tests to reflect changes in expected SQL output for various orderby cases.
…secondary sort field in various scenarios. Update unit tests to reflect changes in expected SQL output for orderby cases, enhancing determinism in query results.
…or consistent cache key generation. This change ensures deterministic results when 'date' is specified as the orderby value, improving query consistency.
a204b0d to
d7b7c74
Compare
…nt post creation. Updated post_date to use str_pad for zero-padding single-digit days.
|
I'm going to try to revive this PR with the aim to get it in for 7.0. I'll respond to @peterwilsoncc's feedback first. I'm a bit short on time so if any folks have time to test this approach or help with alternatives, that'd be awesome! |
…post creation. Introduced separate arrays for posts with identical dates, titles, and menu orders to enhance test clarity and maintainability. Updated queries to reference these shared fixtures, ensuring consistent results across pagination and ordering scenarios.
…r deterministic ordering. Updated comments for clarity and introduced a new test for metadata ordering to ensure no duplicates across paginated results.
Introduced new tests to verify deterministic ordering when posts are ordered by search relevance. Created shared fixtures for posts with identical content to ensure consistent relevance scores, preventing duplicates across paginated results. Updated the test suite to include scenarios for both explicit and empty orderby parameters, ensuring robust coverage of search-related ordering behavior.
…Included 'post__in', 'post_name__in', 'post_parent__in', and 'include' to improve query flexibility and support additional ordering scenarios.
| // See get_pages(): when sort_column is 'none', the get_pages() function should not generate any ORDER BY clause. | ||
| // Should it rather be handled in the get_pages() function? | ||
| // src/wp-includes/post.php L6496 | ||
| } elseif ( 'none' === $query_vars['orderby'] || ( is_array( $query_vars['orderby'] ) && array_key_exists( 'none', $query_vars['orderby'] ) ) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nitpick :) Thought I saw somewhere that isset( array['key'] ) is tiny bit faster/preferable to is_array() followed by array_key_exists(). Maybe the ( is_array( $query_vars['orderby'] ) && array_key_exists( 'none', $query_vars['orderby'] ) ) could be isset( $query_vars['orderby']['none'] )?
| sort( $args['post_status'] ); | ||
| } | ||
|
|
||
| // Add a default orderby value of date to ensure same cache key generation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another nitpick :) This comment still seems okay imho. No need to delete it, just edit it?
|
I was mulling over this today, and given the experience in https://make.wordpress.org/core/2025/12/10/adjacent-post-navigation-changes-in-wordpress-6-9-and-compatibility-issues/, I was wondering if the scope here could be narrowed further to only paginated queries. I'm not sure there's a remedy for filters receiving different values with this PR.
Possibly deterministic ordering could be opt-in to begin with to give folks time to update, or just make it opt-out from the beginning. |
I don't think that's possible as we can't know if a query will be paginated before making the query (eg the blogs home page is likely paginated but doesn't have a page number/offset). There's quite a few plugins with large numbers of installs using those filters, so yeah, let's take a look at these https://wpdirectory.net/search/01KC7TKAN2FGBHSK1NC321HF63 Now there is native caching in WP Core for WP_Query, I think caching plugins should be fine but we'll need to validate it. Some hosts may still be using their own. |
|
Thanks, Peter. Another thought bubble I had was pursuing a fix indirectly. Rather than forcing deterministic ordering, make it an option to sort by ID AND whatever else you want. Wondering if this trac ticket is relevant. Edit:
LOL you can do that anyway. Me dumb. And that wasn't the trac ticket I was looking for. I think there's one flying about. Maybe this one Should REST API support multiple orderby values? |
I'm testing out an approach to fix a long-standing issue:
https://core.trac.wordpress.org/ticket/47642
https://core.trac.wordpress.org/ticket/52907
https://core.trac.wordpress.org/ticket/64042
https://core.trac.wordpress.org/ticket/44349
https://core.trac.wordpress.org/ticket/46294
https://core.trac.wordpress.org/ticket/8107 (17 years! 🏅 )
Possibly related too:
https://core.trac.wordpress.org/ticket/46294
https://core.trac.wordpress.org/ticket/52626
Unit tests to come once I've validated the approach.
This change addresses potential duplicate records across pages when multiple posts share the same value for a field.
It updates WP_Query ordering to ensure deterministic results by adding ID as a secondary sort field.