feat(query): Use match_all for subquery matching #1356

Merged
orangejulius merged 1 commit into master from no-subquery-for-population-and-popularity
Oct 3, 2019

@orangejulius
Member

This PR brings the search_pelias_parser queries into alignment with autocomplete and search queries by using the match_all query, instead of a phrase query, for use with function scoring to apply boosts for popularity and population.

While in theory this could result in longer query times because the population and popularity scoring will be applied to more documents, it's unlikely we'll notice that.

However, what we would notice is if there's a highly populated or popular record that does not match the phrase subquery, but does match the non-phrase query.

All that said, there are no changes to the acceptance tests with this change, so we can mostly consider it a pure refactoring that reduces complexity and technical debt.
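
For illustration, the difference between the two subquery styles can be sketched as an Elasticsearch query body. This is a minimal sketch and not the actual Pelias query: the field names (`name.default`, `phrase.default`, `popularity`, `population`), the weights, the `log1p` modifier, and `max_boost` are all assumptions chosen for the example. Only the `match_all`, `match_phrase`, `bool`, and `function_score` constructs are standard Elasticsearch query DSL.

```python
import json


def boost_function(subquery, field, weight):
    """Wrap `subquery` in a function_score clause that scores by a numeric field.

    Only documents matching `subquery` receive the boost.
    Field names and numeric settings are hypothetical.
    """
    return {
        "function_score": {
            "query": subquery,
            "functions": [
                {
                    "field_value_factor": {
                        "field": field,
                        "modifier": "log1p",
                        "missing": 1,
                    },
                    "weight": weight,
                }
            ],
            "score_mode": "first",
            "boost_mode": "replace",
            "max_boost": 20,
        }
    }


def build_query(text, use_match_all=True):
    """Build a search body; `use_match_all` toggles the subquery style."""
    # Main clause that decides which documents match at all (assumed field).
    main = {"match": {"name.default": {"query": text}}}

    # Old style: boosts apply only to documents that also match the phrase.
    phrase = {"match_phrase": {"phrase.default": {"query": text}}}

    # New style: boosts apply to every document that passes the main clause.
    subquery = {"match_all": {}} if use_match_all else phrase

    return {
        "query": {
            "bool": {
                "must": [main],
                "should": [
                    boost_function(subquery, "popularity", 1),
                    boost_function(subquery, "population", 2),
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("paris"), indent=2))
```

With the phrase subquery, a populous record that matches only the non-phrase clause gets no population boost. With `match_all`, every record that passes the `must` clause is eligible, which is why scoring touches more documents.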

Member

@missinglink missinglink left a comment

Nice, I like the simplicity of it.

I think this is a good candidate for a canary release against real-world data because it's more likely to have an effect on performance than result quality.

This PR brings the `search_pelias_parser` queries into alignment with
autocomplete and `search` queries by using the `match_all` query,
instead of a phrase query, for use with function scoring to apply boosts
for popularity and population.

While in theory this could result in longer query times because the
population and popularity scoring will be applied to more documents, it's
unlikely we'll notice that.

However, what we _would_ notice is if there's a highly populated or
popular record that does not match the phrase subquery, but does match
the non-phrase query.
@orangejulius orangejulius force-pushed the no-subquery-for-population-and-popularity branch from dbc6f34 to 7c2e98e October 1, 2019 18:38
@orangejulius
Member Author

After about two days of split testing, it looks like the performance changes due to this PR are minimal. There is a slight change to the average latency, but it's probably within the margin of error, even for a large number of queries.

Screenshot_2019-10-03_09-27-44

@orangejulius orangejulius merged commit f0a630a into master Oct 3, 2019
@orangejulius orangejulius deleted the no-subquery-for-population-and-popularity branch October 3, 2019 18:35