Skip to content

Load FieldInfos from store if not yet initialised through a refresh on IndexShard#125650

Merged
original-brownbear merged 6 commits intoelastic:mainfrom
original-brownbear:125483
Mar 26, 2025
Merged

Load FieldInfos from store if not yet initialised through a refresh on IndexShard#125650
original-brownbear merged 6 commits intoelastic:mainfrom
original-brownbear:125483

Conversation

@original-brownbear
Copy link
Copy Markdown
Contributor

Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place but this was safe due to other locks/volatiles on the refresh path).

Closes #125483

If we don't yet have a cached value ready to go for the field infos,
lets load it from the directory.

Fixes #125383
@original-brownbear original-brownbear added >bug auto-backport Automatically create backport pull requests when merged :Search Foundations/Search Catch all for Search Foundations v9.0.0 v8.19.0 v9.1.0 labels Mar 26, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Mar 26, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @original-brownbear, I've created a changelog YAML for you.

Copy link
Copy Markdown
Member

@piergm piergm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @original-brownbear for working on this :D

try (Engine.Searcher hasValueSearcher = getEngine().acquireSearcher("field_has_value")) {
return FieldInfos.getMergedFieldInfos(hasValueSearcher.getIndexReader());
} catch (AlreadyClosedException ignore) {
// engine is closed - no update to3 FieldInfos is fine
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol right and in the process of fixing grammar/typo no less 🤣 thanks!

Copy link
Copy Markdown
Contributor

@andreidan andreidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too :) Thanks for fixing this Armin


static {
try {
FIELD_INFOS = MethodHandles.lookup().findVarHandle(IndexShard.class, "fieldInfos", FieldInfos.class);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this is a bit I don't like about VarHandle, and I'd normally prefer an Atomic... for the CAS operation for readability.

However, the tradeoff here is pretty sweet as we avoid creating (a potential alternative) one Atomic... instance for every IndexShard in order to wrap the fieldInfos giving us a sweet memory saving.

Nicely done ! :)

@original-brownbear
Copy link
Copy Markdown
Contributor Author

Thanks you two!

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💚 Backport successful

Status Branch Result
9.0
8.x

elasticsearchmachine pushed a commit that referenced this pull request Mar 26, 2025
…n IndexShard (#125650) (#125703)

Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place but this was safe due to other locks/volatiles on the refresh path).

Closes #125483
elasticsearchmachine pushed a commit that referenced this pull request Mar 26, 2025
…n IndexShard (#125650) (#125704)

Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place but this was safe due to other locks/volatiles on the refresh path).

Closes #125483
@javanna
Copy link
Copy Markdown
Contributor

javanna commented Mar 27, 2025

Thanks for fixing @original-brownbear I believe it makes sense to backport this to 8.18 as well.

@original-brownbear
Copy link
Copy Markdown
Contributor Author

incoming in #125762 :)

original-brownbear added a commit that referenced this pull request Mar 27, 2025
…n IndexShard (#125650) (#125762)

Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place but this was safe due to other locks/volatiles on the refresh path).

Closes #125483

// check that field_caps empty field filtering works as well
FieldCapabilitiesResponse response = client().prepareFieldCaps(mountedIndices).setFields("*").setincludeEmptyFields(false).get();
assertNotNull(response.getField("keyword"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to add the same test to the N-2 indices related tests for indices that are imported as verified read-only. We know the fix addresses the same issue for them too, but it'd be safer to have a specific test for them? These tests are under qa/lucene-index-compatibility .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ definitely. Let me try something, maybe I can get the full field caps suite running for searchable snapshots and there somehow, would be really nice to have coverage across all the implementations that play tricks on the engine :)

omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
…n IndexShard (elastic#125650)

Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place but this was safe due to other locks/volatiles on the refresh path).

Closes elastic#125483
tlrx added a commit that referenced this pull request Apr 15, 2026
…ndomIOExceptionsIT.testRandomDirectoryIOExceptions (#146240)

assertSearcherIsWarmedUp() asserts that the external reader has been warmed up before being accessed for searches, except for specific cases like gatherings segments stats or segments infos in which accessing a non-warmed up external reader are allowed. This assertion started to trip more frequently two weeks ago in the SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions, causing the test to be muted on the main branch.

The assertion fails when a random IO exception is thrown on purpose by the test during a refresh. Then the ExternalReaderManager is not warmed up and its isWarmedUp flag remains false. But Lucene's ReferenceManager always calls the afterRefresh() method on refresh listeners, and in the case of RefreshFieldHasValueListener it triggered loadFieldInfos() since fieldInfos was null. loadFieldInfos() than called acquireSearcher("field_has_value") before warm-up, failing the assertion.

In this change the condition fieldInfos == null is removed from the RefreshFieldHasValueListener.afterRefresh() method in order to avoid calling loadFieldInfos() on a failed refresh. It should be ok since getFieldInfos() already handles lazy loading when fieldInfos is null.

I also think we should relax the assertion in assertSearcherIsWarmedUp() to allow FIELD_HAS_VALUE_SOURCE, so that if something like the Field Capabilities API calls loadFieldInfos()/getFieldInfos() before warm-up the assertion won't fail. I would expect that accessing FieldInfos is safe before warm-up, similar to segment stats/infos.

Relates #118733

Note: the assertion also tripped for that same test soon after it was refactored in #125650, causing the same test to fail on 8.x branch (8.16, then 8.17 at the time) in a different manner documented here that I'll address in a follow up.
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Apr 15, 2026
…ndomIOExceptionsIT.testRandomDirectoryIOExceptions (elastic#146240)

assertSearcherIsWarmedUp() asserts that the external reader has been warmed up before being accessed for searches, except for specific cases like gatherings segments stats or segments infos in which accessing a non-warmed up external reader are allowed. This assertion started to trip more frequently two weeks ago in the SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions, causing the test to be muted on the main branch.

The assertion fails when a random IO exception is thrown on purpose by the test during a refresh. Then the ExternalReaderManager is not warmed up and its isWarmedUp flag remains false. But Lucene's ReferenceManager always calls the afterRefresh() method on refresh listeners, and in the case of RefreshFieldHasValueListener it triggered loadFieldInfos() since fieldInfos was null. loadFieldInfos() than called acquireSearcher("field_has_value") before warm-up, failing the assertion.

In this change the condition fieldInfos == null is removed from the RefreshFieldHasValueListener.afterRefresh() method in order to avoid calling loadFieldInfos() on a failed refresh. It should be ok since getFieldInfos() already handles lazy loading when fieldInfos is null.

I also think we should relax the assertion in assertSearcherIsWarmedUp() to allow FIELD_HAS_VALUE_SOURCE, so that if something like the Field Capabilities API calls loadFieldInfos()/getFieldInfos() before warm-up the assertion won't fail. I would expect that accessing FieldInfos is safe before warm-up, similar to segment stats/infos.

Relates elastic#118733

Note: the assertion also tripped for that same test soon after it was refactored in elastic#125650, causing the same test to fail on 8.x branch (8.16, then 8.17 at the time) in a different manner documented here that I'll address in a follow up.
# Conflicts:
#	muted-tests.yml
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Apr 15, 2026
…ndomIOExceptionsIT.testRandomDirectoryIOExceptions (elastic#146240)

Backport of elastic#146240 for 9.3

assertSearcherIsWarmedUp() asserts that the external reader has been warmed up before being accessed for searches, except for specific cases like gatherings segments stats or segments infos in which accessing a non-warmed up external reader are allowed. This assertion started to trip more frequently two weeks ago in the SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions, causing the test to be muted on the main branch.

The assertion fails when a random IO exception is thrown on purpose by the test during a refresh. Then the ExternalReaderManager is not warmed up and its isWarmedUp flag remains false. But Lucene's ReferenceManager always calls the afterRefresh() method on refresh listeners, and in the case of RefreshFieldHasValueListener it triggered loadFieldInfos() since fieldInfos was null. loadFieldInfos() than called acquireSearcher("field_has_value") before warm-up, failing the assertion.

In this change the condition fieldInfos == null is removed from the RefreshFieldHasValueListener.afterRefresh() method in order to avoid calling loadFieldInfos() on a failed refresh. It should be ok since getFieldInfos() already handles lazy loading when fieldInfos is null.

I also think we should relax the assertion in assertSearcherIsWarmedUp() to allow FIELD_HAS_VALUE_SOURCE, so that if something like the Field Capabilities API calls loadFieldInfos()/getFieldInfos() before warm-up the assertion won't fail. I would expect that accessing FieldInfos is safe before warm-up, similar to segment stats/infos.

Relates elastic#118733

Note: the assertion also tripped for that same test soon after it was refactored in elastic#125650, causing the same test to fail on 8.x branch (8.16, then 8.17 at the time) in a different manner documented here that I'll address in a follow up.

Also unmute elastic#114824
tlrx added a commit that referenced this pull request Apr 15, 2026
…ndomIOExceptionsIT.testRandomDirectoryIOExceptions (#146240) (#146344)

Backport of #146240 for 9.3

assertSearcherIsWarmedUp() asserts that the external reader has been warmed up before being accessed for searches, except for specific cases like gatherings segments stats or segments infos in which accessing a non-warmed up external reader are allowed. This assertion started to trip more frequently two weeks ago in the SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions, causing the test to be muted on the main branch.

The assertion fails when a random IO exception is thrown on purpose by the test during a refresh. Then the ExternalReaderManager is not warmed up and its isWarmedUp flag remains false. But Lucene's ReferenceManager always calls the afterRefresh() method on refresh listeners, and in the case of RefreshFieldHasValueListener it triggered loadFieldInfos() since fieldInfos was null. loadFieldInfos() than called acquireSearcher("field_has_value") before warm-up, failing the assertion.

In this change the condition fieldInfos == null is removed from the RefreshFieldHasValueListener.afterRefresh() method in order to avoid calling loadFieldInfos() on a failed refresh. It should be ok since getFieldInfos() already handles lazy loading when fieldInfos is null.

I also think we should relax the assertion in assertSearcherIsWarmedUp() to allow FIELD_HAS_VALUE_SOURCE, so that if something like the Field Capabilities API calls loadFieldInfos()/getFieldInfos() before warm-up the assertion won't fail. I would expect that accessing FieldInfos is safe before warm-up, similar to segment stats/infos.

Relates #118733

Note: the assertion also tripped for that same test soon after it was refactored in #125650, causing the same test to fail on 8.x branch (8.16, then 8.17 at the time) in a different manner documented here that I'll address in a follow up.

Also unmute #114824
tlrx added a commit that referenced this pull request Apr 15, 2026
…ndomIOExceptionsIT.testRandomDirectoryIOExceptions (#146240) (#146343)

Backport of #146240 for 9.4

assertSearcherIsWarmedUp() asserts that the external reader has been warmed up before being accessed for searches, except for specific cases like gatherings segments stats or segments infos in which accessing a non-warmed up external reader are allowed. This assertion started to trip more frequently two weeks ago in the SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions, causing the test to be muted on the main branch.

The assertion fails when a random IO exception is thrown on purpose by the test during a refresh. Then the ExternalReaderManager is not warmed up and its isWarmedUp flag remains false. But Lucene's ReferenceManager always calls the afterRefresh() method on refresh listeners, and in the case of RefreshFieldHasValueListener it triggered loadFieldInfos() since fieldInfos was null. loadFieldInfos() than called acquireSearcher("field_has_value") before warm-up, failing the assertion.

In this change the condition fieldInfos == null is removed from the RefreshFieldHasValueListener.afterRefresh() method in order to avoid calling loadFieldInfos() on a failed refresh. It should be ok since getFieldInfos() already handles lazy loading when fieldInfos is null.

I also think we should relax the assertion in assertSearcherIsWarmedUp() to allow FIELD_HAS_VALUE_SOURCE, so that if something like the Field Capabilities API calls loadFieldInfos()/getFieldInfos() before warm-up the assertion won't fail. I would expect that accessing FieldInfos is safe before warm-up, similar to segment stats/infos.

Relates #118733

Note: the assertion also tripped for that same test soon after it was refactored in #125650, causing the same test to fail on 8.x branch (8.16, then 8.17 at the time) in a different manner documented here that I'll address in a follow up.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged >bug :Search Foundations/Search Catch all for Search Foundations Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v8.18.1 v8.19.0 v9.0.0 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Field_caps] field_caps request don't return fields for indices with readonly_state when include_empty_fields=false

5 participants