Skip to content

fix: controller cache-sync readiness check#7430

Merged
jukie merged 3 commits intoenvoyproxy:mainfrom
jukie:readiness-check-tweak
Feb 1, 2026
Merged

fix: controller cache-sync readiness check#7430
jukie merged 3 commits intoenvoyproxy:mainfrom
jukie:readiness-check-tweak

Conversation

@jukie
Copy link
Copy Markdown
Contributor

@jukie jukie commented Nov 4, 2025

What this PR does / why we need it:

  • Adds a more enhanced readiness check vs a healthz ping so that caches are synced before marking controller pods as ready

Which issue(s) this PR fixes:

Fixes #

Release Notes: Yes

@codecov
Copy link
Copy Markdown

codecov bot commented Nov 4, 2025

Codecov Report

❌ Patch coverage is 12.50000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.69%. Comparing base (ad35276) to head (71d0de9).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
internal/provider/kubernetes/kubernetes.go 12.50% 5 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7430      +/-   ##
==========================================
- Coverage   73.73%   73.69%   -0.04%     
==========================================
  Files         240      240              
  Lines       36493    36500       +7     
==========================================
- Hits        26907    26898       -9     
- Misses       7684     7696      +12     
- Partials     1902     1906       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jukie jukie marked this pull request as ready for review November 4, 2025 17:38
@jukie jukie requested a review from a team as a code owner November 4, 2025 17:38
@jukie jukie requested a review from a team November 4, 2025 17:40
@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Nov 4, 2025

hey can we get more info from the users to confirm this is seen during EG restarts, and for a short duration, then this fix would make sense

@jukie jukie force-pushed the readiness-check-tweak branch from c282dc0 to 2b51baa Compare November 4, 2025 22:28
@jukie jukie changed the title fix: wait for cache sync before marking controller pods ready fix: controller leader-election bug and cache-sync readiness Nov 4, 2025
@jukie jukie force-pushed the readiness-check-tweak branch from 2b51baa to 242815e Compare November 4, 2025 22:41
@jukie jukie marked this pull request as draft November 13, 2025 17:02
@jukie jukie closed this Dec 7, 2025
@jukie jukie reopened this Jan 28, 2026
@arkodg arkodg added this to the v1.7.0 Release milestone Jan 30, 2026
@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Jan 30, 2026

hey @jukie can we scope the fix to just waiting until cache is ready

@jukie jukie force-pushed the readiness-check-tweak branch from 242815e to 364e578 Compare January 31, 2026 04:03
@netlify
Copy link
Copy Markdown

netlify bot commented Jan 31, 2026

Deploy Preview for cerulean-figolla-1f9435 ready!

Name Link
🔨 Latest commit 71d0de9
🔍 Latest deploy log https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/697e9dd74f60c700084a4cd8
😎 Deploy Preview https://deploy-preview-7430--cerulean-figolla-1f9435.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@jukie jukie changed the title fix: controller leader-election bug and cache-sync readiness fix: controller cache-sync readiness check Jan 31, 2026
@jukie jukie marked this pull request as ready for review January 31, 2026 04:07
@jukie
Copy link
Copy Markdown
Contributor Author

jukie commented Jan 31, 2026

@arkodg I've updated it

arkodg
arkodg previously approved these changes Jan 31, 2026
Copy link
Copy Markdown
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks !

@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Jan 31, 2026

seeing linter complain

Error: ../../../internal/provider/kubernetes/kubernetes.go:75:1: File is not properly formatted (gofumpt)

@arkodg arkodg requested review from a team January 31, 2026 21:41
zirain
zirain previously approved these changes Jan 31, 2026
Signed-off-by: jukie <[email protected]>
@jukie jukie dismissed stale reviews from zirain and arkodg via 1680da8 February 1, 2026 00:26
@jukie jukie force-pushed the readiness-check-tweak branch from 07995ad to 1680da8 Compare February 1, 2026 00:26
@jukie jukie requested a review from arkodg February 1, 2026 00:30
@jukie jukie requested a review from zirain February 1, 2026 00:30
@zirain
Copy link
Copy Markdown
Member

zirain commented Feb 1, 2026

@arkodg do we need to backport this?

@jukie jukie merged commit a5ade32 into envoyproxy:main Feb 1, 2026
35 of 36 checks passed
@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Feb 1, 2026

@arkodg do we need to backport this?

yeah added labels

@zirain
Copy link
Copy Markdown
Member

zirain commented Feb 1, 2026

@arkodg do we need to backport this?

yeah added labels

let me add it.

cnvergence pushed a commit to cnvergence/gateway that referenced this pull request Feb 3, 2026
cnvergence added a commit that referenced this pull request Feb 3, 2026
* e2e: speed tracing tests (#8124)

* e2e: speed tracing tests

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix(translator): allow single-label backends in host mode (#8123)

Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* ci: release json report (#8107)

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix oidc flakiness (#8119)

* fix oidc flakiness

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: skip_test_workflow doesn't exist (#8116)

This also uses grouped redirects to satisfy shellcheck SC2129.

Signed-off-by: Dylan M. Taylor <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix e2e test panic (#8109)

fix e2e test

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* chore: bump func-e to v1.4.0 (#8105)

bump func-e to v1.4.0

Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: route idle timeout (#8058)

* fix: route idle timeout

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* address comments

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* add test

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

---------

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* docs: add Mirakl to adopters list (#8138)

Signed-off-by: Thierry Wandja <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* docs: add security warning to control plane extensions (#7967)

chore(docs): add warnings about control plane extensions

Signed-off-by: Guy Daich <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* chore: add lint for release notes filenames (#8137)

* chore: add lint for release notes filenames

Signed-off-by: zirain <[email protected]>

* remove 1.7.0

Signed-off-by: zirain <[email protected]>

* fix lint

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: remove global logger in message package (#8131)

* fix: remove global logger in message package

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* docs: fix url result of regex rewrite (#7864)

* Update http-urlrewrite.md

Signed-off-by: Sadmi Bouhafs <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* chore: log skipped xds (#8132)

log skipped xds

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* docs: fixes for OPA sidecar + Unix Domain Socket task (#8142)

Signed-off-by: Matt Miller <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: basic auth validation (#8053)

* fix basic auth validation

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: controller cache-sync readiness check (#7430)

Signed-off-by: Karol Szwaj <[email protected]>

* fix: replace context.TODO with timeout context in config dump (#8122)

* fix: replace context.TODO with timeout context in config dump

Uses context.WithTimeout instead of context.TODO() to enable
proper cancellation and prevent indefinite hangs when Kubernetes
API is slow or unavailable.

Fixes #8121

Signed-off-by: jaffar <[email protected]>

* Make config dump timeout configurable with 30s default

- Add Timeout field to ConfigDump struct
- Add DefaultConfigDumpTimeout constant (30s)
- Add getTimeout() helper that returns configured timeout or default
- Update Collect() to use cd.getTimeout() instead of hardcoded value

Signed-off-by: jaffar <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* refactor: convert IR map fields to slices to ensure deterministic Dee… (#7953)

* refactor: convert IR map fields to slices to ensure deterministic DeepEqual

Addresses issue #7852.

Signed-off-by: Junnygram <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix links in releasing and develop docs (#8141)

* fix links in releasing and develop docs

Signed-off-by: Karol Szwaj <[email protected]>

* update quickstart link

Signed-off-by: Karol Szwaj <[email protected]>

---------

Signed-off-by: Karol Szwaj <[email protected]>

* docs: add provider guide for entra (#7977)

* docs: add provider guide for entra

Signed-off-by: Oliver Bähler <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* chore: clean up test output files (#8154)

clean up test output files

Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* fix: TCPRoute mTLS didn't work (#8152)

* fix: remove auto HTTP config on TCP cluster

Signed-off-by: zirain <[email protected]>

* fix lint

Signed-off-by: zirain <[email protected]>

* add e2e

Signed-off-by: zirain <[email protected]>

* fix e2e

Signed-off-by: zirain <[email protected]>

* fix comment

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix resource name

Signed-off-by: zirain <[email protected]>

* address Arko's comment

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>

* v1.7.0-rc2 release notes (#8163)

* v1.7.0-rc2 release notes

Signed-off-by: Karol Szwaj <[email protected]>

* fix the date

Signed-off-by: Karol Szwaj <[email protected]>

---------

Signed-off-by: Karol Szwaj <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: Karol Szwaj <[email protected]>
Signed-off-by: Adrian Cole <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Signed-off-by: Dylan M. Taylor <[email protected]>
Signed-off-by: Thierry Wandja <[email protected]>
Signed-off-by: Guy Daich <[email protected]>
Signed-off-by: Sadmi Bouhafs <[email protected]>
Signed-off-by: Matt Miller <[email protected]>
Signed-off-by: jaffar <[email protected]>
Signed-off-by: Junnygram <[email protected]>
Signed-off-by: Oliver Bähler <[email protected]>
Co-authored-by: zirain <[email protected]>
Co-authored-by: Adrian Cole <[email protected]>
Co-authored-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: Dylan M. Taylor <[email protected]>
Co-authored-by: Thierry Wandja <[email protected]>
Co-authored-by: Guy Daich <[email protected]>
Co-authored-by: Sadmi Bouhafs <[email protected]>
Co-authored-by: Matt Miller <[email protected]>
Co-authored-by: Isaac Wilson <[email protected]>
Co-authored-by: jaffar keikei <[email protected]>
Co-authored-by: Olaleye <[email protected]>
Co-authored-by: Oliver Bähler <[email protected]>
zirain pushed a commit to zirain/gateway that referenced this pull request Feb 10, 2026
zirain pushed a commit to zirain/gateway that referenced this pull request Feb 10, 2026
zirain pushed a commit to zirain/gateway that referenced this pull request Feb 10, 2026
zirain pushed a commit to zirain/gateway that referenced this pull request Feb 10, 2026
zirain added a commit that referenced this pull request Feb 11, 2026
* feat: Ignore ready and stats listener metrics in shutdown manager calculation (#7985)

* feat: Ignore ready and stats listener metrics in shutdown manager calculation

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* refactor

Signed-off-by: zirain <[email protected]>

* remove USE_SERVER_CONNECTIONS

Signed-off-by: zirain <[email protected]>

* address review comment

Signed-off-by: zirain <[email protected]>

* display the real value

Signed-off-by: zirain <[email protected]>

* comment for worker thread

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>

* fix: custom response should be put at the first of the filter chain (#8061)

* fix: custom response should be put before oauth2

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* move the custom response filter to first

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* add release note

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

---------

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* fix: route idle timeout (#8058)

* fix: route idle timeout

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* address comments

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* add test

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

---------

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* fix: remove global logger in message package (#8131)

* fix: remove global logger in message package

Signed-off-by: zirain <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

* fix: controller cache-sync readiness check (#7430)

Signed-off-by: zirain <[email protected]>

* release notes for v1.5.9 (#8219)

* release notes for v1.5.9

Signed-off-by: zirain <[email protected]>

* update

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>

* update VERSION

Signed-off-by: zirain <[email protected]>

* update release notes

Signed-off-by: zirain <[email protected]>

* update

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: Isaac Wilson <[email protected]>
zirain added a commit that referenced this pull request Feb 11, 2026
* fix(status): align BackendTLSPolicy ResolvedRefs reason with Gateway API (#7793)

* fix(status): align BackendTLSPolicy ResolvedRefs reason with Gateway API

Signed-off-by: Aditya7880900936 <[email protected]>

* fix(gatewayapi): use accurate error for missing CA bundle in BackendTLSPolicy

Signed-off-by: Aditya7880900936 <[email protected]>

* gatewayapi: fix BackendTLSPolicy status reasons for invalid CA refs

Signed-off-by: Aditya7880900936 <[email protected]>

* Update internal/gatewayapi/backendtlspolicy.go

Co-authored-by: Arko Dasgupta <[email protected]>
Signed-off-by: Aditya Sanskar Srivastav <[email protected]>

* gatewayapi: align BackendTLSPolicy invalid CA status and formatting

Signed-off-by: Aditya7880900936 <[email protected]>

* gatewayapi: align BackendTLSPolicy invalid CA error message with validation output

Signed-off-by: Aditya7880900936 <[email protected]>

* testdata: regenerate BackendTLSPolicy invalid CA output

Signed-off-by: Aditya7880900936 <[email protected]>

* fix(gatewayapi): keep Accepted reason as NoValidCACertificate for invalid CA ref kind

Signed-off-by: Aditya7880900936 <[email protected]>

* chore(gatewayapi): fix import grouping in BackendTLSPolicy

Signed-off-by: Aditya7880900936 <[email protected]>

---------

Signed-off-by: Aditya7880900936 <[email protected]>
Signed-off-by: Aditya Sanskar Srivastav <[email protected]>
Co-authored-by: Arko Dasgupta <[email protected]>

* feat: Ignore ready and stats listener metrics in shutdown manager calculation (#7985)

* feat: Ignore ready and stats listener metrics in shutdown manager calculation

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* refactor

Signed-off-by: zirain <[email protected]>

* remove USE_SERVER_CONNECTIONS

Signed-off-by: zirain <[email protected]>

* address review comment

Signed-off-by: zirain <[email protected]>

* display the real value

Signed-off-by: zirain <[email protected]>

* comment for worker thread

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>

* fix: custom response should be put at the first of the filter chain (#8061)

* fix: custom response should be put before oauth2

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* move the custom response filter to first

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* add release note

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

---------

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* fix: route idle timeout (#8058)

* fix: route idle timeout

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* address comments

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* add test

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

---------

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* fix: remove global logger in message package (#8131)

* fix: remove global logger in message package

Signed-off-by: zirain <[email protected]>

* fix: TCPRoute mTLS didn't work (#8152)

* fix: remove auto HTTP config on TCP cluster

Signed-off-by: zirain <[email protected]>

* fix lint

Signed-off-by: zirain <[email protected]>

* add e2e

Signed-off-by: zirain <[email protected]>

* fix e2e

Signed-off-by: zirain <[email protected]>

* fix comment

Signed-off-by: zirain <[email protected]>

* fix

Signed-off-by: zirain <[email protected]>

* fix resource name

Signed-off-by: zirain <[email protected]>

* address Arko's comment

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>

* fix: continue processing the remaining xDS with invalid EnvoyPatchPolicies (#8153)

continue processing the remaining xDS with invalid EnvoyPatchPolicies

Signed-off-by: Huabing (Robin) Zhao <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

* fix: controller cache-sync readiness check (#7430)

Signed-off-by: zirain <[email protected]>

* fix gen

Signed-off-by: zirain <[email protected]>

* release notes for v1.6.4 (#8221)

* release notes for v1.6.4

Signed-off-by: zirain <[email protected]>

* update

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: zirain <[email protected]>

* update VERSION

Signed-off-by: zirain <[email protected]>

* update release notes

Signed-off-by: zirain <[email protected]>

* update

Signed-off-by: zirain <[email protected]>

---------

Signed-off-by: Aditya7880900936 <[email protected]>
Signed-off-by: Aditya Sanskar Srivastav <[email protected]>
Signed-off-by: zirain <[email protected]>
Signed-off-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: Aditya Sanskar Srivastav <[email protected]>
Co-authored-by: Arko Dasgupta <[email protected]>
Co-authored-by: Huabing (Robin) Zhao <[email protected]>
Co-authored-by: Isaac Wilson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants