fix: handle context errors as transient errors #6850

Merged
guydc merged 12 commits into envoyproxy:main from TomerJLevy:patch-2
Aug 28, 2025

Conversation

@TomerJLevy
Contributor

What type of PR is this?
fix(provider): handle context errors as transient

What this PR does / why we need it:
This PR updates the error-handling logic so that context cancelled and deadline exceeded errors are classified as transient.
Currently, these errors are not treated as transient, which can result in unintended behavior such as recreation of the Envoy Proxy deployment.
Once they are treated as transient, the system retries or recovers gracefully instead of propagating incorrect state.

Which issue(s) this PR fixes:

Fixes #6849
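
As a rough illustration of the classification described above, here is a minimal Go sketch. It is not the code changed in this PR: `isTransientError` and `simulateTimeout` are hypothetical names, and the sketch assumes the check is built on the standard library's `errors.Is` together with `context.Canceled` and `context.DeadlineExceeded`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// isTransientError is a hypothetical helper (not the project's actual function).
// It reports whether err is, or wraps, a context cancellation or deadline error.
func isTransientError(err error) bool {
	return errors.Is(err, context.Canceled) ||
		errors.Is(err, context.DeadlineExceeded)
}

// simulateTimeout stands in for an API call that outlives its deadline.
// It returns the context error wrapped with %w so the cause stays detectable.
func simulateTimeout() error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()

	select {
	case <-time.After(time.Second): // the "work" takes longer than the deadline
		return nil
	case <-ctx.Done():
		return fmt.Errorf("fetch resource: %w", ctx.Err())
	}
}

func main() {
	err := simulateTimeout()
	if isTransientError(err) {
		// A caller could retry or requeue here instead of acting on partial state.
		fmt.Println("transient:", err)
		return
	}
	fmt.Println("not transient:", err)
}
```

A caller that sees a transient result would typically retry rather than treat the resource as missing, which is the behavior this PR is aiming for.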

Release Notes: Yes
Handle context cancelled and deadline exceeded as transient errors to prevent incorrect state reconciliation and unintended behaviors.

@TomerJLevy TomerJLevy requested a review from a team as a code owner August 26, 2025 14:12
Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: TomerJLevy <[email protected]>
@codecov

codecov bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.08%. Comparing base (933b582) to head (807bc4d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6850      +/-   ##
==========================================
+ Coverage   71.06%   71.08%   +0.01%     
==========================================
  Files         225      225              
  Lines       39854    39856       +2     
==========================================
+ Hits        28323    28331       +8     
+ Misses       9861     9855       -6     
  Partials     1670     1670              


@guydc
Contributor

guydc commented Aug 26, 2025

/retest

@kkk777-7
Member

@zirain
Member

zirain commented Aug 26, 2025

@TomerJLevy can you add a release note?

zhaohuabing
zhaohuabing previously approved these changes Aug 27, 2025
Member

@zhaohuabing zhaohuabing left a comment

LGTM. Thanks for the quick fix!

@TomerJLevy TomerJLevy changed the title Handle context errors as transient errors fix: handle context errors as transient errors Aug 27, 2025
Signed-off-by: TomerJLevy <[email protected]>
@TomerJLevy
Contributor Author

> @TomerJLevy can you add a release note?

Thanks. I added release notes.
I also returned the error as-is, without any wrapping, so it can be caught as a transient error.
@zhaohuabing and @kkk777-7 let me know what you think.
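
To illustrate why wrapping can matter here, the following Go sketch shows general standard-library behavior. It is not the project's code: the helper names are hypothetical, and the thread does not say how the error was wrapped before this change or how the transient check is implemented. An error formatted with `%v` loses its chain, so `errors.Is` no longer finds the cause, while `%w` or returning the error as-is keeps it detectable.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// wrapOpaque formats the cause with %v, which flattens it to text
// and drops the error chain.
func wrapOpaque(err error) error {
	return fmt.Errorf("reconcile failed: %v", err)
}

// wrapChained uses %w, which keeps the cause reachable via errors.Is.
func wrapChained(err error) error {
	return fmt.Errorf("reconcile failed: %w", err)
}

// isCanceled reports whether err is, or wraps, context.Canceled.
func isCanceled(err error) bool {
	return errors.Is(err, context.Canceled)
}

// canceledErr returns the error reported by an already-cancelled context.
func canceledErr() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx.Err()
}

func main() {
	cause := canceledErr()
	fmt.Println(isCanceled(cause))              // true: returned as-is
	fmt.Println(isCanceled(wrapOpaque(cause)))  // false: chain lost by %v
	fmt.Println(isCanceled(wrapChained(cause))) // true: chain kept by %w
}
```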

@zirain
Member

zirain commented Aug 27, 2025

@TomerJLevy can you make CI happy?

Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: TomerJLevy <[email protected]>
@kkk777-7
Member

LGTM, thanks!

zirain
zirain previously approved these changes Aug 27, 2025
zhaohuabing
zhaohuabing previously approved these changes Aug 27, 2025
@TomerJLevy
Contributor Author

/retest

@zirain zirain enabled auto-merge (squash) August 28, 2025 07:14
@zirain
Member

zirain commented Aug 28, 2025

@TomerJLevy can you fix the conflict?

auto-merge was automatically disabled August 28, 2025 13:10

Head branch was pushed to by a user without write access

@TomerJLevy TomerJLevy dismissed stale reviews from zhaohuabing and zirain via 807bc4d August 28, 2025 13:10
@TomerJLevy
Contributor Author

/retest

@guydc guydc merged commit 95cface into envoyproxy:main Aug 28, 2025
76 of 87 checks passed
shawnh2 pushed a commit to shawnh2/gateway that referenced this pull request Sep 15, 2025
* handle context errors as transient errors

Signed-off-by: TomerJLevy <[email protected]>

* add test cases

Signed-off-by: TomerJLevy <[email protected]>

* no need the new line

Signed-off-by: TomerJLevy <[email protected]>

* add release notes

Signed-off-by: TomerJLevy <[email protected]>

* Return the error as is

Signed-off-by: TomerJLevy <[email protected]>

* revert redundant changes

Signed-off-by: TomerJLevy <[email protected]>

* revert unrelated changes

Signed-off-by: TomerJLevy <[email protected]>

* revert more changes...

Signed-off-by: TomerJLevy <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Co-authored-by: zirain <[email protected]>
Signed-off-by: shawnh2 <[email protected]>
zirain added a commit to zirain/gateway that referenced this pull request Sep 16, 2025
* handle context errors as transient errors

Signed-off-by: TomerJLevy <[email protected]>

* add test cases

Signed-off-by: TomerJLevy <[email protected]>

* no need the new line

Signed-off-by: TomerJLevy <[email protected]>

* add release notes

Signed-off-by: TomerJLevy <[email protected]>

* Return the error as is

Signed-off-by: TomerJLevy <[email protected]>

* revert redundant changes

Signed-off-by: TomerJLevy <[email protected]>

* revert unrelated changes

Signed-off-by: TomerJLevy <[email protected]>

* revert more changes...

Signed-off-by: TomerJLevy <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Co-authored-by: zirain <[email protected]>
Signed-off-by: zirain <[email protected]>
arkodg added a commit that referenced this pull request Sep 16, 2025
* fix: handle context errors as transient errors (#6850)

* handle context errors as transient errors

Signed-off-by: TomerJLevy <[email protected]>

* add test cases

Signed-off-by: TomerJLevy <[email protected]>

* no need the new line

Signed-off-by: TomerJLevy <[email protected]>

* add release notes

Signed-off-by: TomerJLevy <[email protected]>

* Return the error as is

Signed-off-by: TomerJLevy <[email protected]>

* revert redundant changes

Signed-off-by: TomerJLevy <[email protected]>

* revert unrelated changes

Signed-off-by: TomerJLevy <[email protected]>

* revert more changes...

Signed-off-by: TomerJLevy <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Co-authored-by: zirain <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* chore: fix CVE (#6903)

Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* fix: rm incorrectly set exclusiveMaximum field in CRD (#6926)

* fix: rm incorrectly set exclusiveMaximum field in CRD

* Also fix maximum value to 599 which includes 599 as a valid num

Fixes: #6925

Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* fix: validation for grpc routes with extension ref filters (#6949)

Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* fix: cleanup dangling route status conditions (#6812)

Signed-off-by: y-rabie <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* Fix: Add missing patch annotations to Compression struct for proper Merge (#6951)

* fix: merge compression annotation

Signed-off-by: sudipto baral <[email protected]>

* test: add more compression merge test cases

Signed-off-by: sudipto baral <[email protected]>

---------

Signed-off-by: sudipto baral <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* fix: update distroless image to resolve glibc CVEs (#6953)

Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: shawnh2 <[email protected]>

* fix gen-check

Signed-off-by: shawnh2 <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: shawnh2 <[email protected]>
Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: y-rabie <[email protected]>
Signed-off-by: sudipto baral <[email protected]>
Co-authored-by: TomerJLevy <[email protected]>
Co-authored-by: zirain <[email protected]>
Co-authored-by: shahar-h <[email protected]>
Co-authored-by: Arko Dasgupta <[email protected]>
Co-authored-by: Rudrakh Panigrahi <[email protected]>
Co-authored-by: Youssef Rabie <[email protected]>
Co-authored-by: Sudipto Baral <[email protected]>
zirain added a commit that referenced this pull request Sep 16, 2025
* fix: cluster stat name: lowercase Kind (#6780)

cluster stat name: lowercase Kind

Signed-off-by: Guy Daich <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: envoy service cluster name for zone-aware routing (#6763)

* fix!: fix envoy service cluster name for zone-aware routing

Signed-off-by: y-rabie <[email protected]>

* extend e2e tests for zone aware routing

Signed-off-by: y-rabie <[email protected]>

* extend unit tests for zone aware routing

Signed-off-by: y-rabie <[email protected]>

---------

Signed-off-by: y-rabie <[email protected]>
Signed-off-by: zirain <[email protected]>

* conformance: update experimental test report (#6782)

* conformance: update experimental test report

Signed-off-by: zirain <[email protected]>

* fix version

Signed-off-by: zirain <[email protected]>

* fix(api): image validation regex, support port in repository (#6819)

fix: match repository in image with port

Signed-off-by: Windfarer <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: Actually update xdsIR with maxAcceptPerSocketEvent (#6834)

* Actually update xdsIR with maxAcceptPerSocketEvent

Signed-off-by: jukie <[email protected]>

* release note

Signed-off-by: jukie <[email protected]>

* newline lint

Signed-off-by: jukie <[email protected]>

---------

Signed-off-by: jukie <[email protected]>
Signed-off-by: zirain <[email protected]>

* bugfix: fix the topologyInjectorDisabled and the local cluster was not defined (#6847)

* bugfix: fix the topologyInjectorDisabled and the local cluster was not defined.

Signed-off-by: qicz <[email protected]>

* fix ut

Signed-off-by: qicz <[email protected]>

* add topology-injector-enabled ut

Signed-off-by: qicz <[email protected]>

* add release note

Signed-off-by: qi <[email protected]>

---------

Signed-off-by: qicz <[email protected]>
Signed-off-by: qi <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix(logging): correct log formatting to avoid DPANIC in controller-runtime logger (#6846)

* Update filters.go

Signed-off-by: TomerJLevy <[email protected]>

* add release notes

Signed-off-by: TomerJLevy <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: handle context errors as transient errors (#6850)

* handle context errors as transient errors

Signed-off-by: TomerJLevy <[email protected]>

* add test cases

Signed-off-by: TomerJLevy <[email protected]>

* no need the new line

Signed-off-by: TomerJLevy <[email protected]>

* add release notes

Signed-off-by: TomerJLevy <[email protected]>

* Return the error as is

Signed-off-by: TomerJLevy <[email protected]>

* revert redundant changes

Signed-off-by: TomerJLevy <[email protected]>

* revert unrelated changes

Signed-off-by: TomerJLevy <[email protected]>

* revert more changes...

Signed-off-by: TomerJLevy <[email protected]>

---------

Signed-off-by: TomerJLevy <[email protected]>
Co-authored-by: zirain <[email protected]>
Signed-off-by: zirain <[email protected]>

* bugfix: the controller cannot read the EnvoyProxy attached gatewayclass only. (#6838)

Signed-off-by: zirain <[email protected]>

* chore: fix CVE (#6903)

Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: nil pointer dereference in btp configmap indexer (#6921)

Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: zirain <[email protected]>

* improve targetRef selection for targetSelectors (#6917)

* improve targetRef selection for targetSelectors

* only select refs in the same namespace as the policy

Signed-off-by: Arko Dasgupta <[email protected]>

* fix lint

Signed-off-by: Arko Dasgupta <[email protected]>

---------

Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: suppress lua validation logs (#6929)

Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: rm incorrectly set exclusiveMaximum field in CRD (#6926)

* fix: rm incorrectly set exclusiveMaximum field in CRD

* Also fix maximum value to 599 which includes 599 as a valid num

Fixes: #6925

Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: rm Strict SameSite default (#6941)

* by default it should be unset which implies `Lax`

Relates to #6347

Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: zirain <[email protected]>

* Optimize pod cache (#6936)

* Optimize pod cache

Signed-off-by: jukie <[email protected]>

* release note

Signed-off-by: jukie <[email protected]>

* Remove retry

Signed-off-by: jukie <[email protected]>

* cleanup

Signed-off-by: jukie <[email protected]>

---------

Signed-off-by: jukie <[email protected]>
Signed-off-by: Isaac <[email protected]>
Co-authored-by: zirain <[email protected]>
Signed-off-by: zirain <[email protected]>

* reduce DeepCopy in gateway-api layer (#6940)

* reduce deep copy in gateway-api layer

* also fixed the DeepCopy implementation for ControllerResources
which was performing a Shallow Copy resulting it lack of isolation
b/w provider and gateway-api layer

Relates to #6919

Signed-off-by: Arko Dasgupta <[email protected]>

* fix lint

Signed-off-by: Arko Dasgupta <[email protected]>

---------

Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: validation for grpc routes with extension ref filters (#6949)

Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: cleanup dangling route status conditions (#6812)

Signed-off-by: y-rabie <[email protected]>
Signed-off-by: zirain <[email protected]>

* Fix: Add missing patch annotations to Compression struct for proper Merge (#6951)

* fix: merge compression annotation

Signed-off-by: sudipto baral <[email protected]>

* test: add more compression merge test cases

Signed-off-by: sudipto baral <[email protected]>

---------

Signed-off-by: sudipto baral <[email protected]>
Signed-off-by: zirain <[email protected]>

* fix: update distroless image to resolve glibc CVEs (#6953)

Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: zirain <[email protected]>

* chore: bump golang to 1.24.7 (#6959)

chore: bump golang

Signed-off-by: zirain <[email protected]>

* fix: Make sure proxy protocol filter is the first listener filter (#6972)

Fixes: #6873

Signed-off-by: Arko Dasgupta <[email protected]>
Co-authored-by: Jacob Neil Taylor <[email protected]>
Signed-off-by: zirain <[email protected]>

* release notes

Signed-off-by: zirain <[email protected]>

* Removes reflection from RouteContext to reduce allocations (#6820)

* bench: adds APIToXDS bench & small opt

Signed-off-by: Takeshi Yoneda <[email protected]>

* no refect

goos: darwin
goarch: arm64
pkg: github.com/envoyproxy/gateway/test/gobench
cpu: Apple M1 Pro
                          │   old.txt    │               new.txt               │
                          │    sec/op    │   sec/op     vs base                │
GatewayAPItoXDS/small-10    881.2µ ±  7%   803.4µ ± 1%   -8.82% (p=0.000 n=10)
GatewayAPItoXDS/medium-10   4.130m ± 26%   2.959m ± 4%  -28.36% (p=0.000 n=10)
GatewayAPItoXDS/large-10     5.375 ±  2%    4.553 ± 1%  -15.28% (p=0.000 n=10)
geomean                     26.94m         22.12m       -17.90%

                          │   old.txt    │               new.txt                │
                          │     B/op     │     B/op      vs base                │
GatewayAPItoXDS/small-10    507.2Ki ± 0%   492.9Ki ± 0%   -2.83% (p=0.000 n=10)
GatewayAPItoXDS/medium-10   2.545Mi ± 7%   1.954Mi ± 2%  -23.21% (p=0.000 n=10)
GatewayAPItoXDS/large-10    2.832Gi ± 0%   2.831Gi ± 0%        ~ (p=0.529 n=10)
geomean                     15.40Mi        13.97Mi        -9.31%

                          │   old.txt   │               new.txt               │
                          │  allocs/op  │  allocs/op   vs base                │
GatewayAPItoXDS/small-10    8.328k ± 0%   8.017k ± 0%   -3.73% (p=0.000 n=10)
GatewayAPItoXDS/medium-10   39.45k ± 6%   29.74k ± 2%  -24.60% (p=0.000 n=10)
GatewayAPItoXDS/large-10    38.75M ± 0%   38.71M ± 0%   -0.11% (p=0.000 n=10)
geomean                     233.5k        209.8k       -10.16%

Signed-off-by: Takeshi Yoneda <[email protected]>

* removes garbage

Signed-off-by: Takeshi Yoneda <[email protected]>

* more tests

Signed-off-by: Takeshi Yoneda <[email protected]>

* more tests

Signed-off-by: Takeshi Yoneda <[email protected]>

---------

Signed-off-by: Takeshi Yoneda <[email protected]>

---------

Signed-off-by: Guy Daich <[email protected]>
Signed-off-by: zirain <[email protected]>
Signed-off-by: y-rabie <[email protected]>
Signed-off-by: Windfarer <[email protected]>
Signed-off-by: jukie <[email protected]>
Signed-off-by: qicz <[email protected]>
Signed-off-by: qi <[email protected]>
Signed-off-by: TomerJLevy <[email protected]>
Signed-off-by: Shahar Harari <[email protected]>
Signed-off-by: Rudrakh Panigrahi <[email protected]>
Signed-off-by: Arko Dasgupta <[email protected]>
Signed-off-by: Isaac <[email protected]>
Signed-off-by: sudipto baral <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Co-authored-by: Guy Daich <[email protected]>
Co-authored-by: Youssef Rabie <[email protected]>
Co-authored-by: Windfarer <[email protected]>
Co-authored-by: Isaac <[email protected]>
Co-authored-by: qi <[email protected]>
Co-authored-by: TomerJLevy <[email protected]>
Co-authored-by: shahar-h <[email protected]>
Co-authored-by: Rudrakh Panigrahi <[email protected]>
Co-authored-by: Arko Dasgupta <[email protected]>
Co-authored-by: Sudipto Baral <[email protected]>
Co-authored-by: Jacob Neil Taylor <[email protected]>
Co-authored-by: Takeshi Yoneda <[email protected]>
Development

Successfully merging this pull request may close these issues.

Context cancelled not treated as transient, causing unintended Envoy-Proxy recreation

7 participants