Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immediate, infinite recursion when configuration loader is set to boot and reloading is disabled #1467

Closed
coderbydesign opened this issue Jun 7, 2024 · 2 comments

Comments

@coderbydesign
Copy link
Contributor

coderbydesign commented Jun 7, 2024

With the fix in #1399, it appears there was a regression introduced when using the following configuration, which is a use case where you'd want to load the configuration at boot, and never reload:

# load configuration at boot
APICAST_CONFIGURATION_LOADER: boot

# disable configuration refresh
APICAST_CONFIGURATION_CACHE:  -1

Prior to the fix for the above bug, the only way the configuration_loader executed the schedule method was if interval > 0 then, which would then refresh the configuration at whatever positive interval was defined in APICAST_CONFIGURATION_CACHE.

The changes introduced here make it so that even when you have refresh disabled via a negative number as documented here, this logic will always call the schedule method, and ultimately ngx.timer.at with a delay of -1 because the schedule method is recursive and uses the ttl, which will cause the timer to expire immediately (the same as a 0 value delay which is not an edge-case seen in APIcast since that's not valid with boot configuration) executing the callback immediately, and infinitely reloading the configuration.

This ultimately causes CPU and other resource consumption issues, and when done remotely has network and request implications trying to fetch the config.

Version
nginx version: openresty/1.21.4.3
quay.io/3scale/apicast-ci:openresty-1.21.4-1: 2024-04-29T20:39:47.030811444+10:00
Steps To Reproduce
  1. make development
  2. make dependencies
  3. Ensure this config [1] exists at /opt/app-root/src/config.json
  4. Run APICAST_LOG_LEVEL=info APICAST_WORKERS=1 THREESCALE_PORTAL_ENDPOINT=http://echo:8081/config/ THREESCALE_CONFIG_FILE=/opt/app-root/src/config.json APICAST_CONFIGURATION_LOADER=boot APICAST_CONFIGURATION_CACHE=-1 THREESCALE_DEPLOYMENT_ENV=staging APICAST_SERVICES_LIST=1234 ./bin/apicast
Current Result

You'll see that after the initial boot configuration loads, it then enters into the scheduler, and immediately and infinitely starts refreshing the configuration. Here is a sample of the output: https://gist.github.com/coderbydesign/f3d95680b27a34795c58e9906e784d71

Expected Result

We see the environment config loaded once at boot, with no schedule set for reloading configuration: https://gist.github.com/coderbydesign/0421cd55635007f3f45361eb6c35ee48

Additional Information

Thanks to @dagbay-rh for finding the initial root cause of our issue and helping here!

coderbydesign added a commit to RedHatInsights/APIcast that referenced this issue Jun 7, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
coderbydesign added a commit to coderbydesign/APIcast that referenced this issue Jun 7, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
@coderbydesign
Copy link
Contributor Author

Proposed PR in #1468

@tkan145
Copy link
Contributor

tkan145 commented Jun 8, 2024

@coderbydesign Thanks for your issue, make sense what you're reporting.
Let me have a look, and I'll come back to you in the next couple of days.

@tkan145 tkan145 closed this as completed Jun 27, 2024
tkan145 pushed a commit to tkan145/APIcast that referenced this issue Jul 9, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
tkan145 pushed a commit to tkan145/APIcast that referenced this issue Jul 10, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
tkan145 pushed a commit to tkan145/APIcast that referenced this issue Jul 17, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants