Sort out the mess around Lua error messages and error stats#10329
Merged
oranagra merged 9 commits intoredis:unstablefrom Feb 27, 2022
Merged
Sort out the mess around Lua error messages and error stats#10329oranagra merged 9 commits intoredis:unstablefrom
oranagra merged 9 commits intoredis:unstablefrom
Conversation
* Server error reply statistics.
* Error code and error strings returning from scripts.
The PR changes the error object that is raised on errors.
Instead of raising a simple Lua string, Redis will raise
a Lua table in the following format:
```
{
err='<error message (without error code)>',
err_code='<error code>',
source='<User source file name>',
line='<line where the error happned>',
ignore_error_stats_update=true/false,
}
```
The `luaPushError` function was modified to construct
the new error table as describe above. The `luaRaiseError`
is now simply calls `lua_error` to raise the table on the
top of the Lua stack as the error object.
The `source` and `line` fields are enriched by the error
hanlder (if possible) and the `ignore_error_stats_update`
is optional and if its not present then the default value
is `false`. If `ignore_error_stats_update` is true, the
error will not be counted on the error stats.
When parsing Redis call reply, each error is translated
to a Lua table on the format describe above and the
`ignore_error_stats_update` field is set to `true` so
we will not count errors twice (we counted this error
when we invoke the command).
Example of the error stats update:
```
127.0.0.1:6379> lpush l 1
(integer) 2
127.0.0.1:6379> eval "return redis.call('get', 'l')" 0
(error) WRONGTYPE Operation against a key holding the wrong kind of value. script: e471b73f1ef44774987ab00bdf51f21fd9f7974a, on @user_script:1.
127.0.0.1:6379> info Errorstats
# Errorstats
errorstat_WRONGTYPE:count=1
127.0.0.1:6379> info commandstats
# Commandstats
cmdstat_eval:calls=1,usec=341,usec_per_call=341.00,rejected_calls=0,failed_calls=1
cmdstat_info:calls=1,usec=35,usec_per_call=35.00,rejected_calls=0,failed_calls=0
cmdstat_lpush:calls=1,usec=14,usec_per_call=14.00,rejected_calls=0,failed_calls=0
cmdstat_get:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=1
```
In addition we can now construct the error message (sent
as a reply to the user) from the error table, so this solves
issues where the error message was malformed and the error
code appeared in the middle of the error message:
```
127.0.0.1:6379> eval "return redis.call('set','x','y')" 0
-(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'.
+(error) OOM command not allowed when used memory > 'maxmemory'.. @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479)
```
The changes in this PR might be considered as a breaking change
for users that used Lua `pcall` function. Before the error was a
strind and now its a table. To keep backward comparability the PR
overide the `pcall` implementation and extract the error message
from the error table and return it.
Tests was added and modified to verify the changes.
MeirShpilraien
commented
Feb 22, 2022
MeirShpilraien
commented
Feb 22, 2022
oranagra
reviewed
Feb 22, 2022
added 3 commits
February 23, 2022 14:37
* Better comments. * Propagate flags to `afterErrorReply` and move stats update out of scripts units. * Renamed luaRaiseError -> luaError * Improve tests
oranagra
reviewed
Feb 23, 2022
Co-authored-by: Oran Agra <[email protected]>
The realcmd holds the original command that was executed by the client. We need this field to update error stats, we can not update the error stats directly on client->cmd because this field might change during the command invocation (like in GEOADD for example).
oranagra
reviewed
Feb 24, 2022
1. Rename incrCommandFailedCalls -> incrCommandStatsOnError and return whether or not a stats value was updated 2. Revert change or `err` not including the error code.
oranagra
approved these changes
Feb 27, 2022
Member
yossigo
approved these changes
Feb 27, 2022
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Related PR's: #10279, #10218, #10278, #10309
This PR fix 2 issues on Lua scripting:
Statistics
a Lua script user is considered part of the user application, a sophisticated transaction, so we want to count an error even if handled silently by the script, but when it is propagated outwards from the script we don't wanna count it twice. on the other hand, if the script decides to throw an error on its own (using
redis.error_reply), we wanna count that too.Besides, we do count the
callsin command statistics for the commands the script calls, we we should certainly also countfailed_calls.So when a simple
eval "return redis.call('set','x','y')" 0fails, it should count the failed call to both SET and EVAL, but theerrorstatsandtotal_error_repliesshould be counted only once.The PR changes the error object that is raised on errors. Instead of raising a simple Lua string, Redis will raise
a Lua table in the following format:
The
luaPushErrorfunction was modified to construct the new error table as describe above. TheluaRaiseErrorwas renamed toluaErrorand is now simply calledlua_errorto raise the table on the top of the Lua stack as the error object.The reason is that since its functionality is changed, in case some Redis branch / fork uses it, it's better to have a compilation error than a bug.
The
sourceandlinefields are enriched by the error handler (if possible) and theignore_error_stats_updateis optional and if its not present then the default value is
false. Ifignore_error_stats_updateis true, theerror will not be counted on the error stats.
When parsing Redis call reply, each error is translated to a Lua table on the format describe above and the
ignore_error_stats_updatefield is set totrueso we will not count errors twice (we counted this errorwhen we invoke the command).
The changes in this PR might have been considered as a breaking change for users that used Lua
pcallfunction. Before, the error was a string and now its a table. To keep backward comparability the PR override thepcallimplementation and extract the error message from the error table and return it.Example of the error stats update:
error message
We can now construct the error message (sent as a reply to the user) from the error table, so this solves
issues where the error message was malformed and the error code appeared in the middle of the error message:
127.0.0.1:6379> eval "return redis.call('set','x','y')" 0 -(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'. +(error) OOM command not allowed when used memory > 'maxmemory' @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479)127.0.0.1:6379> eval "redis.call('get', 'l')" 0 -(error) ERR Error running script (call to f_8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1): @user_script:1: WRONGTYPE Operation against a key holding the wrong kind of value +(error) WRONGTYPE Operation against a key holding the wrong kind of value script: 8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1, on @user_script:1.Notica that
redis.pcallwas not change:other notes
Notice that Some commands (like GEOADD) changes the cmd variable on the client stats so we can not count on it to update the command stats. In order to be able to update those stats correctly we needed to promote
realcmdvariable to be located on the client struct.Tests was added and modified to verify the changes.