Consistent erros returned from EVAL scripts by yoav-steinberg · Pull Request #10218 · redis/redis

yoav-steinberg · 2022-01-31T10:50:20Z

This PR handles inconsistencies in errors returned from lua scripts. Details of the problem can be found in #10165.

Changes

Remove double stack trace. It's enough that a stack trace is automatically added by the engine's error handler see

Lines 472 to 485 in d0bc4ff

    
           char *errh_func =       "local dbg = debug\n" 
        
                                   "local error_handler = function (err)\n" 
        
                                   "  local i = dbg.getinfo(2,'nSl')\n" 
        
                                   "  if i and i.what == 'C' then\n" 
        
                                   "    i = dbg.getinfo(3,'nSl')\n" 
        
                                   "  end\n" 
        
                                   "  if i then\n" 
        
                                   "    return i.source .. ':' .. i.currentline .. ': ' .. err\n" 
        
                                   "  else\n" 
        
                                   "    return err\n" 
        
                                   "  end\n" 
        
                                   "end\n" 
        
                                   "return error_handler"; 
        
           luaL_loadbuffer(lua_engine_ctx->lua, errh_func, strlen(errh_func), "@err_handler_def");

and

redis/src/eval.c

Lines 243 to 255 in d0bc4ff

    
           char *errh_func =       "local dbg = debug\n" 
        
                                   "function __redis__err__handler(err)\n" 
        
                                   "  local i = dbg.getinfo(2,'nSl')\n" 
        
                                   "  if i and i.what == 'C' then\n" 
        
                                   "    i = dbg.getinfo(3,'nSl')\n" 
        
                                   "  end\n" 
        
                                   "  if i then\n" 
        
                                   "    return i.source .. ':' .. i.currentline .. ': ' .. err\n" 
        
                                   "  else\n" 
        
                                   "    return err\n" 
        
                                   "  end\n" 
        
                                   "end\n"; 
        
           luaL_loadbuffer(lua,errh_func,strlen(errh_func),"@err_handler_def");

Make sure all errors a preceded with an error code. Passing a simple string to luaPushError() will prepend it with a generic ERR error code.

Make sure lua error table doesn't include a RESP - error status. Lua stores redis error's as a lua table with a single err field and a string. When the string is translated back to RESP we add a - to it. See

redis/src/script_lua.c

Lines 510 to 517 in d0bc4ff

    
           lua_pushstring(lua,"err"); 
        
           lua_gettable(lua,-2); 
        
           t = lua_type(lua,-1); 
        
           if (t == LUA_TSTRING) { 
        
               addReplyErrorFormat(c,"-%s",lua_tostring(lua,-1)); 
        
               lua_pop(lua,2); 
        
               return; 
        
           }

So there's no need to store it in the lua table.

Before & After

--- <unnamed>
+++ <unnamed>
@@ -1,14 +1,14 @@
  1: config set maxmemory 1
  2: +OK
  3: eval "return redis.call('set','x','y')" 0
- 4: -ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: @user_script: 1: -OOM command not allowed when used memory > 'maxmemory'.
+ 4: -ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'.
  5: eval "return redis.pcall('set','x','y')" 0
- 6: -@user_script: 1: -OOM command not allowed when used memory > 'maxmemory'.
+ 6: -OOM command not allowed when used memory > 'maxmemory'.
  7: eval "return redis.call('select',99)" 0
  8: -ERR Error running script (call to 4ad5abfc50bbccb484223905f9a16f09cd043ba8): @user_script:1: ERR DB index is out of range
  9: eval "return redis.pcall('select',99)" 0
 10: -ERR DB index is out of range
 11: eval_ro "return redis.call('set','x','y')" 0
-12: -ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: @user_script: 1: Write commands are not allowed from read-only scripts.
+12: -ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: ERR Write commands are not allowed from read-only scripts.
 13: eval_ro "return redis.pcall('set','x','y')" 0
-14: -@user_script: 1: Write commands are not allowed from read-only scripts.
+14: -ERR Write commands are not allowed from read-only scripts.

to do

tests

- Remove double stack trace. - Make sure all errors a preceded with an error code. - Make sure lua error table doesn't include a RESP '-' error status.

oranagra · 2022-02-03T08:48:16Z

@yoav-steinberg maybe some tests are needed?
i.e. verify the expected result is visible in the Lua table (check in the Lua code), and also in the client (check in the Tcl code).
testing both cases where the error returns from the command proc, and also cases where it is pushed by the scriptCall.

can you please also update the top comment with some examples of what was broken and how it looks now, so that we can realize if this is a breaking change or what's the impact.

@MeirShpilraien please review.

yoav-steinberg · 2022-02-03T09:41:08Z

can you please also update the top comment with some examples of what was broken and how it looks now

Done.

MeirShpilraien

👍

src/script_lua.c

yoav-steinberg · 2022-02-07T13:49:02Z

@yoav-steinberg maybe some tests are needed? i.e. verify the expected result is visible in the Lua table (check in the Lua code), and also in the client (check in the Tcl code). testing both cases where the error returns from the command proc, and also cases where it is pushed by the scriptCall.

Done.

This PR fix 2 issues on Lua scripting: * Server error reply statistics (some errors were counted twice). * Error code and error strings returning from scripts (error code was missing / misplaced). ## Statistics a Lua script user is considered part of the user application, a sophisticated transaction, so we want to count an error even if handled silently by the script, but when it is propagated outwards from the script we don't wanna count it twice. on the other hand, if the script decides to throw an error on its own (using `redis.error_reply`), we wanna count that too. Besides, we do count the `calls` in command statistics for the commands the script calls, we we should certainly also count `failed_calls`. So when a simple `eval "return redis.call('set','x','y')" 0` fails, it should count the failed call to both SET and EVAL, but the `errorstats` and `total_error_replies` should be counted only once. The PR changes the error object that is raised on errors. Instead of raising a simple Lua string, Redis will raise a Lua table in the following format: ``` { err='<error message (including error code)>', source='<User source file name>', line='<line where the error happned>', ignore_error_stats_update=true/false, } ``` The `luaPushError` function was modified to construct the new error table as describe above. The `luaRaiseError` was renamed to `luaError` and is now simply called `lua_error` to raise the table on the top of the Lua stack as the error object. The reason is that since its functionality is changed, in case some Redis branch / fork uses it, it's better to have a compilation error than a bug. The `source` and `line` fields are enriched by the error handler (if possible) and the `ignore_error_stats_update` is optional and if its not present then the default value is `false`. If `ignore_error_stats_update` is true, the error will not be counted on the error stats. When parsing Redis call reply, each error is translated to a Lua table on the format describe above and the `ignore_error_stats_update` field is set to `true` so we will not count errors twice (we counted this error when we invoke the command). The changes in this PR might have been considered as a breaking change for users that used Lua `pcall` function. Before, the error was a string and now its a table. To keep backward comparability the PR override the `pcall` implementation and extract the error message from the error table and return it. Example of the error stats update: ``` 127.0.0.1:6379> lpush l 1 (integer) 2 127.0.0.1:6379> eval "return redis.call('get', 'l')" 0 (error) WRONGTYPE Operation against a key holding the wrong kind of value. script: e471b73f1ef44774987ab00bdf51f21fd9f7974a, on @user_script:1. 127.0.0.1:6379> info Errorstats # Errorstats errorstat_WRONGTYPE:count=1 127.0.0.1:6379> info commandstats # Commandstats cmdstat_eval:calls=1,usec=341,usec_per_call=341.00,rejected_calls=0,failed_calls=1 cmdstat_info:calls=1,usec=35,usec_per_call=35.00,rejected_calls=0,failed_calls=0 cmdstat_lpush:calls=1,usec=14,usec_per_call=14.00,rejected_calls=0,failed_calls=0 cmdstat_get:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=1 ``` ## error message We can now construct the error message (sent as a reply to the user) from the error table, so this solves issues where the error message was malformed and the error code appeared in the middle of the error message: ```diff 127.0.0.1:6379> eval "return redis.call('set','x','y')" 0 -(error) ERR Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479): @user_script:1: OOM command not allowed when used memory > 'maxmemory'. +(error) OOM command not allowed when used memory > 'maxmemory' @user_script:1. Error running script (call to 71e6319f97b0fe8bdfa1c5df3ce4489946dda479) ``` ```diff 127.0.0.1:6379> eval "redis.call('get', 'l')" 0 -(error) ERR Error running script (call to f_8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1): @user_script:1: WRONGTYPE Operation against a key holding the wrong kind of value +(error) WRONGTYPE Operation against a key holding the wrong kind of value script: 8a705cfb9fb09515bfe57ca2bd84a5caee2cbbd1, on @user_script:1. ``` Notica that `redis.pcall` was not change: ``` 127.0.0.1:6379> eval "return redis.pcall('get', 'l')" 0 (error) WRONGTYPE Operation against a key holding the wrong kind of value ``` ## other notes Notice that Some commands (like GEOADD) changes the cmd variable on the client stats so we can not count on it to update the command stats. In order to be able to update those stats correctly we needed to promote `realcmd` variable to be located on the client struct. Tests was added and modified to verify the changes. Related PR's: #10279, #10218, #10278, #10309 Co-authored-by: Oran Agra <[email protected]>

Fix lua script error reporting:

8850d37

- Remove double stack trace. - Make sure all errors a preceded with an error code. - Make sure lua error table doesn't include a RESP '-' error status.

yoav-steinberg requested review from MeirShpilraien and oranagra January 31, 2022 10:51

oranagra linked an issue Feb 1, 2022 that may be closed by this pull request

[BUG] Inconsistency in errors returned from EVAL scripts. #10165

Closed

oranagra added the 7.0-must-have label Feb 1, 2022

oranagra approved these changes Feb 3, 2022

View reviewed changes

MeirShpilraien approved these changes Feb 3, 2022

View reviewed changes

yoav-steinberg commented Feb 3, 2022

View reviewed changes

src/script_lua.c Outdated Show resolved Hide resolved

src/script_lua.c Outdated Show resolved Hide resolved

Apply suggestions from code review

b56be26

oranagra added the state:needs-test-code the PR is missing test code label Feb 3, 2022

yoav-steinberg added 4 commits February 7, 2022 11:45

Merge branch 'unstable' into consistent_script_err

5c724c0

Remove crlf from end of error objects when they're used in scripts.

677fc72

Tests for verifying scripting error handling.

4a787b4

typo

0784984

skip test using in cluster mode

999743a

oranagra approved these changes Feb 7, 2022

View reviewed changes

oranagra merged commit b76016a into redis:unstable Feb 8, 2022

oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Feb 10, 2022

oranagra mentioned this pull request Feb 10, 2022

Script reply errors cleanup and stats #10279

Closed

MeirShpilraien mentioned this pull request Feb 22, 2022

Sort out the mess around Lua error messages and error stats #10329

Merged

oranagra mentioned this pull request Feb 28, 2022

Release 7.0 RC2 #10355

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consistent erros returned from EVAL scripts#10218

Consistent erros returned from EVAL scripts#10218
oranagra merged 7 commits intoredis:unstablefrom
yoav-steinberg:consistent_script_err

yoav-steinberg commented Jan 31, 2022 •

edited by oranagra

Loading

Uh oh!

oranagra commented Feb 3, 2022

Uh oh!

yoav-steinberg commented Feb 3, 2022 •

edited

Loading

Uh oh!

MeirShpilraien left a comment

Uh oh!

Uh oh!

Uh oh!

yoav-steinberg commented Feb 7, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	char *errh_func = "local dbg = debug\n"
	"local error_handler = function (err)\n"
	" local i = dbg.getinfo(2,'nSl')\n"
	" if i and i.what == 'C' then\n"
	" i = dbg.getinfo(3,'nSl')\n"
	" end\n"
	" if i then\n"
	" return i.source .. ':' .. i.currentline .. ': ' .. err\n"
	" else\n"
	" return err\n"
	" end\n"
	"end\n"
	"return error_handler";
	luaL_loadbuffer(lua_engine_ctx->lua, errh_func, strlen(errh_func), "@err_handler_def");

	char *errh_func = "local dbg = debug\n"
	"function __redis__err__handler(err)\n"
	" local i = dbg.getinfo(2,'nSl')\n"
	" if i and i.what == 'C' then\n"
	" i = dbg.getinfo(3,'nSl')\n"
	" end\n"
	" if i then\n"
	" return i.source .. ':' .. i.currentline .. ': ' .. err\n"
	" else\n"
	" return err\n"
	" end\n"
	"end\n";
	luaL_loadbuffer(lua,errh_func,strlen(errh_func),"@err_handler_def");

	lua_pushstring(lua,"err");
	lua_gettable(lua,-2);
	t = lua_type(lua,-1);
	if (t == LUA_TSTRING) {
	addReplyErrorFormat(c,"-%s",lua_tostring(lua,-1));
	lua_pop(lua,2);
	return;
	}

Conversation

yoav-steinberg commented Jan 31, 2022 • edited by oranagra Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Before & After

to do

Uh oh!

oranagra commented Feb 3, 2022

Uh oh!

yoav-steinberg commented Feb 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MeirShpilraien left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

yoav-steinberg commented Feb 7, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

yoav-steinberg commented Jan 31, 2022 •

edited by oranagra

Loading

yoav-steinberg commented Feb 3, 2022 •

edited

Loading