Skip to content

Module API to allow writes after key space notification hooks#11199

Merged
oranagra merged 29 commits intoredis:unstablefrom
MeirShpilraien:post_notification_jobs
Nov 24, 2022
Merged

Module API to allow writes after key space notification hooks#11199
oranagra merged 29 commits intoredis:unstablefrom
MeirShpilraien:post_notification_jobs

Conversation

@MeirShpilraien
Copy link

@MeirShpilraien MeirShpilraien commented Aug 28, 2022

Summary of API additions

  • RedisModule_AddPostNotificationJob - new API to call inside a key space notification (and on more locations in the future) and allow to add a post job as describe above.
  • New module option, REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS, allows to disable Redis protection of nested key-space notifications.
  • RedisModule_GetModuleOptionsAll - gets the mask of all supported module options so a module will be able to check if a given option is supported by the current running Redis instance.

Background

Related PRs: #10969, #11178

The following PR is a proposal of handling write operations inside module key space notifications.

After a lot of discussions we came to a conclusion that module should not perform any write operations on key space notification.

Some examples of issues that such write operation can cause are describe on the following links:

There are probably more issues that are yet to be discovered. The underline problem with writing inside key space notification is that the notification runs synchronously, this means that the notification code will be executed in the middle on Redis logic
(commands logic, eviction, expire). Redis do not assume that the data might change while running the logic and such changes can crash Redis or cause unexpected behaviour.

One solution is to go over all those places (that assumes no data changes) and fix them such that they will be tolerable to data changes. This approach will probably require a lot of changes all around the code and it will be very hard to know whether or not we got all those issues. Moreover, it will be very hard to maintain such code and be sure that we are not adding more places that assume no data changes.

Another approach (which presented in this PR) is to state that modules should not perform any write command inside key space notification (we can chose whether or not we want to force it). To still cover the use-case where module wants to perform a write operation as a reaction to key space notifications, we introduce a new API , RedisModule_AddPostNotificationJob, that allows to register a callback that will be called by Redis when the following conditions hold:

  • It is safe to perform any write operation.
  • The job will be called atomically along side the operation that triggers it (in our case, key space notification).

Module can use this new API to safely perform any write operation and still achieve atomicity between the notification and the write.

Notice that another approach that was suggested was to collect all the key space notifications and fire them at the end (where it is safe to write). This solution might be simpler to the module but it has some limitations:

  1. It can be considered a breaking change because the notifications will be fired at the end and not synchronously when they happened
  2. If the key space notification will be fire at the end, the module might miss data that it has before, one example is this PR that adds unlink notification that allows the module to read the data just before it is deleted.

We believe that the suggested API gives a good abstraction (module do not need to know the internal for Redis and decide when it safe to write, Redis calls the module when such conditions hold) with good flexibility (module can still perform logic synchronously with the notification and read the state of the database as it was at the time of the notification).

Though currently the API is supported on key space notifications, the API is written in a generic way so that in the future we will be able to use it on other places (server events for example).

Technical Details

Whenever a module uses RedisModule_AddPostNotificationJob the callback is added to a list of callbacks (called modulePostExecUnitJobs) that need to be invoke after the current execution unit ends (whether its a command, eviction, or active expire). In order to trigger those callback atomically with the notification effect, we call those callbacks on postExecutionUnitOperations (which was propagatePendingCommands before this PR). The new function fires the post jobs and then calls propagatePendingCommands.

If the callback perform more operations that triggers more key space notifications. Those keys space notifications might register more callbacks. Those callbacks will be added to the end of modulePostExecUnitJobs list and will be invoke atomically after the current callback ends. This raises a concerns of entering an infinite loops, we consider infinite loops as a logical bug that need to be fixed in the module, an attempt to protect against infinite loops by halting the execution could result in violation of the feature correctness and so Redis will make no attempt to protect the module from infinite loops

In addition, currently key space notifications are not nested. Some modules might want to allow nesting key-space notifications. To allow that and keep backward compatibility, we introduce a new module option called REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS. Setting this option will disable the Redis key-space notifications nesting protection and will pass this responsibility to the module.

Redis infrastructure

This PR promotes the existing propagatePendingCommands to an "Execution Unit" concept, which is called after each atomic unit of execution,

Cherry pick considerations and followup PR

One of the goal of this PR was to make it as safe as possible to cherry-pick to 7.0. While working on the PR we realize the code has some state duplication when it comes to command propagation to replica and AOF. We realize that server.core_propagation is probably redundant and we probably can also unify server.module_ctx_nesting and server.in_nested_call. Such changes are probably much more risky so we avoid doing them on this PR. A followup PR will cover those topics and will try to rearrange this part of the code.

TODO

  • Decide whether the approach taken on this PR makes sense and if there are more edge cases we might have missed
  • Decide whether or not we want to force modules not to write inside key space notifications - no such validation is require. Document this limitation is good enough.
  • Decide whether or not we want to force modules to use the new API only on key space notifications - no
  • Handle RM_StringDMA which is an example of read operation that might change the data (see issue describe in this comment Add a special notification unlink available only for modules #9406 (comment)) - after checking the issue we see that there is no really a problem using RM_StringDMA and its better not to change it to avoid breaking change. Test was added to verify it.
  • Consider whether or not propagatePendingCommands is the right place to fire the post notifications callbacks.
  • Move max action limitation to be a parameter that can be set by the module itself - we decided we will not have any limits
  • Improve testing to the new API

Related PRs: redis#10969, redis#11178

The following PR is a proposal of handling write operations inside key space notifications. The PR is not yet finished (see todo list at the end) but I would like to start getting opinions whether or not the approach make sense and whether there are any issues that we might have missed.

After a lot of discussions we came to a conclussion that module
should not perform any write operations on key space notifcation.

Some examples of issues that such write operation can cause are describe
on the following links:

* Bad replication oreder - redis#10969
* Used after free - redis#10969 (comment)
* Used after free - redis#9406 (comment)

There are probably more issues that are yet to be discovered. The underline problem
with writing inside key space notification is that the notification runs synchronously,
this means that the notification code will be executed in the middle on Redis logic
(commands logic, eviction, expire). Redis **do not assume** that the data might change
while running the logic and such changes can crash Redis or cause unexpected behaviure.

One solution is to go over all those places (that assumes no data changes) and fix them
such that they will be tollerable to data changes. This apporach will probably require a
lot of changes all around the code and it will be very hard to know whether or not we got
all those issues. Moreover, it will be very hard to maintain such code and be sure that
we are not adding more places that assume no data changes.

Another approach (which presented in this PR) is to state that modules **should not** perform
any write command inside key space notification (we can chose whether or not we want to force it).
To still cover the use-case where module wants to perform a write operation as a reaction to
key space notifications, we introduce a new API that, `RedisModule_AddPostNotificationJob`, allows to register a callback that will be
called by Redis when the following conditions hold:

* It is safe to perform any write operation.
* The job will be called atomically along side the notification (the notification effect and the
  job effect will be replicated to the replication and the aof wrapped with multi exec).

Module can use this new API to safely perform any write operation and still achieve atomicity
betweent the notification and the write.

### Technical Details

Whenever a module uses `RedisModule_AddPostNotificationJob` the callback is added to a list of callbacks (called `modulePostKeyspaceJobs`) that need to be invoke after the current logic ends (whether its a command, eviction, or active expire). In order to trigger those callback atomically with the notification effect, we call those callbacks on `propagatePendingCommands` just before we actually propagating the effects.

If the callback perform more operations that triggers more key space notifications. Those keys space notifications might register more callbacks. Those callbacks will be added to the end of `modulePostKeyspaceJobs` list and will be invoke atomically after the current callback ends. In order to avoid infinite loop we limit the number of jobs that can be registered in a single logic unit (command, eviction, active expire). This limitation is configurable using `maxpostnotificationsjobs` configuration value.
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commenting on the todo comment on the top

Decide whether the approach taken on this PR makes sense and if there are more edge cases we might have missed

i'm in favor of this approach

Decide whether or not we want to force modules not to write inside key space notifications

I'd like that, but it'll be a breaking change, so i think we should skip it.

Decide whether or not we want to force modules to use the new API only on key space notifications

I don't think we should bother.

Decide how to handle RM_StringDMA which is an example of read operation that might change the data

I personally think that we should allow String DMC reads (useful for huge string), but we should error (rather than corrupt memory), in case the string is not sds encoded, and document that in this case the module should resort to a non DMA read.

Consider whether or not propagatePendingCommands is the right place to fire the post notifications callbacks.

I think it's right, until the day we want to allow propagating from a non-write keyspace notification (aka, key-miss), but i think that's probably an invalid use.

Meir Shpilraien (Spielrein) and others added 2 commits September 6, 2022 14:23
@MeirShpilraien
Copy link
Author

Thanks @oranagra, I have fixed your comments / replied to them. Waiting for more comments from the community.

Copy link
Collaborator

@yossigo yossigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeirShpilraien I think this is a good direction to address the problem. I have a feeling there could be more edge cases we did not consider yet though.

I don't think we can/should prevent modules from writing inside a notification hook, clearly documenting it is enough IMHO.

@oranagra
Copy link
Member

conceptually approved in a core-team meeting. @yossigo please take a quick look to be on the safe side.


/* Declare that the module want to get nested key space notifications.
* If enabled, the module is responsible to break endless loop. */
#define REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS (1<<3)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to this PR, but I think we're missing a way for a module to determine if this option is supported or not, like RM_Get**FlagsAll.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd argue that it is related to this PR since it adds a new option and a a compatibility issue.
it's real quick to add it, i don't see a need for another PR or issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oranagra added to this PR and updated the top comment.

@oranagra
Copy link
Member

oranagra commented Nov 9, 2022

PR was conceptually approved in a core-team meeting, with the exception of #11199 (comment).
it was also approved to be eventually backported to 7.0 in due time.

@oranagra oranagra merged commit abc345a into redis:unstable Nov 24, 2022
Comment on lines +7364 to +7366
postExecutionUnitOperations();
decrRefCount(key);
propagatePendingCommands();
postExecutionUnitOperations();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeirShpilraien isn't redundant?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, certainly redundant, i wonder how it slipped in....
admittedly, i only did incremental reviews, and didn't do a top to bottom one before merging.
please make a PR to fix.

Copy link
Author

@MeirShpilraien MeirShpilraien Nov 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, totally redudent, bad merge from unstable on my side (funny, I believe I also did the PR that caused the conflict and I still missed it)
Will make a PR to remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MeirShpilraien i see sundb already made the PR: #11547

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh right, greate, thanks.

oranagra pushed a commit that referenced this pull request Nov 27, 2022
Accidentally introduced when merging unstable in #11199
madolson added a commit to madolson/redis that referenced this pull request Apr 19, 2023
…11199)

### Summary of API additions

* `RedisModule_AddPostNotificationJob` - new API to call inside a key space
  notification (and on more locations in the future) and allow to add a post job as describe above.
* New module option, `REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS`,
  allows to disable Redis protection of nested key-space notifications.
* `RedisModule_GetModuleOptionsAll` - gets the mask of all supported module options so a module
  will be able to check if a given option is supported by the current running Redis instance.

### Background

The following PR is a proposal of handling write operations inside module key space notifications.
After a lot of discussions we came to a conclusion that module should not perform any write
operations on key space notification.

Some examples of issues that such write operation can cause are describe on the following links:

* Bad replication oreder - redis#10969
* Used after free - redis#10969 (comment)
* Used after free - redis#9406 (comment)

There are probably more issues that are yet to be discovered. The underline problem with writing
inside key space notification is that the notification runs synchronously, this means that the notification
code will be executed in the middle on Redis logic (commands logic, eviction, expire).
Redis **do not assume** that the data might change while running the logic and such changes
can crash Redis or cause unexpected behaviour.

The solution is to state that modules **should not** perform any write command inside key space
notification (we can chose whether or not we want to force it). To still cover the use-case where
module wants to perform a write operation as a reaction to key space notifications, we introduce
a new API , `RedisModule_AddPostNotificationJob`, that allows to register a callback that will be
called by Redis when the following conditions hold:

* It is safe to perform any write operation.
* The job will be called atomically along side the operation that triggers it (in our case, key
  space notification).

Module can use this new API to safely perform any write operation and still achieve atomicity
between the notification and the write.

Although currently the API is supported on key space notifications, the API is written in a generic
way so that in the future we will be able to use it on other places (server events for example).

### Technical Details

Whenever a module uses `RedisModule_AddPostNotificationJob` the callback is added to a list
of callbacks (called `modulePostExecUnitJobs`) that need to be invoke after the current execution
unit ends (whether its a command, eviction, or active expire). In order to trigger those callback
atomically with the notification effect, we call those callbacks on `postExecutionUnitOperations`
(which was `propagatePendingCommands` before this PR). The new function fires the post jobs
and then calls `propagatePendingCommands`.

If the callback perform more operations that triggers more key space notifications. Those keys
space notifications might register more callbacks. Those callbacks will be added to the end
of `modulePostExecUnitJobs` list and will be invoke atomically after the current callback ends.
This raises a concerns of entering an infinite loops, we consider infinite loops as a logical bug
that need to be fixed in the module, an attempt to protect against infinite loops by halting the
execution could result in violation of the feature correctness and so **Redis will make no attempt
to protect the module from infinite loops**

In addition, currently key space notifications are not nested. Some modules might want to allow
nesting key-space notifications. To allow that and keep backward compatibility, we introduce a
new module option called `REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS`.
Setting this option will disable the Redis key-space notifications nesting protection and will
pass this responsibility to the module.

### Redis infrastructure

This PR promotes the existing `propagatePendingCommands` to an "Execution Unit" concept,
which is called after each atomic unit of execution,

Co-authored-by: Oran Agra <[email protected]>
Co-authored-by: Yossi Gottlieb <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
madolson pushed a commit to madolson/redis that referenced this pull request Apr 19, 2023
Accidentally introduced when merging unstable in redis#11199
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
…11199)

### Summary of API additions

* `RedisModule_AddPostNotificationJob` - new API to call inside a key space
  notification (and on more locations in the future) and allow to add a post job as describe above.
* New module option, `REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS`,
  allows to disable Redis protection of nested key-space notifications.
* `RedisModule_GetModuleOptionsAll` - gets the mask of all supported module options so a module
  will be able to check if a given option is supported by the current running Redis instance.

### Background

The following PR is a proposal of handling write operations inside module key space notifications.
After a lot of discussions we came to a conclusion that module should not perform any write
operations on key space notification.

Some examples of issues that such write operation can cause are describe on the following links:

* Bad replication oreder - redis#10969
* Used after free - redis#10969 (comment)
* Used after free - redis#9406 (comment)

There are probably more issues that are yet to be discovered. The underline problem with writing
inside key space notification is that the notification runs synchronously, this means that the notification
code will be executed in the middle on Redis logic (commands logic, eviction, expire).
Redis **do not assume** that the data might change while running the logic and such changes
can crash Redis or cause unexpected behaviour.

The solution is to state that modules **should not** perform any write command inside key space
notification (we can chose whether or not we want to force it). To still cover the use-case where
module wants to perform a write operation as a reaction to key space notifications, we introduce
a new API , `RedisModule_AddPostNotificationJob`, that allows to register a callback that will be
called by Redis when the following conditions hold:

* It is safe to perform any write operation.
* The job will be called atomically along side the operation that triggers it (in our case, key
  space notification).

Module can use this new API to safely perform any write operation and still achieve atomicity
between the notification and the write.

Although currently the API is supported on key space notifications, the API is written in a generic
way so that in the future we will be able to use it on other places (server events for example).

### Technical Details

Whenever a module uses `RedisModule_AddPostNotificationJob` the callback is added to a list
of callbacks (called `modulePostExecUnitJobs`) that need to be invoke after the current execution
unit ends (whether its a command, eviction, or active expire). In order to trigger those callback
atomically with the notification effect, we call those callbacks on `postExecutionUnitOperations`
(which was `propagatePendingCommands` before this PR). The new function fires the post jobs
and then calls `propagatePendingCommands`.

If the callback perform more operations that triggers more key space notifications. Those keys
space notifications might register more callbacks. Those callbacks will be added to the end
of `modulePostExecUnitJobs` list and will be invoke atomically after the current callback ends.
This raises a concerns of entering an infinite loops, we consider infinite loops as a logical bug
that need to be fixed in the module, an attempt to protect against infinite loops by halting the
execution could result in violation of the feature correctness and so **Redis will make no attempt
to protect the module from infinite loops**

In addition, currently key space notifications are not nested. Some modules might want to allow
nesting key-space notifications. To allow that and keep backward compatibility, we introduce a
new module option called `REDISMODULE_OPTIONS_ALLOW_NESTED_KEYSPACE_NOTIFICATIONS`.
Setting this option will disable the Redis key-space notifications nesting protection and will
pass this responsibility to the module.

### Redis infrastructure

This PR promotes the existing `propagatePendingCommands` to an "Execution Unit" concept,
which is called after each atomic unit of execution,

Co-authored-by: Oran Agra <[email protected]>
Co-authored-by: Yossi Gottlieb <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
Accidentally introduced when merging unstable in redis#11199
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

7 participants