Make cluster config file saving atomic and fsync acl#10924
Make cluster config file saving atomic and fsync acl#10924oranagra merged 2 commits intoredis:unstablefrom
Conversation
| if (write(fd,ci,sdslen(ci)) != (ssize_t)sdslen(ci)) goto err; | ||
|
|
||
| if (do_fsync) { | ||
| server.cluster->todo_before_sleep &= ~CLUSTER_TODO_FSYNC_CONFIG; |
There was a problem hiding this comment.
i'm not certain this (clearing CLUSTER_TODO_FSYNC_CONFIG) should still be here.
we're fsyncing a temp file.
maybe move it to after the rename and directory fsync?
on the other hand, i see we always used to clear CLUSTER_TODO_SAVE_CONFIG even before opening the file for writing, so maybe the very attempt to persist / fsync is enough to clear these flags.
There was a problem hiding this comment.
I think server.cluster->todo_before_sleep &= ~CLUSTER_TODO_FSYNC_CONFIG is only relative to do_fsync, nothing to do with the order. do_fsync indicates server.cluster->todo_before_sleep may have CLUSTER_TODO_FSYNC_CONFIG set, to avoid repeated fsync next round, we should clear CLUSTER_TODO_FSYNC_CONFIG bit in server.cluster->todo_before_sleep when do_fsync is set.
| return 0; | ||
|
|
||
| err: | ||
| if (do_fsync) { |
There was a problem hiding this comment.
i'm not to keen about the do_fsync argument and it's purpose, and whether or not it should control the directory fsync as well.
i see in some places we set both CLUSTER_TODO_SAVE_CONFIG and CLUSTER_TODO_FSYNC_CONFIG.
and in others we only set CLUSTER_TODO_SAVE_CONFIG.
so this looks right, but i'd like some confirmation.
There was a problem hiding this comment.
IMO CLUSTER_TODO_SAVE_CONFIG with CLUSTER_TODO_FSYNC_CONFIG indicates it's a critical state update, e.g. a new config epoch, slots reassignment and so on, which's failure may result in cluster not work correctly. In contrast, a single CLUSTER_TODO_SAVE_CONFIG means it better save the config, but failure doesn't matter severely, while the state can recover automatically in future, for example, the flag of a node. If I'm wrong plz @madolson correct me.
There was a problem hiding this comment.
My understanding is consistent with @uvletter, although I'll be transparent and say we don't use fsync at AWS based on our architecture, so it might have some caveats I'm not that familiar with. The CLUSTER_TODO_FSYNC_CONFIG just means that if we fail to fsync we should crash since we may no longer be in a recoverable state, so updating last ping is fine to skip but bumping epoch we need to fsync.
|
@oranagra I think this behavior change should be part of 7.2. |
|
details about why this is a behavior change: #7824 (comment) |
As an outstanding part mentioned in redis#10737, we could just make the cluster config file and ACL file saving done with a more safe and atomic pattern (write to temp file, fsync, rename, fsync dir). The cluster config file uses an in-place overwrite and truncation (which was also used by the main config file before redis#7824). The ACL file is using the temp file and rename approach, but was missing an fsync. Co-authored-by: 朱天 <[email protected]>
As an outstanding part mentioned in #10737, we could just make the cluster config file and ACL file saving done with a more safe and atomic pattern (write to temp file, fsync, rename, fsync dir).
The cluster config file uses an in-place overwrite and truncation (which was also used by the main config file before #7824).
The ACL file is using the temp file and rename approach, but was missing an fsync.