[Merged by Bors] - Write validator definitions atomically#2338
[Merged by Bors] - Write validator definitions atomically#2338michaelsproul wants to merge 2 commits intosigp:unstablefrom
Conversation
9d60924 to
75da0ed
Compare
|
MVP for reproducing the bug: use std::fs::File;
use std::io::Write;
fn main() {
let mut f1 = File::create("/lol/good_file.txt").expect("creating good file failed");
let mut f2 = File::create("/lol/junk_file.txt").expect("creating bad file failed");
let megabyte_of_crap = vec![0xff; 1024 * 1024];
f2.write_all(&megabyte_of_crap)
.expect("writing bad file failed");
f1.write_all(b"This is going to go missing :(")
.expect("writing good file failed");
}For use with a 1MB tempfs mounted on If you create I used the same tempfs setup to simulate a validator datadir running out of space, and was not able to reproduce the bug using the updated code, implying that it is fixed. |
|
I've pushed commit 9d60924 to the Pyrmont nodes and the VCs restarted twice without issue (once from old->new, once from new->new). |
paulhauner
left a comment
There was a problem hiding this comment.
This generally looks good, but I have a comment regarding permissions :)
| create_with_600_perms(&temp_path, &bytes).map_err(Error::UnableToWriteTempFile)?; | ||
|
|
||
| // With the temporary file created, perform an atomic rename. | ||
| fs::rename(&temp_path, &config_path).map_err(Error::UnableToRenameFile)?; |
There was a problem hiding this comment.
Although I haven't tested directly, I think this will technically be a breaking change since we will replace the current perms of the non-temp file with 600.
In effect, Lighthouse will continuously force its own file permissions upon that file. I'm not sure this is necessarily a bad thing, but perhaps it might hurt people with fancy setups?
Copying the permissions of the non-temp file to the temp file seems like the least-breaky approach, but I imagine that might get complex in a cross-platform world. Perhaps you've considered this already?
There was a problem hiding this comment.
Ah, good point. I'll look into this further tomorrow.
@ethDreamer if you have a sec and would like to chime in on the windows situation that would be great (but no worries if you're busy).
There was a problem hiding this comment.
I think I've found a good solution for the permissions that will work cross platform, but doesn't require us to block this PR on Windows support. The function write_file_via_temporary uses fs::copy to copy from primary file to temp (which copies permissions), writes to the temp file, and then moves it into place.
paulhauner
left a comment
There was a problem hiding this comment.
Looks good to me, nice fix!
There's a potential for some weirdness during concurrent execution, but that's not introduced by this PR nor do I think it poses a material risk.
Merge at will!
|
Thank you! bors r+ |
## Issue Addressed Closes #2159 ## Proposed Changes Rather than trying to write the validator definitions to disk directly, use a temporary file called `.validator_defintions.yml.tmp` and then atomically rename it to `validator_definitions.yml`. This avoids truncating the primary file, which can cause permanent damage when the disk is full. The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords. ## Additional Info * `File::create` truncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.create * `fs::rename` uses `rename` on UNIX and `MoveFileEx` on Windows: https://doc.rust-lang.org/std/fs/fn.rename.html * UNIX `rename` call is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fs * Windows `MoveFileEx` is _not_ atomic in general, and Windows lacks any clear API for atomic file renames :( https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows ## Further Work * Consider whether we want to try a different Windows syscall as part of #2333. The `rust-atomicwrites` crate seems promising, but actually uses the same syscall under the hood presently: untitaker/rust-atomicwrites#27.
|
Pull request successfully merged into unstable. Build succeeded: |
Issue Addressed
Closes #2159
Proposed Changes
Rather than trying to write the validator definitions to disk directly, use a temporary file called
.validator_defintions.yml.tmpand then atomically rename it tovalidator_definitions.yml. This avoids truncating the primary file, which can cause permanent damage when the disk is full.The same treatment is also applied to the validator key cache, although the situation is less dire if it becomes corrupted because it can just be deleted without the user having to reimport keys or resupply passwords.
Additional Info
File::createtruncates upon opening: https://doc.rust-lang.org/std/fs/struct.File.html#method.createfs::renameusesrenameon UNIX andMoveFileExon Windows: https://doc.rust-lang.org/std/fs/fn.rename.htmlrenamecall is atomic: https://unix.stackexchange.com/questions/322038/is-mv-atomic-on-my-fsMoveFileExis not atomic in general, and Windows lacks any clear API for atomic file renames :(https://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows
Further Work
rust-atomicwritescrate seems promising, but actually uses the same syscall under the hood presently: MoveFileEx might be not atomic untitaker/rust-atomicwrites#27.