builtins.readFile: do not truncate content#3546
Merged
edolstra merged 3 commits intoNixOS:masterfrom May 6, 2020
Merged
Conversation
This closes NixOS#3026 by allowing `builtins.readFile` to read a file with a wrongly reported file size, for example, files in `/proc` may report a file size of 0. Reading file in `/proc` is not a good enough motivation, however I do think it just makes nix more robust by allowing more file to be read. Especially, I do considerer the previous behavior to be dangerous because nix was previously reading truncated files. Examples of file system which incorrectly report file size may be network file system or dynamic file system (for performance reason, a dynamic file system such as FUSE may generate the content of the file on demand). ``` nix-repl> builtins.readFile "/proc/version" "" ``` With this commit: ``` nix-repl> builtins.readFile "/proc/version" "Linux version 5.6.7 (nixbld@localhost) (gcc version 9.3.0 (GCC)) NixOS#1-NixOS SMP Thu Apr 23 08:38:27 UTC 2020\n" ``` Here is a summary of the behavior changes: - If the reported size is smaller, previous implementation was silently returning a truncated file content. The new implementation is returning the correct file content. - If a file had a bigger reported file size, previous implementation was failing with an exception, but the new implementation is returning the correct file content. This change of behavior is coherent with this pull request. Open questions - The behavior is unchanged for correctly reported file size, however performances may vary because it uses the more complex sink interface. Considering that sink is used a lot, I don't think this impacts the performance a lot. - `builtins.readFile` on an infinite file, such as `/dev/random` may fill the memory. - it does not support adding file to store, such as `${/proc/version}`.
Member
|
Note that this makes the where There does appear to be a small but measurable speed penalty btw, e.g. running Don't know if it's because of the increased number of system calls or having to reallocate the std::string contents. |
Contributor
Author
|
@edolstra how do you run the performance benchmark? I'll try to see if the I'll remove the |
Now it is always `drain` (see previous commit).
When used with `readFile`, we have a pretty good heuristic of the file size, so `reserve` this in the `string`. This will save some allocation / copy when the string is growing.
Contributor
Author
|
@edolstra |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This closes #3026 by allowing
builtins.readFileto read a file with awrongly reported file size, for example, files in
/procmay report afile size of 0. Reading file in
/procis not a good enough motivation,however I do think it just makes nix more robust by allowing more file
to be read. Especially, I do considerer the previous behavior to be
dangerous because nix was previously reading truncated files. Examples
of file system which incorrectly report file size may be network file
system or dynamic file system (for performance reason, a dynamic file
system such as FUSE may generate the content of the file on demand).
With this commit:
Here is a summary of the behavior changes:
If the reported size is smaller, previous implementation
was silently returning a truncated file content. The new implementation
is returning the correct file content.
If a file had a bigger reported file size, previous implementation was
failing with an exception, but the new implementation is returning the
correct file content. This change of behavior is coherent with this pull
request.
Open questions
performances may vary because it uses the more complex sink interface.
Considering that sink is used a lot, I don't think this impacts the
performance a lot.
builtins.readFileon an infinite file, such as/dev/randommayfill the memory.
${/proc/version}.