-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
cksum/hashsum: refactor the common code. #6431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Sorry for the big commit but as you can imagine, it wasn't possible to split it :( |
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
BenWiederhake
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't have the time to go though all the code itself, but the tests alone are a great starting point :D
- A filename confusion, which might allow an attacker to pass verification by simply staging a "correct" file at the confused filename
- A few missed errors about silly flag combinations
- It seems like you removed some of the flag parsing code in hashsum? I can't see where some of the flags get stored.
- Some forgotten debug prints in test code.
| pub const STATUS: &str = "status"; | ||
| pub const WARN: &str = "warn"; | ||
| pub const IGNORE_MISSING: &str = "ignore-missing"; | ||
| pub const QUIET: &str = "quiet"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is named "refactor the common code" and you implement 4 new features? XD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i had to :(
because one of them didn't implemented the option of the other
BenWiederhake
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Removing the "changes requested" flag due to administrative reasons, even though I'm still requesting the above changes.)
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
Summary of the change: * Move the common code into checksum * Create a structure HashAlgorithm to handle the algorithm (instead of the 3 variables) * Use the same function for cksum & hashsum for --check (perform_checksum_validation) * Use the same for function for the hash generation (digest_reader) * Add unit tests * Add integration tests * Fix some incorrect tests
We have 3 different kinds of input: * "algo (filename) = checksum" example: `BLAKE2 (a) = bedfbb90d858c2d67b7ee8f7523be3d3b54004ef9e4f02f2ad79a1d05bfdfe49b81e3c92ebf99b504102b6bf003fa342587f5b3124c205f55204e8c4b4ce7d7c` * "checksum filename" example: `60b725f10c9c85c70d97880dfe8191b3 a` * "checksum filename" example: `60b725f10c9c85c70d97880dfe8191b3 a` These algo/regexp are tricky as files can be called "a, " b", " ", or "*c". We look at the first time to analyze the kind of input and reuse the same regexp then.
|
GNU testsuite comparison: |
|
@cakebaker it is ready for review. :) |
|
GNU testsuite comparison: |
| const ALGO_BASED_REGEX: &str = r"^\s*\\?(?P<algo>(?:[A-Z0-9]+|BLAKE2b))(?:-(?P<bits>\d+))?\s?\((?P<filename>.*)\)\s*=\s*(?P<checksum>[a-fA-F0-9]+)$"; | ||
| const DOUBLE_SPACE_REGEX: &str = r"^(?P<checksum>[a-fA-F0-9]+)\s{2}(?P<filename>.*)$"; | ||
|
|
||
| // In this case, we ignore the * | ||
| const SINGLE_SPACE_REGEX: &str = r"^(?P<checksum>[a-fA-F0-9]+)\s(?P<filename>\*?.*)$"; | ||
|
|
||
| fn get_filename_for_output(filename: &OsStr, input_is_stdin: bool) -> String { | ||
| if input_is_stdin { | ||
| "standard input" | ||
| } else { | ||
| filename.to_str().unwrap() | ||
| } | ||
| .maybe_quote() | ||
| .to_string() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would swap the consts and the get_filename_for_output function so that the consts are next to the determine_regex function.
| filename: &OsStr, | ||
| input_is_stdin: bool, | ||
| lines: &[String], | ||
| ) -> UResult<(Regex, bool)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of this return type and you already have to workaround it in the perform_checksum_validation function ;-) But I think fixing it is something for a future PR.
| #[allow(clippy::too_many_arguments)] | ||
| #[allow(clippy::cognitive_complexity)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very true :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i will improve this in a future PR :)
| Some(Some(bits_value / 8)) | ||
| } else { | ||
| properly_formatted = false; | ||
| None // Return None to signal a parsing or divisibility issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in case of a parse issue you don't reach this code.
| None // Return None to signal a parsing or divisibility issue | |
| None // Return None to signal a divisibility issue |
| // When a specific algorithm name is input, use it and default bits to None | ||
| (a.to_lowercase(), length_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the last part of the comment, "and default bits to None", is incorrect.
| util_name(), | ||
| filename_input.maybe_quote(), | ||
| ); | ||
| //skip_summary = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| //skip_summary = true; |
| } | ||
| } | ||
|
|
||
| /// Calculates the length of the digest for the given algorithm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Calculates the length of the digest for the given algorithm. | |
| /// Calculates the length of the digest. |
| //check: bool, | ||
| tag: bool, | ||
| nonames: bool, | ||
| status: bool, | ||
| quiet: bool, | ||
| strict: bool, | ||
| warn: bool, | ||
| //status: bool, | ||
| //quiet: bool, | ||
| //strict: bool, | ||
| //warn: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for commenting them out instead of removing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because i will implement them next
src/uu/hashsum/src/hashsum.rs
Outdated
| //let strict = matches.get_flag("strict"); | ||
| let warn = matches.get_flag("warn") && !status; | ||
| let zero = matches.get_flag("zero"); | ||
| let zero: bool = matches.get_flag("zero"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let zero: bool = matches.get_flag("zero"); | |
| let zero = matches.get_flag("zero"); |
|
|
||
| let reader = BufReader::new(file); | ||
| let lines: Vec<String> = reader.lines().collect::<Result<_, _>>()?; | ||
| let (chosen_regex, algo_based_format) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename algo_based_format to something like is_algo_based_format so that it's obvious it's a bool.
src/uu/hashsum/src/hashsum.rs
Outdated
|
|
||
| let length = match input_length { | ||
| Some(length) => { | ||
| if binary_name == ALGORITHM_OPTIONS_BLAKE2B || binary_name == "b2sum" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
binary_name == ALGORITHM_OPTIONS_BLAKE2B is always false because of line 186 ;-)
src/uu/hashsum/src/hashsum.rs
Outdated
| let text_flag: bool = matches.get_flag("text"); | ||
| let binary_flag: bool = matches.get_flag("binary"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let text_flag: bool = matches.get_flag("text"); | |
| let binary_flag: bool = matches.get_flag("binary"); | |
| let text_flag = matches.get_flag("text"); | |
| let binary_flag = matches.get_flag("binary"); |
src/uu/hashsum/src/hashsum.rs
Outdated
| // Determine the appropriate algorithm option to pass | ||
| let algo_option = if algo.name.is_empty() { | ||
| None | ||
| } else { | ||
| Some(algo.name) | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible I'm missing something, but I think algo.name is never empty and thus the algo_option var is unnecessary.
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
Good work :) |
Summary of the change: