cksum/hashsum: refactor the common code. #6431

sylvestre · 2024-05-25T07:39:10Z

Summary of the change:

Move the common code into checksum
Create a structure HashAlgorithm to handle the algorithm (instead of the 3 variables)
Use the same function for cksum & hashsum for --check (perform_checksum_validation)
Use the same function for the hash generation (digest_reader)
Add unit tests
Add integration tests
Fix some incorrect tests
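As a rough illustration of the second and fourth points, here is a minimal sketch of how a single structure can stand in for three loose values, and how one reader-based function can serve every algorithm. It assumes the three values are a name, a digest constructor and a bit length; the field names, the `Digest` trait, the toy `Sum8` algorithm and `digest_reader_sketch` are all hypothetical stand-ins, not the code from this PR.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a digest implementation.
trait Digest {
    fn update(&mut self, data: &[u8]);
    fn finish(&mut self) -> String;
}

/// Toy 8-bit additive checksum, used only to keep the sketch self-contained.
struct Sum8(u8);

impl Digest for Sum8 {
    fn update(&mut self, data: &[u8]) {
        for byte in data {
            self.0 = self.0.wrapping_add(*byte);
        }
    }

    fn finish(&mut self) -> String {
        format!("{:02x}", self.0)
    }
}

/// One value instead of three separate variables (assumed fields).
struct HashAlgorithm {
    name: &'static str,
    create_fn: Box<dyn Fn() -> Box<dyn Digest>>,
    bits: usize,
}

/// Feeds a reader through a digest in fixed-size chunks and returns the hex result.
fn digest_reader_sketch(digest: &mut dyn Digest, reader: &mut impl Read) -> io::Result<String> {
    let mut buf = [0u8; 4096];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        digest.update(&buf[..n]);
    }
    Ok(digest.finish())
}

/// Builds the description of the toy algorithm.
fn sum8_algorithm() -> HashAlgorithm {
    HashAlgorithm {
        name: "sum8",
        create_fn: Box::new(|| Box::new(Sum8(0)) as Box<dyn Digest>),
        bits: 8,
    }
}
```

With this shape, a caller would build the algorithm once, call `(algo.create_fn)()` to get a fresh digest per file, and pass it to the shared reader function, so both utilities can go through the same path.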

sylvestre · 2024-05-25T07:51:30Z

Sorry for the big commit but as you can imagine, it wasn't possible to split it :(

github-actions · 2024-05-25T09:14:35Z

GNU testsuite comparison:

GNU test failed: tests/cksum/md5sum. tests/cksum/md5sum is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/md5sum-bsd. tests/cksum/md5sum-bsd is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/sha1sum. tests/cksum/sha1sum is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/tail/inotify-dir-recreate (fails in this run but passes in the 'main' branch)

Cargo.lock

tests/by-util/test_cksum.rs

tests/by-util/test_hashsum.rs

github-actions · 2024-05-25T22:57:13Z

GNU testsuite comparison:

GNU test failed: tests/cksum/md5sum. tests/cksum/md5sum is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/md5sum-bsd. tests/cksum/md5sum-bsd is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/sha1sum. tests/cksum/sha1sum is passing on 'main'. Maybe you have to rebase?

BenWiederhake

I didn't have the time to go through all the code itself, but the tests alone are a great starting point :D

A filename confusion, which might allow an attacker to pass verification by simply staging a "correct" file at the confused filename
A few missed errors about silly flag combinations
It seems like you removed some of the flag parsing code in hashsum? I can't see where some of the flags get stored.
Some forgotten debug prints in test code.

BenWiederhake · 2024-05-26T14:47:42Z

src/uu/cksum/src/cksum.rs

+    pub const STATUS: &str = "status";
+    pub const WARN: &str = "warn";
+    pub const IGNORE_MISSING: &str = "ignore-missing";
+    pub const QUIET: &str = "quiet";


The PR is named "refactor the common code" and you implement 4 new features? XD

yeah, i had to :(
because one of them didn't implement the options of the other

tests/by-util/test_cksum.rs

tests/by-util/test_hashsum.rs

BenWiederhake

(Removing the "changes requested" flag due to administrative reasons, even though I'm still requesting the above changes.)

github-actions · 2024-05-26T16:45:42Z

GNU testsuite comparison:

GNU test failed: tests/cksum/cksum-c. tests/cksum/cksum-c is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/md5sum. tests/cksum/md5sum is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/cksum/md5sum-bsd. tests/cksum/md5sum-bsd is passing on 'main'. Maybe you have to rebase?

github-actions · 2024-05-28T23:53:15Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)


We have 3 different kinds of input:
* "algo (filename) = checksum" example: `BLAKE2 (a) = bedfbb90d858c2d67b7ee8f7523be3d3b54004ef9e4f02f2ad79a1d05bfdfe49b81e3c92ebf99b504102b6bf003fa342587f5b3124c205f55204e8c4b4ce7d7c`
* "checksum  filename" (two spaces) example: `60b725f10c9c85c70d97880dfe8191b3  a`
* "checksum filename" (single space) example: `60b725f10c9c85c70d97880dfe8191b3 a`

These algo/regexp are tricky as files can be called "a", " b", " ", or "*c". We look at the first line to analyze the kind of input and then reuse the same regexp.
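The three input kinds can be told apart roughly as follows. This is a std-only sketch that does not use the regular expressions from the PR; `LineFormat`, `is_hex` and `classify` are hypothetical names, and dropping a leading `*` in the single-space case is an assumption based on the "we ignore the *" comment in the diff.

```rust
#[derive(Debug, PartialEq)]
enum LineFormat {
    AlgoBased,
    DoubleSpace,
    SingleSpace,
}

/// True if `s` is a non-empty run of hexadecimal digits.
fn is_hex(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_hexdigit())
}

/// Returns the detected format together with the filename the line names.
fn classify(line: &str) -> Option<(LineFormat, &str)> {
    // "ALGO (filename) = checksum": use the first " (" and the last ") = ",
    // so that parentheses inside the filename do not confuse the split.
    if let (Some(open), Some(close)) = (line.find(" ("), line.rfind(") = ")) {
        let algo = &line[..open];
        let algo_ok = !algo.is_empty()
            && algo.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
        if open + 2 <= close && algo_ok && is_hex(&line[close + 4..]) {
            return Some((LineFormat::AlgoBased, &line[open + 2..close]));
        }
    }

    // Otherwise the line must start with a hex checksum followed by a space.
    let (checksum, rest) = line.split_once(' ')?;
    if !is_hex(checksum) {
        return None;
    }
    match rest.strip_prefix(' ') {
        // Two spaces: everything after them is the filename, even " b" or "*c".
        Some(filename) => Some((LineFormat::DoubleSpace, filename)),
        // One space: a leading '*' is treated as a marker, not part of the name.
        None => Some((
            LineFormat::SingleSpace,
            rest.strip_prefix('*').unwrap_or(rest),
        )),
    }
}
```

For example, `classify("60b725f10c9c85c70d97880dfe8191b3   b")` (three spaces) yields the double-space format with the filename `" b"`, which is the kind of case that makes the real patterns delicate.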

github-actions · 2024-05-29T07:39:07Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

sylvestre · 2024-06-01T08:36:59Z

@cakebaker it is ready for review. :)
thanks

github-actions · 2024-06-02T19:46:56Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

cakebaker · 2024-06-03T07:07:54Z

src/uucore/src/lib/features/checksum.rs

+const ALGO_BASED_REGEX: &str = r"^\s*\\?(?P<algo>(?:[A-Z0-9]+|BLAKE2b))(?:-(?P<bits>\d+))?\s?\((?P<filename>.*)\)\s*=\s*(?P<checksum>[a-fA-F0-9]+)$";
+const DOUBLE_SPACE_REGEX: &str = r"^(?P<checksum>[a-fA-F0-9]+)\s{2}(?P<filename>.*)$";
+
+// In this case, we ignore the *
+const SINGLE_SPACE_REGEX: &str = r"^(?P<checksum>[a-fA-F0-9]+)\s(?P<filename>\*?.*)$";
+
+fn get_filename_for_output(filename: &OsStr, input_is_stdin: bool) -> String {
+    if input_is_stdin {
+        "standard input"
+    } else {
+        filename.to_str().unwrap()
+    }
+    .maybe_quote()
+    .to_string()
+}


I would swap the consts and the get_filename_for_output function so that the consts are next to the determine_regex function.

cakebaker · 2024-06-03T07:21:32Z

src/uucore/src/lib/features/checksum.rs

+    filename: &OsStr,
+    input_is_stdin: bool,
+    lines: &[String],
+) -> UResult<(Regex, bool)> {


I'm not a fan of this return type and you already have to work around it in the perform_checksum_validation function ;-) But I think fixing it is something for a future PR.

cakebaker · 2024-06-03T08:14:58Z

src/uucore/src/lib/features/checksum.rs

+#[allow(clippy::too_many_arguments)]
+#[allow(clippy::cognitive_complexity)]


Very true :)

yeah, i will improve this in a future PR :)

src/uucore/src/lib/features/checksum.rs

cakebaker · 2024-06-03T08:37:03Z

src/uucore/src/lib/features/checksum.rs

+                            Some(Some(bits_value / 8))
+                        } else {
+                            properly_formatted = false;
+                            None // Return None to signal a parsing or divisibility issue


I think in case of a parse issue you don't reach this code.

Suggested change

-                            None // Return None to signal a parsing or divisibility issue
+                            None // Return None to signal a divisibility issue
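The point of this comment is that a parse failure and a divisibility failure are two different outcomes, and only the second reaches the branch in question. A minimal sketch of that distinction, assuming the bit length arrives as a string; `BitsError` and `bits_to_bytes` are hypothetical names and not part of the PR:

```rust
#[derive(Debug, PartialEq)]
enum BitsError {
    /// The text is not a valid unsigned number.
    NotANumber,
    /// The number parsed fine but is not a whole number of bytes.
    NotMultipleOfEight,
}

/// Converts a bit length such as "256" into a byte length such as 32.
fn bits_to_bytes(bits: &str) -> Result<usize, BitsError> {
    // Parsing is handled first, so later code never sees unparsable input.
    let value: usize = bits.parse().map_err(|_| BitsError::NotANumber)?;
    if value % 8 == 0 {
        Ok(value / 8)
    } else {
        Err(BitsError::NotMultipleOfEight)
    }
}
```

Keeping the two cases apart in the type makes a comment like the one above unnecessary, since each branch can only signal one kind of problem.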

cakebaker · 2024-06-03T08:46:35Z

src/uucore/src/lib/features/checksum.rs

+                    // When a specific algorithm name is input, use it and default bits to None
+                    (a.to_lowercase(), length_input)


I think the last part of the comment, "and default bits to None", is incorrect.

cakebaker · 2024-06-03T08:53:59Z

src/uucore/src/lib/features/checksum.rs

+                util_name(),
+                filename_input.maybe_quote(),
+            );
+            //skip_summary = true;


Suggested change

-            //skip_summary = true;

cakebaker · 2024-06-03T08:57:06Z

src/uucore/src/lib/features/checksum.rs

+    }
+}
+
+/// Calculates the length of the digest for the given algorithm.


Suggested change

-/// Calculates the length of the digest for the given algorithm.
+/// Calculates the length of the digest.

cakebaker · 2024-06-03T09:07:14Z

src/uu/hashsum/src/hashsum.rs

+    //check: bool,
    tag: bool,
    nonames: bool,
-    status: bool,
-    quiet: bool,
-    strict: bool,
-    warn: bool,
+    //status: bool,
+    //quiet: bool,
+    //strict: bool,
+    //warn: bool,


What's the reason for commenting them out instead of removing?

because i will implement them next

cakebaker · 2024-06-03T09:11:12Z

src/uu/hashsum/src/hashsum.rs

+    //let strict = matches.get_flag("strict");
    let warn = matches.get_flag("warn") && !status;
-    let zero = matches.get_flag("zero");
+    let zero: bool = matches.get_flag("zero");


Suggested change

-    let zero: bool = matches.get_flag("zero");
+    let zero = matches.get_flag("zero");

cakebaker · 2024-06-03T09:25:11Z

src/uucore/src/lib/features/checksum.rs

+
+        let reader = BufReader::new(file);
+        let lines: Vec<String> = reader.lines().collect::<Result<_, _>>()?;
+        let (chosen_regex, algo_based_format) =


I would rename algo_based_format to something like is_algo_based_format so that it's obvious it's a bool.

cakebaker · 2024-06-03T12:38:22Z

src/uu/hashsum/src/hashsum.rs

+
+    let length = match input_length {
+        Some(length) => {
+            if binary_name == ALGORITHM_OPTIONS_BLAKE2B || binary_name == "b2sum" {


binary_name == ALGORITHM_OPTIONS_BLAKE2B is always false because of line 186 ;-)

cakebaker · 2024-06-03T12:42:55Z

src/uu/hashsum/src/hashsum.rs

+        let text_flag: bool = matches.get_flag("text");
+        let binary_flag: bool = matches.get_flag("binary");


Suggested change

-        let text_flag: bool = matches.get_flag("text");
-        let binary_flag: bool = matches.get_flag("binary");
+        let text_flag = matches.get_flag("text");
+        let binary_flag = matches.get_flag("binary");

src/uu/hashsum/src/hashsum.rs

cakebaker · 2024-06-03T13:10:56Z

src/uu/hashsum/src/hashsum.rs

+        // Determine the appropriate algorithm option to pass
+        let algo_option = if algo.name.is_empty() {
+            None
+        } else {
+            Some(algo.name)
+        };


It's possible I'm missing something, but I think algo.name is never empty and thus the algo_option var is unnecessary.

src/uu/cksum/src/cksum.rs

github-actions · 2024-06-03T17:12:06Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

github-actions · 2024-06-03T19:20:55Z

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)

cakebaker · 2024-06-04T07:06:46Z

Good work :)

sylvestre requested review from BenWiederhake and cakebaker May 25, 2024 07:50

sylvestre marked this pull request as draft May 25, 2024 11:00

cakebaker reviewed May 25, 2024

Cargo.lock

cakebaker reviewed May 25, 2024

tests/by-util/test_cksum.rs (outdated)

cakebaker reviewed May 25, 2024

tests/by-util/test_hashsum.rs (outdated)

cakebaker reviewed May 25, 2024

tests/by-util/test_hashsum.rs (outdated)

cakebaker reviewed May 25, 2024

tests/by-util/test_hashsum.rs (outdated)

BenWiederhake mentioned this pull request May 26, 2024

hashsum: Should not accept --strict without --check #6436

Closed

BenWiederhake requested changes May 26, 2024

BenWiederhake approved these changes May 26, 2024

sylvestre added 10 commits May 29, 2024 09:08

cksum/hashsum: factor the error structure and use it more

0882eea

hashsum: handle the case when md5sum is used but the file contains a …

89b7a1a

…different algo

hashsum: Implement the quiet mode

6acc8e6

cksum/hashsum: try to detect the format faster the first line

84d90fc

cksum/hashsum: manage the '*' start correctly

1cf6700

cksum/hashsum: improve the display of errors

bf8b0df

cksum/hashsum: create a new error type & use it

193a81b

cksum/hashsum: fix clippy warnings

6e06c2a

sylvestre marked this pull request as ready for review June 1, 2024 08:36

cksum/hashsum: add some words in the spell skip

b1b6f28

cakebaker reviewed Jun 3, 2024

src/uucore/src/lib/features/checksum.rs (outdated)

cakebaker reviewed Jun 3, 2024

src/uu/hashsum/src/hashsum.rs (outdated)

cakebaker reviewed Jun 3, 2024

src/uu/cksum/src/cksum.rs (outdated)

cakebaker reviewed Jun 3, 2024

src/uu/cksum/src/cksum.rs (outdated)

cakebaker reviewed Jun 3, 2024

src/uu/cksum/src/cksum.rs (outdated)

cakebaker reviewed Jun 3, 2024

src/uu/cksum/src/cksum.rs (outdated)

cksum/hashsum: improve the tests and wording

1cbb4d9

sylvestre requested a review from cakebaker June 3, 2024 18:53

cakebaker merged commit f56e121 into uutils:main Jun 4, 2024

sylvestre deleted the refactor-hashsum-cksum branch June 4, 2024 19:50

BenWiederhake mentioned this pull request Jun 10, 2024

hashsum: should refuse to run with contradictory bitlengths #6459

Open

BrewTestBot mentioned this pull request Jun 23, 2024

uutils-coreutils 0.0.27 Homebrew/homebrew-core#175400

Merged

moonfruit mentioned this pull request Jun 24, 2024

uutils-selected 0.0.27 moonfruit/homebrew-tap#114

Merged
