Skip to content

Conversation

@Joining7943
Copy link
Contributor

@Joining7943 Joining7943 commented Jan 22, 2023

Like discussed in #4261, this pr restructures UCommand to run a command in a shell context (sh -c on unix systems and cmd /C on windows systems) per default.

A little bit of a refactoring is needed, to not break the other functions in UCommand like arg, args etc. and store the arguments and environment variables in UCommand itself instead of delegating them to the underlying std::process::Command.

Other fixes/changes in UCommand:

  • Remove the raw field and build the std::process::Command in run_no_wait or more specifically UCommand::build instead.
  • Use OsStrings and PathBufs for paths in UCommand instead of converting to str or String back and forth.
  • Remove the comm_string field and implement Display for UCommand.
  • Add a new public method UCommand::current_dir to set the working directory of the UCommand
  • Add a new public method keep_env to stop the default behavior of clearing the environment.
  • The UCommand::args method generated the wrong assertion message MULTIPLE_STDIN_MEANINGLESS instead of ALREADY_RUN. The assertions for self.has_run in arg, args, env were removed and only the one in run_no_wait is kept.
  • Rename with_limit to just limit to integrate itself better in the general naming scheme in UCommand
  • Bump the prlimit version to be able to apply the limits on android platforms too.

Changes in TestScenario:

  • Refactor methods into a simpler structure
  • Remove the methods ucmd_keepenv and cmd_keepenv. Make use of UCommand::keep_env instead.

Changes in the util module in general:

  • Introduce a global constant TESTS_BINARY which holds the platform and profile specific path to the coreutils binary, so it can be used in all tests.

Short summary of tests that needed to be adjusted:

  • In general use UCommand::keep_env instead of the removed TestScenario::ucmd_keepenv and TestScenario::cmd_keepenv
  • tests/test: test_invalid_utf8_integer_compare now uses UCommand::args instead of UCommand::raw::arg
  • tests/chmod: Use Display of UCommand instead of Debug of UCommand.
  • tests/cp: Do not use the UCommand::new method to set the working directy. Use TestScenario::ucmd instead, together with UCommand::current_dir to set the working.directory.
  • tests/pwd: Use UCommand::current_dir instead of UCommand::raw::current_dir
  • tests/test: Do not use the UCommand::raw field but UCommand::args with OsStrs

@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/misc/timeout is no longer failing!

1 similar comment
@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/misc/timeout is no longer failing!

@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from db5e0bf to 39546ea Compare January 24, 2023 18:41
@Joining7943 Joining7943 changed the title tests/util: Add possibility to run UCommand in a shell platform independently tests/util: Small Refactor/Fixes of UCommand and add method to run a UCommand in a shell platform independently Jan 24, 2023
@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 39546ea to 8454c55 Compare January 24, 2023 19:29
@Joining7943 Joining7943 marked this pull request as ready for review January 24, 2023 20:10
@Joining7943
Copy link
Contributor Author

This pr is ready for review but not for merge. The prs, this branch is based on need to be merged first and I'll do a final squash.

@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from d135169 to 3dcc1c2 Compare January 25, 2023 19:32
@Joining7943
Copy link
Contributor Author

I have to add that I'm not very satisfied with the UCommand::use_shell method. Considering that UCommand has no proper default for bin_path (coreutils without a util_name doesn't make much sense) I found it inviting to push this pr even further and use a shell as default if no bin_path was given.

This approach makes the use_shell method redundant, pushes the design of UCommand into a more flexible builder with defaults fitting for our use cases and UCommand better usable, even without a TestScenario. However, this approach also introduces some additional refactorings and changes of UCommand and also TestScenario. Please tell me, if you're ok with reviewing the pr with these additional changes. If not, I can break out these additional changes into a following pr.

@jtracey
Copy link
Contributor

jtracey commented Jan 29, 2023

FWIW, the lack of Android support for prlimit seems like an upstream bug. It's a wrapper for kernel functionality, and the kernel is the same. I've confirmed that the syscall does work on Android just fine, here it is in rust's aarch64-linux-android libc.

(Edit to add: IMO, this project should have a comment any time there is one of target_os = "linux" or target_os = "android" (and not the other) describing why, but that's probably a discussion for another issue.)

@jtracey
Copy link
Contributor

jtracey commented Jan 29, 2023

Filed a PR upstream for the Android bug: Nugine/rlimit/pull/46

@Joining7943
Copy link
Contributor Author

Filed a PR upstream for the Android bug: Nugine/rlimit#46

That's great! Thanks. I'll integrate the changes of the new release of rlimit.

@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 9c93eb3 to 8249ff3 Compare January 30, 2023 08:18
@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 8249ff3 to 83704ab Compare February 15, 2023 19:50
Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is close to done. Just a couple of points:

  1. Could you reconsider what the public API should be (I'd like it to be as small as possible)?
  2. I'd like the git history to be a bit more compressed.
  3. There are some TODOs left in the code that should be addressed.

Comment on lines 1240 to 1232
/// Set the execution binary.
///
/// Make sure the binary found at this path is executable. It's safest to provide the
/// canonicalized path instead of just the name of the executable, since path resolution is not
/// guaranteed to work on all platforms.
pub fn bin_path<T>(&mut self, bin_path: T) -> &mut Self
where
T: Into<PathBuf>,
{
self.bin_path = Some(bin_path.into());
self
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these setters necessary (also util, temp_dir)? Or can we maybe not make them pub?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

util could be private, with_util and from_test_scenario should be sufficient. As far as I can see, temp_dir could be private too if you think it's not useful to be pub. Same for bin_path (TestScenario::cmd provides the same functionality but then the only way to set a custom binary would be via TestScenario). current_dir is actually useful, since it removes the need for access to the raw std::process::Command. Compared to the public api of Command this is the last missing method and it's in use in test_cp and test_pwd.

We could reduce the public api of TestScenario too. ucmd_keepenv and cmd_keepenv are much less useful, since there's keep_env in UCommand, now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great!

@Joining7943
Copy link
Contributor Author

Joining7943 commented Feb 16, 2023

  1. I'd like the git history to be a bit more compressed.
  2. There are some TODOs left in the code that should be addressed.

No problem, I'll squash the commits down to a minimum in the end. I'm also cleaning up the TODOs. What do you think about renaming with_limits to just limits? It's not the most important thing, but in my opinion it's less confusing, since the with prefix actually implies that the return value is a new UCommand.

@tertsdiepraam
Copy link
Member

I like removing the with_, keeps it more consistent with the rest of the functions too. Shouldn't it be limit though since you're adding a single limit to the Vec?

@Joining7943
Copy link
Contributor Author

I like removing the with_, keeps it more consistent with the rest of the functions too.

That's another good point!

Shouldn't it be limit though since you're adding a single limit to the Vec?

Right, it should be just limit.

@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 83704ab to 7ce9e50 Compare February 16, 2023 23:23
@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/misc/timeout is no longer failing!

@tertsdiepraam
Copy link
Member

Just ping me when it's ready for another review!

…ce constant TESTS_BINARY.

Summary of changes in tests/util:

* Introduce global constant TESTS_BINARY holding the path to `coreutils`
* Implement running an arbitrary command in a sh or cmd shell per default.
* Implement std::fmt::Display for UCommand.
* Change usages of UCommand::util_name from &Option<OsStr> to Option<OsStr>
* `UCommand::new_from_tmp`: Use OsStr directly instead of TempDir -> String -> OsStr
* Collect arguments in `UCommand::args` field
* Build environment variables in `UCommand` itself instead of `std::process::Command`
* Move building of std::process:Command from fields in UCommand to own method `build`
* Remove assertions of UCommand::has_run in arg, args and env.

Summary of changes in tests/by-util:

* Remove usages of UCommand::raw. Fix tests to use UCommand::to_string.
* test_test: Adjust test_invalid_utf8_integer_compare to use `UCommand::args`
* test_chmod: run_single_test
* test_pwd: symlinked_env
* test_cp:
    Fix the usage of &Option<OsStr> in `test_src_base_dot`
    Refactor test_src_base_dot to not use UCommand::new but ts.ucmd() instead
@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 7ce9e50 to 22b18ed Compare February 17, 2023 16:50
@Joining7943
Copy link
Contributor Author

@tertsdiepraam ok, deal :) ping.

@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/misc/tee is no longer failing!
Congrats! The gnu test tests/misc/timeout is no longer failing!

Comment on lines 1362 to 1383
// neither `bin_path` nor `util_name` was set so we apply the default to run the arguments
// in a shell platform specific
Copy link
Member

@tertsdiepraam tertsdiepraam Feb 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems to be unfinished? Also maybe we want to panic if neither was set? I'm also wondering why the bin_path would not be set? Can't we just make it a PathBuf and force it to be set in any constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is actually finished. It's just a helping comment to guide the quick reader a little bit through the build method.

Also maybe we want to panic if neither was set? I'm also wondering why the bin_path would not be set? Can't we just make it a PathBuf and force it to be set in any constructor?

The build method is built on the concept to collect all the information given by the user up to this call of build and then apply the defaults appropriately. So, bin_path is always set (a few lines later, you'll find another helping comment above the unwrap which points this out). The application of defaults only in build simplifies the building process and keeps UCommand as flexible as possible (also for future changes). I think it's much clearer to apply the defaults in a single, dedicated method instead of spreading the setting of defaults over the init methods and the build method.

I'll place a doc comment above the build method which will describe this strategy in detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this answer your questsions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I understand the comment now. It was the word order that tripped me up. "in a platform specific shell" might be clearer. I understand the builder, but I just think the command name should always be given in the constructor of the builder. If we do that the bin_name does not need to be an option anymore.

Copy link
Member

@tertsdiepraam tertsdiepraam Feb 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would splitting the build method into logically related smaller methods help, too?

No the build method is fine!

In short, we have a good default for a non-existent bin_path, Command doesn't.

Well, that's the thing, I don't want there to be a default, because I don't see the point of doing this. It would be like calling a function but not specifying the function name but only the arguments, wouldn't it? It seems to me that this is a clear user error and we can prevent that error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, if you still disagree it's not a big deal and we can move on.

Copy link
Member

@tertsdiepraam tertsdiepraam Feb 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to give an example. If I understand this PR correctly, this would be valid:

UCommand::new()
    .arg("foo")
    .arg("bar")
    .run();

Under the hood, this would become sh -c foo bar, so that's the same as calling foo bar which the shell option. That's very weird to me and I'd much rather have this be an error (and preferably at compile time). Also remember that flexibility is not really the point here. We're building these tools for a very specific purpose, so we get to tailor it to our (current) preferences. If the requirements change, this will just change with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the code above would be valid. I actually would prefer providing a default if possible over errors. Would renaming new to with_shell (and giving up the new method) resolve your concerns?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, but I just realized there's some additional complexity because there's both bin_name and util_name and I need to think for a bit about how we want those to interact with the shell. There are some fundamental problems with doing this platform-independent shell API and how it interacts with the tests binary.

@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from 22b18ed to efb1ad5 Compare February 18, 2023 17:46
@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/rm/rm2. tests/rm/rm2 is passing on 'main'. Maybe you have to rebase?

Summary of changes in UCommand:
* Extend UCommand by builder methods and simplify methods in TestScenario
* Simplify code structures where possible. Add documentation.
* Store bin_path as PathBuf and util_name as String in all structs
* Remove UCommand::util and make bin_path, temp_dir private
* Rename UCommand::with_limit -> UCommand::limit

Summary of changes in TestScenario:
* Rename some parameters in TestScenario methods to be more descriptive
* Remove ucmd_keepenv, cmd_keepenv from TestScenario. Use UCommand::keep_env instead.
@Joining7943 Joining7943 force-pushed the tests-util-refactor-ucommand-add-run-in-shell branch from efb1ad5 to 1c230fd Compare February 18, 2023 22:38
@github-actions
Copy link

GNU testsuite comparison:

GNU test failed: tests/rm/rm2. tests/rm/rm2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/tail-2/inotify-dir-recreate. tests/tail-2/inotify-dir-recreate is passing on 'main'. Maybe you have to rebase?

/// Spawns the command, feeds the stdin if any, and returns the
/// child process immediately.
pub fn run_no_wait(&mut self) -> UChild {
assert!(!self.has_run, "{}", ALREADY_RUN);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a wrapper now that builds a Command on every run, we don't need to check this anymore do we? That's kind of a cool side effect of doing it this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we do need to check this because currently build mutates self, but it doesn't have to! We can write it in a way that it doesn't do that and then we can make reusable commands, which could be nice in some cases where we test many small variations on one command.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thought about it, too. There's a problem however: Stdio is not cloneable, so every Stdio field we would set explicitely with set_stdin, set_stdout and set_stderr would be gone after the first usage. I'm not sure if this behavior is obvious enough. Another possibility to achieve almost the same would be to add a method like from_other(other: &UCommand, stdin: Option<Stdio>, stdout: ...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's a good point! Nevermind then :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the opposite approach then, could we make build consume self?

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Feb 19, 2023

To clarify some things that I think needs some thought:

  • The shell behaves very differently than other commands. Maybe it shouldn't take multiple arguments but just a single string?
  • The syntax for the shell differs per platform, so maybe the API should reflect that (i.e. have a unix_shell method and a windows_shell method, both only available on their respective platforms)
  • If I create a command with a util I want to test, I want it to automatically use the tests binary, so somehow the shell executable should maybe not replace the tests binary.

For this next step, the most important thing to figure out is the use case of this feature.

Now I do want something like this PR and I think we can use most of the changes you made. I just want to discuss the API further before we get into the implementation further if that makes sense. Also, sorry, I should have realized these things earlier so I could have brought them up earlier.

@Joining7943
Copy link
Contributor Author

I just want to discuss the API further before we get into the implementation further if that makes sense. Also, sorry, I should have realized these things earlier so I could have brought them up earlier.

Sure, no problem.

The shell behaves very differently than other commands. Maybe it shouldn't take multiple arguments but just a single string?

What about with_shell(cmd: ... )? The cmd is applied last to the other args. UCommand::arg and args are then the arguments for the shell itself, if needed. I don't think it's needed very often, but if so the functionality is there.

The syntax for the shell differs per platform, so maybe the API should reflect that (i.e. have a unix_shell method and a windows_shell method, both only available on their respective platforms)

I'm not sure if two different methods would help much around the problem, that we always have to differentiate platform dependent between those methods in the test itself. So, if I'm not wrong the code would alwasy look something like that

#[cfg(unix)]
let command = UCommand::with_shell("echo hello; echo world");
#[cfg(windows)]
let command = UCommand::with_shell("echo hello&echo world");

command.run();

Maybe a single `with_shell(unix_cmd: ..., win_cmd: ... ) method would work?

let command = UCommand::with_shell("echo hello; echo world", "echo hello&echo world").run();

If I create a command with a util I want to test, I want it to automatically use the tests binary, so somehow the shell executable should maybe not replace the tests binary.

Replacing the bin_path that way can't happen, since the shell path is only set if neither bin_path nor util_name was set.

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Feb 19, 2023

What about with_shell(cmd: ... )? The cmd is applied last to the other args. UCommand::arg and args are then the arguments for the shell itself, if needed. I don't think it's needed very often, but if so the functionality is there.

That's possible! Although maybe it should not be a UCommand at all? I could be a different UShell struct that also creates a UChild, because the requirements are so different.

I'm not sure if two different methods would help much around the problem, that we always have to differentiate platform dependent between those methods in the test itself. So, if I'm not wrong the code would always look something like that

Yes it would, but I want to enforce that at the API level. In the API that I proposed it would be:

#[cfg(unix)]
let command = UCommand::unix_shell("echo hello; echo world");
#[cfg(windows)]
let command = UCommand::windows_shell("echo hello&echo world");

And it would be a compile time error if you forget the cfg.

Maybe a single `with_shell(unix_cmd: ..., win_cmd: ... ) method would work?

Nice idea, but we often want to call this in a test for unix only and it doesn't scale well if we want to add other shells (e.g. zsh, fish, nushell, PowerShell, who knows)

Replacing the bin_path that way can't happen, since the shell path is only set if neither bin_path nor util_name was set.

Yes, with an important caveat: that's only true in your current implementation. That's why I want to look at the API first and the implementation after.

More importantly than any of the above, what use cases do you have in mind here?

@Joining7943
Copy link
Contributor Author

More importantly than any of the above, what use cases do you have in mind here?

The use cases are very rare. Maybe we should avoid overcomplicating things? I found 2 cases in tail. What use cases did you have in mind?

@tertsdiepraam
Copy link
Member

Maybe we should avoid overcomplicating things?

That's what I'm trying to do. I don't have particular cases in mind, that's why I'm asking. If it's really that rare that it would only apply to 2 cases in tail then why should we put it in UCommand instead of just making a function in the tail tests?

Maybe some thing like this:

fn shell(cmd: String) -> UCommand {
    if cfg!(unix) {
         UCommand::cmd("sh").arg("-c").arg(cmd)
    } else {
        UCommand::cmd("cmd").arg("/C").arg(cmd)
}

Note that even if it turns out that the shell thing is not useful, the surrounding changes are great on their own in my opinion. Things like the separate keep_env, using PathBuf and OsString more and only building the Command, so I definitely want to keep all of that!

@Joining7943
Copy link
Contributor Author

It started off that way, but you told me you wanted it in UCommand. The occassions where a shell is needed are pretty rare, but now that it's implemented it wouldn't do any harm to keep it. By the way, I found a nice shortcut for setting the arg platform dependent:

UCommand::with_shell()
    .arg(
       #[cfg(unix)]
        "echo hello; echo world",
       #[cfg(windows)]
        "echo hello&echo world",
    )
    .run();

I really like this :)

@tertsdiepraam
Copy link
Member

It started off that way, but you told me you wanted it in UCommand.

Oh yeah I see. Whoops 😄 Now I feel bad for pushing back. I suppose I didn't think about it properly back then. To be fair, though, you haven't made an issue to discuss this feature first. If you start out without the sunk-cost of an implementation, discussions like this one are much easier to have.

I do like it much better with the with_shell name and the Option is an implementation detail anyway.

By the way, I found a nice shortcut for setting the arg platform dependent:

Looks nice, but doesn't solve the issue I had with the API.

I'll think about this for a bit and get back to you.

Copy link
Member

@tertsdiepraam tertsdiepraam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've decided that this is indeed good enough and your time is probably much better spent on tail instead :) Sorry for the wild ride and being inconsistent. I really appreciate all the work you put into this. Thank you!

@tertsdiepraam tertsdiepraam merged commit 6bd42fd into uutils:main Feb 21, 2023
@Joining7943
Copy link
Contributor Author

Great! Thanks :)

@Joining7943 Joining7943 deleted the tests-util-refactor-ucommand-add-run-in-shell branch February 22, 2023 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants