Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fs::remove_dir_all: treat internal ENOENT as success #127623

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

lolbinarycat
Copy link
Contributor

@lolbinarycat lolbinarycat commented Jul 11, 2024

fixes #127576

try-job: test-various

@rustbot
Copy link
Collaborator

rustbot commented Jul 11, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cuviper (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 11, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@tbu-
Copy link
Contributor

tbu- commented Jul 12, 2024

I think there's also the fallback implementation which also needs this treatment.

@tgross35
Copy link
Contributor

This changes behavior of a user-visible interface, and has an ACP open at rust-lang/libs-team#410.

@rustbot label +S-waiting-on-ACP +T-libs-api -T-libs

@rustbot rustbot added S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 12, 2024
@cuviper
Copy link
Member

cuviper commented Jul 12, 2024

r? libs-api

@rustbot rustbot assigned m-ou-se and unassigned cuviper Jul 12, 2024
@tgross35
Copy link
Contributor

🤔 does libs-api not go through triagebot.toml? Mara removed herself from the libs rotation recently.

r? libs-api

@rustbot rustbot assigned Amanieu and unassigned m-ou-se Jul 12, 2024
@Amanieu
Copy link
Member

Amanieu commented Jul 12, 2024

🤔 does libs-api not go through triagebot.toml? Mara removed herself from the libs rotation recently.

libs-api uses the team member list directly. triagebot.toml is only used for the initial assignment.

@tgross35
Copy link
Contributor

I stand corrected about the ACP, that is for a superset of this change so this PR alone shouldn't be blocked.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 12, 2024
@rustbot
Copy link
Collaborator

rustbot commented Jul 13, 2024

This PR modifies tests/run-make/. If this PR is trying to port a Makefile
run-make test to use rmake.rs, please update the
run-make port tracking issue
so we can track our progress. You can either modify the tracking issue
directly, or you can comment on the tracking issue and link this PR.

cc @jieyouxu

@rust-log-analyzer

This comment has been minimized.

Comment on lines +2116 to +2094
fn is_enoent(result: &io::Result<()>) -> bool {
if let Err(err) = result
&& matches!(err.raw_os_error(), Some(libc::ENOENT))
{
true
} else {
false
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn is_enoent(result: &io::Result<()>) -> bool {
if let Err(err) = result
&& matches!(err.raw_os_error(), Some(libc::ENOENT))
{
true
} else {
false
}
}
fn ignore_enoent(result: io::Result<()>) -> io::Result<()> {
if let Err(err) = &result && err.raw_os_error() == Some(libc::ENOENT) {
Ok(())
} else {
result
}
}

With this function, you can simply wrap the relevant function calls in ignore_enoent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems clever, but it doesn't work for a few reasons:

  1. inside the loop, we need to continue on ENOENT, not ignore it
  2. there are lots of function calls, and almost all of them need to ignore ENOENT. if someone adds a new unlinkat call in the future, they will probably also need to wrap it, and if they don't, it will introduce a sneaky regression.

using try blocks doesn't have either of these issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. inside the loop, we need to continue on ENOENT, not ignore it

As far as I can tell, this actually works with the function. It'd treat removal failed with ENOENT the same as successful deletion and continue into the next loop.

2. there are lots of function calls, and almost all of them need to ignore ENOENT. if someone adds a new unlinkat call in the future, they will probably also need to wrap it, and if they don't, it will introduce a sneaky regression.

I actually feel the opposite way, it's better if we annotate all the places where we treat ENOENT as not an error rather than treating all possible ENOENT in this function as non-errors.

E.g. the recursive call does not need to ignore ENOENT as far as I can tell, but using try blocks will silence any errors that come from that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. the recursive call does not need to ignore ENOENT as far as I can tell, but using try blocks will silence any errors that come from that.

actually, maybe we should ignore the error in the recursive call, but not ignore any ENOENT errors outside of the for loop, as top-level ENOENT errors should be returned (currently this happens in remove_dir_all_modern).

false
}
}

fn remove_dir_all_recursive(parent_fd: Option<RawFd>, path: &CStr) -> io::Result<()> {
// try opening as directory
let fd = match openat_nofollow_dironly(parent_fd, &path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let fd = match openat_nofollow_dironly(parent_fd, &path) {
let fd = match ignore_enoent(openat_nofollow_dironly(parent_fd, &path)) {

remove_dir_all_recursive(Some(fd), child_name)?;
}
Some(false) => {
cvt(unsafe { unlinkat(fd, child_name.as_ptr(), 0) })?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

                    ignore_enoent(cvt(unsafe { unlinkat(fd, child_name.as_ptr(), 0) }))?;

@@ -0,0 +1,37 @@
use run_make_support::{
Copy link
Member

@jieyouxu jieyouxu Jul 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test slightly concerns me, as in is this a non-deterministic test that usually fails or usually passes, but we can't guarantee it to be reproducible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also need to run through various try jobs, including various cross-compilation scenarios.

Is this test meant to be reproducible across each platform? Are macOS/various linux/Windows behavior different with respect to remove_dir_all?

Also note that some of the CI runners are ran as root, which may cause behavioral differences versus running as non-root. (As in, even if it passes in CI does not guarantee it passes locally as non-root...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to test this reliably it'd require a much larger directory structure and with many more threads. Even then it may not be 100% reliable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we make the test platform specific we may be able to do some LD_PRELOAD fault injection? not sure if that's better^^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think that's better, lol.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @jieyouxu is suggesting that it may be better to have no test than a potentially flaky test? Because flaky tests can be hard to pin down and diagnose.

To be clear, I'm not advocating to have no tests, I'm just uneasy whenever I see a potentially flakey test that may potentially non-deterministically fail (whether in CI or locally). I'm not going to block this PR on this (I'm not a T-libs reviewer anyway) especially since I don't know of a better alternative way to deterministic test this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also somewhat uncomfortable with this test. I would prefer if the test didn't fail if it couldn't manage to cause a race.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Amanieu is there another status a run-make test can have other than pass or fail? i would like to alert people if the test breaks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of. Either the test passes or it fails and blocks CI.

Copy link
Member

@jieyouxu jieyouxu Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also somewhat uncomfortable with this test. I would prefer if the test didn't fail if it couldn't manage to cause a race.

+1 to this. If this test absolutely has to be flaky and non-deterministic and has no other choice, then I would prefer it at least not fail if we couldn't manage to cause a race, because otherwise it can randomly block unrelated PRs on PR CI or full CI. It's still not ideal because then if this regresses, a PR-local try-job might pass, and so might its full CI test run, but then start failing in completely unrelated PRs.

Is there another status a run-make test can have other than pass or fail? i would like to alert people if the test breaks.

One possible idea is to make this a separate test suite called something like run-make-no-blocking which is exercised in a separate non-blocking CI job. This would require some significant testing infra support in terms of bootstrap and compiletest (something like a tool state check job?)... but then the feedback loop can be quite long. And you'd have to gather consensus from T-infra, which I don't think really makes sense for just this one test.

@lolbinarycat
Copy link
Contributor Author

@rustbot review

bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 20, 2024
Rollup of 6 pull requests

Successful merges:

 - rust-lang#127623 (fix: fs::remove_dir_all: treat internal ENOENT as success)
 - rust-lang#128627 (Special case DUMMY_SP to emit line 0/column 0 locations on DWARF platforms.)
 - rust-lang#129187 (bootstrap: fix clean's remove_dir_all implementation)
 - rust-lang#129190 (Added f16 and f128 to tests/ui/consts/const-float-bits-conv.rs)
 - rust-lang#129231 (improve submodule updates)
 - rust-lang#129257 (Allow rust staticlib to work with MSVC's /WHOLEARCHIVE)

r? `@ghost`
`@rustbot` modify labels: rollup
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Aug 20, 2024
…=Amanieu

fix: fs::remove_dir_all: treat internal ENOENT as success

fixes rust-lang#127576

try-job: test-various
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 20, 2024
Rollup of 6 pull requests

Successful merges:

 - rust-lang#127623 (fix: fs::remove_dir_all: treat internal ENOENT as success)
 - rust-lang#128627 (Special case DUMMY_SP to emit line 0/column 0 locations on DWARF platforms.)
 - rust-lang#129187 (bootstrap: fix clean's remove_dir_all implementation)
 - rust-lang#129190 (Added f16 and f128 to tests/ui/consts/const-float-bits-conv.rs)
 - rust-lang#129231 (improve submodule updates)
 - rust-lang#129284 (rustdoc: animate the `:target` highlight)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 20, 2024
Rollup of 6 pull requests

Successful merges:

 - rust-lang#127623 (fix: fs::remove_dir_all: treat internal ENOENT as success)
 - rust-lang#128627 (Special case DUMMY_SP to emit line 0/column 0 locations on DWARF platforms.)
 - rust-lang#129187 (bootstrap: fix clean's remove_dir_all implementation)
 - rust-lang#129190 (Added f16 and f128 to tests/ui/consts/const-float-bits-conv.rs)
 - rust-lang#129231 (improve submodule updates)
 - rust-lang#129284 (rustdoc: animate the `:target` highlight)

r? `@ghost`
`@rustbot` modify labels: rollup
@jieyouxu
Copy link
Member

Failed in #129304 (comment)

command: "C:\\a\\rust\\rust\\build\\x86_64-pc-windows-msvc\\test\\run-make\\remove-dir-all-race\\rmake.exe"
--- stderr -------------------------------
thread '<unnamed>' panicked at C:\a\rust\rust\tests\run-make\remove-dir-all-race\rmake.rs:20:45:
called `Result::unwrap()` on an `Err` value: Os { code: 5, kind: PermissionDenied, message: "Access is denied." }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Not sure if it's due to this PR or if it's due to some compiletest/bootstrap remove_dir_all issues.

@jieyouxu
Copy link
Member

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Aug 20, 2024
@ChrisDenton
Copy link
Member

ChrisDenton commented Aug 20, 2024

Not sure if it's due to this PR or if it's due to some compiletest/bootstrap remove_dir_all issues.

This is likely to be a pre-existing issue with the remove_dir_all impl. It was designed to be safe against concurrent access but it's not necessarily fully robust. I can look into but I'd suggest marking the test as ignore windows for now and merging because I suspect the fix will be non-trivial.

fixes rust-lang#127576

windows implementation still needs some work
@lolbinarycat
Copy link
Contributor Author

@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 22, 2024
@tgross35
Copy link
Contributor

@bors r=Amanieu

@bors
Copy link
Collaborator

bors commented Aug 22, 2024

📌 Commit 736f773 has been approved by Amanieu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 22, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 23, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#127623 (fix: fs::remove_dir_all: treat internal ENOENT as success)
 - rust-lang#128876 (Ship MinGW-w64 runtime DLLs along with `rust-lld.exe` for `-pc-windows-gnu` targets)
 - rust-lang#129055 (Migrate `x86_64-fortanix-unknown-sgx-lvi` `run-make` test to rmake)
 - rust-lang#129386 (Use a LocalDefId in ResolvedArg.)
 - rust-lang#129400 (Update `compiler_builtins` to `0.1.120`)
 - rust-lang#129414 (Fix extern crates not being hidden with `doc(hidden)`)
 - rust-lang#129417 (Don't trigger refinement lint if predicates reference errors)
 - rust-lang#129433 (Fix a missing import in a doc in run-make-support)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 23, 2024
…iaskrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#127623 (fix: fs::remove_dir_all: treat internal ENOENT as success)
 - rust-lang#128876 (Ship MinGW-w64 runtime DLLs along with `rust-lld.exe` for `-pc-windows-gnu` targets)
 - rust-lang#129055 (Migrate `x86_64-fortanix-unknown-sgx-lvi` `run-make` test to rmake)
 - rust-lang#129386 (Use a LocalDefId in ResolvedArg.)
 - rust-lang#129400 (Update `compiler_builtins` to `0.1.120`)
 - rust-lang#129414 (Fix extern crates not being hidden with `doc(hidden)`)
 - rust-lang#129417 (Don't trigger refinement lint if predicates reference errors)
 - rust-lang#129433 (Fix a missing import in a doc in run-make-support)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 370b326 into rust-lang:master Aug 23, 2024
6 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Aug 23, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Aug 23, 2024
Rollup merge of rust-lang#127623 - lolbinarycat:fix_remove_dir_all, r=Amanieu

fix: fs::remove_dir_all: treat internal ENOENT as success

fixes rust-lang#127576

try-job: test-various
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-run-make Area: port run-make Makefiles to rmake.rs O-solid Operating System: SOLID O-unix Operating system: Unix-like O-wasi Operating system: Wasi, Webassembly System Interface O-windows Operating system: Windows S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

std::fs::remove_dir_all fails if any of the intermediate file deletions fails with ENOENT