Skip to content

Conversation

@Qubasa
Copy link
Contributor

@Qubasa Qubasa commented Sep 10, 2025

Filesystem operation can return EAGAIN to tell the applicaiton to try again later.

This PR implements a retry logic to filesystem calls with an exponential backoff. Meaning if a fs. function failed with EAGAIN it will retry up to 15 times with increasing wait times inbetween, the maximum wait time being 300 miliseconds.

This error seems to be raised often on ZFS.

This PR fixes

I successfully tested this PR on ZFS with Nix for the NuschtOS project https://github.com/NuschtOS/search

@Qubasa Qubasa requested a review from zkochan as a code owner September 10, 2025 18:52
@welcome
Copy link

welcome bot commented Sep 10, 2025

💖 Thanks for opening this pull request! 💖
Please be patient and we will get back to you as soon as we can.

@Qubasa Qubasa force-pushed the eagain_fix branch 2 times, most recently from df6f811 to b74a8b0 Compare September 11, 2025 13:32
@Qubasa Qubasa changed the title fix: retry after fs.copyFileSync fails with EAGAIN error. fix: retry filesystem operations on EAGAIN Sep 11, 2025
@zkochan
Copy link
Member

zkochan commented Sep 11, 2025

Doesn't https://www.npmjs.com/package/graceful-fs already do the retries? Which we use for these functions.

@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

@zkochan I just looked into the libraries source code https://github.com/isaacs/node-graceful-fs/blob/main/graceful-fs.js
And there are two issues with that library.

the first one being it only patches the async versions of the fs.copyFile functions as far as I can see, and not the fs.copyFileSync versions
and the second one is that it doesn't handle EAGAIN.

@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

@zkochan Could you help me out in fixing the CI? I can't see the error messages causing the failure and I don't seem to be able to run the tests locally on a clean branch without failing. I tried running pnpm run test-branch

@zkochan
Copy link
Member

zkochan commented Sep 11, 2025

What about fs-extra? Does that retry it?

@zkochan
Copy link
Member

zkochan commented Sep 11, 2025

The tests are failing in @pnpm/fs.indexed-pkg-importer, which has a lot of mocking. You probably need to update the mocks of some fs functions.

@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

@zkochan No sadly fs-extra does not handle EAGAIN :(

@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

The tests are failing in @pnpm/fs.indexed-pkg-importer, which has a lot of mocking. You probably need to update the mocks of some fs functions.

How can I run these specific tests?

@zkochan
Copy link
Member

zkochan commented Sep 11, 2025

pnpm -F @pnpm/fs.indexed-pkg-importer test

@Qubasa Qubasa force-pushed the eagain_fix branch 2 times, most recently from 2b4c4dc to 3c35875 Compare September 11, 2025 17:03
@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

@zkochan Thanks! The tests should pass now!

@Qubasa Qubasa force-pushed the eagain_fix branch 2 times, most recently from d0d259a to 6f56fa0 Compare September 11, 2025 22:58
@Qubasa
Copy link
Contributor Author

Qubasa commented Sep 11, 2025

@zkochan Wohps my bad, there was type lint ignore that got removed, but now it should pass ;)

Comment on lines 8 to 10
const linkSyncWithRetry = withEagainRetry(fs.linkSync)
const copyFileWithRetry = withEagainRetry(fs.copyFileSync)
const mkdirSyncWithRetry = withEagainRetry(fs.mkdirSync)
Copy link
Member

@zkochan zkochan Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just moving these to graceful-fs exporting the functions with retries by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zkochan Good point, changed it accordingly.

filesystem operations can raise EAGAIN to tell the application to try
again later. This is especially often the case under ZFS.

fix: move wrapped functions to graceful-fs directly
export const mkdirSyncWithRetry = withEagainRetry(fs.mkdirSync)
export const writeFileWithRetry = withEagainRetry(fs.writeFileSync)
export const linkSyncWithRetry = withEagainRetry(fs.linkSync)
export const copyFileWithRetry = withEagainRetry(fs.copyFileSync)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not

Suggested change
export const copyFileWithRetry = withEagainRetry(fs.copyFileSync)
export const copyFileWithRetry = withEagainRetry(gfs.copyFileSync)

which has some retries as well, which may be useful


export default { // eslint-disable-line
copyFile: promisify(gfs.copyFile),
copyFileSync: gfs.copyFileSync,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just replace this. No need to have copyFileWithRetry separately.

@zkochan zkochan merged commit 9b9faa5 into pnpm:main Sep 29, 2025
10 checks passed
@welcome
Copy link

welcome bot commented Sep 29, 2025

Congrats on merging your first pull request! 🎉🎉🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants