fix: add retry logic for file lock errors (EBUSY, EACCES, EPERM) #1
techfitmaster merged 5 commits into main
Handle file lock errors gracefully when .openclaw directory is synced via cloud storage services (OneDrive, Dropbox, Google Drive, etc.). Retry up to 3 times with exponential backoff before failing. Fixes openclaw#39446
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the stability of file operations by introducing robust retry mechanisms for common file lock errors, which frequently occur with cloud synchronization services. It also streamlines the MiniMax model integration by deprecating and removing an outdated model, ensuring consistency across the application and its documentation. Furthermore, the changes include refinements to the Docker build and testing environment, improving reliability and clarity for development workflows.
Code Review
This pull request introduces several improvements. The main change is the addition of retry logic with backoff for file write operations, which will gracefully handle file lock errors from cloud sync services. This is a solid improvement for robustness.
The PR also includes a number of other changes:
- Removal of the deprecated MiniMax-M2.5-Lightning model across the codebase.
- Refinements to the build scripts to control verbosity and reduce duplication.
- Updates to the Dockerfile and testing scripts to improve the containerized testing workflow.
The changes are well-implemented. I have a couple of suggestions for the retry logic in src/infra/json-files.ts to make it even more robust and align it with the description.
    lastError = err as Error;
    const errWithCode = err as { code?: string };
    if (attempt < MAX_RETRIES - 1 && FILE_LOCK_ERRORS.has(errWithCode.code ?? "")) {
      await new Promise((resolve) => setTimeout(resolve, RETRY_DELAY_MS * (attempt + 1)));
The pull request description mentions "exponential backoff", but this implementation uses a linear backoff (100ms, 200ms, etc.). For true exponential backoff, you could use Math.pow(2, attempt). This can be more effective if lock contention lasts longer than a few hundred milliseconds.
Suggested change:
    - await new Promise((resolve) => setTimeout(resolve, RETRY_DELAY_MS * (attempt + 1)));
    + await new Promise((resolve) => setTimeout(resolve, RETRY_DELAY_MS * (2 ** attempt)));
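To make the difference between the two schedules concrete, here is a small sketch. The 100 ms base delay is taken from the figures quoted in the comment above; the function names are hypothetical and not part of the PR.

```typescript
// Assumed base delay, matching the "100ms, 200ms" figures in the review comment.
const RETRY_DELAY_MS = 100;

// Linear backoff, as in the current implementation.
function linearDelay(attempt: number): number {
  return RETRY_DELAY_MS * (attempt + 1);
}

// Exponential backoff, as in the suggested change.
function exponentialDelay(attempt: number): number {
  return RETRY_DELAY_MS * 2 ** attempt;
}

// A wait happens only between attempts, so N attempts produce N - 1 delays.
function delaySchedule(
  delayFor: (attempt: number) => number,
  maxRetries: number,
): number[] {
  return Array.from({ length: Math.max(0, maxRetries - 1) }, (_, attempt) =>
    delayFor(attempt),
  );
}

console.log(delaySchedule(linearDelay, 5)); // [100, 200, 300, 400]
console.log(delaySchedule(exponentialDelay, 5)); // [100, 200, 400, 800]
```

Under these assumptions, three attempts produce only two waits, and both formulas give 100 ms and 200 ms. The schedules diverge from the third wait onward (300 ms vs. 400 ms), so the change matters mainly if the retry count is raised.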
    } finally {
      await fs.rm(tmp, { force: true }).catch(() => undefined);
    }
    throw lastError;
If MAX_RETRIES were configured to be 0, this line would be reached with lastError as undefined, causing an undefined value to be thrown. While MAX_RETRIES is currently 3, it's safer to handle this edge case to prevent future issues.
Suggested change:
    - throw lastError;
    + throw lastError ?? new Error(`Failed to write file after ${MAX_RETRIES} attempts.`);
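Putting both suggestions together, a retry loop could look roughly like the sketch below. It is a generic wrapper written for illustration only: `withFileLockRetry` and its parameters are hypothetical, and the actual control flow of `writeTextAtomic` is only partly visible in the excerpts above.

```typescript
// Error codes treated as transient file locks, as listed in the PR title.
const FILE_LOCK_ERRORS = new Set(["EBUSY", "EACCES", "EPERM"]);

// Hypothetical helper: runs `op`, retrying only on file lock errors.
async function withFileLockRetry<T>(
  op: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      const code = (err as { code?: string }).code ?? "";
      const canRetry = attempt < maxRetries - 1 && FILE_LOCK_ERRORS.has(code);
      // Non-lock errors and the final attempt fail immediately.
      if (!canRetry) break;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** attempt),
      );
    }
  }
  // With maxRetries <= 0 the loop never runs, so fall back to a real Error.
  throw lastError ?? new Error(`Failed after ${maxRetries} attempts.`);
}
```

A caller would wrap the write itself, for example `await withFileLockRetry(() => fs.writeFile(path, text))`. Breaking out of the loop on non-lock errors keeps unrelated failures such as `ENOENT` from being retried.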
Summary
Handle file lock errors gracefully when the .openclaw directory is synced via cloud storage services (OneDrive, Dropbox, Google Drive, Baidu Netdisk, etc.).
When cloud sync services lock files temporarily during upload/download, OpenClaw Gateway was crashing with unhandled EBUSY, EACCES, or EPERM errors.
Changes
- Add retry logic to the writeTextAtomic function in src/infra/json-files.ts
- Retry on EBUSY, EACCES, and EPERM errors
Testing
Related Issue
Fixes openclaw#39446
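For context on what is being retried: the `fs.rm(tmp, { force: true })` cleanup in the reviewed code suggests that `writeTextAtomic` writes to a temporary file and then renames it into place. The function body is not shown on this page, so the following is only a sketch of that general pattern under that assumption. `writeTextAtomicSketch`, the temp-file naming, and the `demo` helper are all hypothetical.

```typescript
import { promises as fs } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for writeTextAtomic; not the project's actual code.
async function writeTextAtomicSketch(
  filePath: string,
  content: string,
): Promise<void> {
  // Temp file sits next to the target so the rename stays on one filesystem.
  const tmp = `${filePath}.${process.pid}.${Date.now()}.tmp`;
  try {
    await fs.writeFile(tmp, content, "utf8");
    // Readers see either the old file or the complete new one.
    await fs.rename(tmp, filePath);
  } finally {
    // After a successful rename the temp file no longer exists;
    // force: true makes that case a no-op.
    await fs.rm(tmp, { force: true }).catch(() => undefined);
  }
}

// Small usage example against a throwaway directory.
async function demo(): Promise<string> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "atomic-write-"));
  const target = path.join(dir, "state.json");
  await writeTextAtomicSketch(target, JSON.stringify({ ok: true }));
  return fs.readFile(target, "utf8");
}
```

If a sync client holds a lock on the target or the temp file, the write or rename step is presumably where `EBUSY`, `EACCES`, or `EPERM` would surface, which is why the retry wraps this sequence.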