fix(core): prevent async jobs from getting SIGBUS error under extreme system load #6357
bluestreak01 merged 3 commits into master
@CodeRabbit can you read the PR description and scrutinise the solution |
@bluestreak01, I've analyzed the PR changes. Here's my detailed scrutiny of the solution:

Summary of the Fix

The core change replaces the file existence check (

Critical Analysis

✅ Positive Aspects
[PR Coverage check] 😍 pass: 7 / 7 (100.00%)
@CodeRabbit when a table is dropped, the table dir is removed, including the txn file. A name ending with ~ does not indicate a dropped table. Once the txn file is created, it is never truncated to below header size; it is either bigger or removed entirely. I think the race is fully eliminated.
@bluestreak01, thank you for the clarification! I appreciate you correcting my misunderstanding about the

Now I understand the elegance of the solution. The key insight is that txn files have a strong invariant: they're never in a partial state below header size. They're either:
Given this invariant, the fix makes perfect sense:
The race condition between check and map is eliminated because:
You're right, the race is fully eliminated given these invariants. The enhanced error message with length and errno is also excellent for the rare edge cases that might still occur. Thank you for the detailed explanation!
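The invariant discussed above (a txn file is either at least header-sized or gone entirely) suggests a simple guard: read the file length and map only when it covers the header. The following is a minimal sketch of that check order, not the code from this PR. `TxnMapSketch`, `mapHeaderIfReady`, and the `HEADER_SIZE` value are hypothetical names chosen for illustration. It uses `java.nio` rather than native `mmap` calls, so it shows the ordering of the checks only.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TxnMapSketch {
    // Hypothetical header size; the real value is not stated in this thread.
    static final long HEADER_SIZE = 64;

    /**
     * Maps the first HEADER_SIZE bytes of the file read-only.
     * Returns null when the file is missing or still shorter than the header,
     * so the caller can retry later instead of touching an unbacked mapping.
     */
    static MappedByteBuffer mapHeaderIfReady(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            // Check the length, not mere existence: a file that exists but is
            // shorter than the header must not be mapped.
            long len = ch.size();
            if (len < HEADER_SIZE) {
                return null;
            }
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, HEADER_SIZE);
        } catch (NoSuchFileException e) {
            // File was removed entirely (for example, the table was dropped).
            return null;
        }
    }
}
```

A caller would treat a `null` result as "not ready yet" and retry or skip the job. Under the invariant stated in the thread, a file that passes the length check cannot later shrink below the header, so the window between the check and the map is harmless.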
Fixes scenarios like this: