Skip to content

Use double_fork() for gzip compression#1223

Merged
gc00 merged 4 commits intodmtcp:mainfrom
gc00:double-fork
Oct 28, 2025
Merged

Use double_fork() for gzip compression#1223
gc00 merged 4 commits intodmtcp:mainfrom
gc00:double-fork

Conversation

@gc00
Copy link
Copy Markdown
Contributor

@gc00 gc00 commented Oct 17, 2025

This is required for forked checkpointing. Previously, we forked a gzip, and did a waitpid for it. So, the parent process was stuck in waitpid while the forked process writing to a ckpt image.

By using a new function double_fork(), the code is simpler to follow, 35 lines shorter, and it fixes the waitpid bug for forked checkpointing.

And then the forked checkpointing code uses double_fork() in a later commit. This cleans up the code.

COMMITS:
4fa3ade: If FORKED_CKPT, grandchild renames .tmp to .dmtcp
     (The parent process in dmtcpworker.cpp was renaming, even while the grandchild was still saving the ckpt.)
b15389a: Use double_fork() for gzip compression
f97b275: Use double_fork() if FORKED_CHECKPOINTING
     (Note that this commit has the original process send the temp ckpt filename (*.dmtcp.temp) to the coordinator, and the grandchild currently cannot send the real filename (*.dmtcp) because the coordinator doesn't recognize it as part of the original computation group. An alternative would be for the original process to send the real filename (*.dmtcp) even while the grandchild process continues to append to that file.)
8c53751: Remove some compiler warnings

This is now ready for review. @karya0, I'll trade you a review of your PR for a review of this PR. :-)

Summary by CodeRabbit

  • Refactor

    • Reworked checkpointing internals for more robust fork-based checkpoint handling and signal management.
    • Internal cleanup of file-descriptor iteration and filesystem path handling for steadier behavior.
  • Bug Fixes

    • Corrected checkpoint filename handling when forked checkpointing is enabled so final filenames are delivered consistently.
  • Chores

    • Removed unused local variables in a test to simplify code.

@gc00 gc00 added the bug label Oct 17, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Oct 17, 2025

Walkthrough

Refactored forked checkpointing to use a type-safe enum and a new double_fork() helper, reorganized SIGCHLD handling and delivery, updated DmtcpWorker::postCheckpoint to support deferred rename under forked mode, and made several local variable type adjustments for consistency.

Changes

Cohort / File(s) Summary
Forked checkpoint status refactor
src/ckptserializer.cpp
Added forked_rc_t enum (FORKED_CKPT_FAILED, FORKED_CKPT_PARENT, FORKED_CKPT_GRANDCHILD); replaced int-based fork status with enum; added double_fork() to encapsulate two-stage fork; renamed and simplified SIGCHLD handler functions (trivial_sigchld_handler, set_trivial_sigchld_action); replaced restore/wait flow with allow_delivery_of_one_sigchld(); updated callers to use enum returns.
Coordinator filename handling for forked ckpt
src/dmtcpworker.cpp
In DmtcpWorker::postCheckpoint, added branch controlled by ENV_VAR_FORKED_CKPT to defer final rename in grandchild: swap temp/final names, send checkpoint filename, then restore original filename; non-forked path unchanged.
Filesystem type alignment
jalib/jfilesystem.cpp
Changed local variables len, lastSlash, and loop index types from size_t to int in Filesystem::DirName().
Loop variable type alignment
src/wrappers.cpp
Adjusted close_range loop variable to unsigned int to match vector element types.
Test cleanup
test/pthread6.c
Removed unused local declarations (pthread_t thread; int *arg;) from main.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant open_ckpt as open_ckpt_to_write()
    participant double_fork as double_fork()
    participant Parent
    participant Child
    participant Grandchild

    Caller->>open_ckpt: initiate fork-based checkpoint
    open_ckpt->>double_fork: request two-stage fork

    Note over double_fork: First fork -> creates Child
    double_fork->>Child: second fork

    alt Grandchild path
        double_fork-->>Grandchild: return FORKED_CKPT_GRANDCHILD
        Grandchild->>open_ckpt: perform checkpoint write/rename (grandchild responsibility)
    else Parent path
        double_fork-->>Parent: return FORKED_CKPT_PARENT
        Parent->>open_ckpt: continue, wait for SIGCHLD delivery handling
    else Failure
        double_fork-->>open_ckpt: return FORKED_CKPT_FAILED
        open_ckpt->>Caller: error handling
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • karya0
  • xuyao0127
  • aayushi363

Poem

🐰
I hopped through forks and signal streams,
Enums snugly tucked in my dreams.
A double-fork waltz, tidy and spry,
Filenames swapped as I scurry by —
Checkpoints safe beneath moonlit sky. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "Use double_fork() for gzip compression" is directly related to the main changes in the changeset. The changes introduce a new double_fork() function and refactor the forked checkpointing mechanism to use this function instead of the previous single-fork approach with blocking waitpid calls. The title specifically references the primary architectural change—the introduction and usage of double_fork()—which is implemented across multiple files (notably in ckptserializer.cpp and dmtcpworker.cpp). The title is clear, specific, and accurately captures the core objective described in the PR summary: resolving the waitpid blocking issue by using the new double_fork mechanism for compression-related operations.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c53751 and e171a44.

📒 Files selected for processing (3)
  • jalib/jfilesystem.cpp (1 hunks)
  • src/wrappers.cpp (1 hunks)
  • test/pthread6.c (0 hunks)
💤 Files with no reviewable changes (1)
  • test/pthread6.c
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/wrappers.cpp
  • jalib/jfilesystem.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gc00 gc00 force-pushed the double-fork branch 3 times, most recently from 1ac1617 to 8b26c5d Compare October 19, 2025 07:12
gc00 added 3 commits October 19, 2025 03:36
 * This fixes a bug, when using forked checkpointing.
   Previously, we forked a gzip, and did a waitpid for it.
   So, the parent process was stuck in waitpid while the
     forked process writing to a ckpt image.
@gc00 gc00 marked this pull request as ready for review October 20, 2025 03:53
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/ckptserializer.cpp (1)

406-417: After rename in GRANDCHILD, notify coordinator (currently missing).

Grandchild renames .tmp -> final, then _exit(0). To keep coordinator state consistent—especially if parent sent .tmp earlier—send the final filename here.

-    JASSERT(rename(ProcessInfo::instance().getTempCkptFilename().c_str(),
-                   ProcessInfo::instance().getCkptFilename().c_str()) == 0);
+    JASSERT(rename(ProcessInfo::instance().getTempCkptFilename().c_str(),
+                   ProcessInfo::instance().getCkptFilename().c_str()) == 0);
+    // Inform coordinator that final filename is now in place.
+    CoordinatorAPI::sendCkptFilename();
🧹 Nitpick comments (7)
src/wrappers.cpp (1)

404-407: Loop var type: prefer deduced or signed to match container element type.

vector<int> fds is iterated with unsigned int fd. Not wrong (FDs are non-negative), but it forces a cast and can confuse overload resolution (e.g., passing to functions expecting int). Use auto fd or int fd for clarity.

Apply:

-  for (unsigned int fd : fds) {
+  for (auto fd : fds) {  // fd is int
jalib/jfilesystem.cpp (1)

168-198: Avoid narrowing size_t to int for path lengths and indices.

strlen() returns size_t. Converting to int risks overflow on very long paths and mixes signed/unsigned in loop math. Use size_t (for length) with a signed index for the sentinel -1, or avoid manual scan via strrchr.

Apply:

-  int len = strlen(fname);
+  size_t len = strlen(fname);
@@
-  int lastSlash = -1;
-  for (int i = 0; i < len; i++) {
-    if (fname[i] == '/') {
-      lastSlash = i;
-    }
-  }
+  // Prefer libc scan to avoid manual index types
+  const char* slash = ::strrchr(fname, '/');
+  int lastSlash = slash ? static_cast<int>(slash - fname) : -1;
src/ckptserializer.cpp (5)

78-83: Initialize and document trivial SIGCHLD action.

Good encapsulation with set_trivial_sigchld_action. Consider SA_RESTART if you rely on restarting certain syscalls elsewhere, but current usage with sigsuspend() is fine.


92-108: sigsuspend waits for “a” SIGCHLD, not necessarily from our child.

If other children exist with pending SIGCHLD, sigsuspend() may return before our freshly forked child exits; waitpid(pid, 0, 0) could then block. Likely safe in ckpt thread, but please confirm invariant “no other children can deliver SIGCHLD here” or guard with a non-blocking loop:

-  _real_sigsuspend(&mask_all_but_sigchld);
+  do {
+    if (_real_waitpid(-1, NULL, WNOHANG) == pid) break;
+  } while (_real_sigsuspend(&mask_all_but_sigchld) == -1 && errno == EINTR);

109-143: double_fork(): restore handler in parent only; fine, but consider mask state in grandchild.

Parent restores SIGCHLD action; grandchild inherits blocked signal mask from ckpt thread. If compressor expects signals unmasked, consider unblocking in the execing path, or at least ensure gzip doesn’t rely on them. Otherwise, LGTM.


248-267: Compression pipe setup: consider O_CLOEXEC to reduce fd leaks.

When using pipe2, pass O_CLOEXEC to avoid descriptor leakage if future code adds more exec paths.

-    if (_real_pipe(pipe_fds) == -1) {
+    if (_real_syscall(SYS_pipe2, pipe_fds, O_CLOEXEC) == -1) {

289-345: Always using double_fork() for compressor: OK. Ensure LD_PRELOAD hygiene and fd discipline.

Exec path unsets LD_PRELOAD and dup2s stdin/stdout correctly. One nit: if fd <= STDERR_FILENO, closing conditionals skip it; safe. Consider fcntl(outfd, F_SETFD, FD_CLOEXEC) before exec as belt-and-suspenders; not required.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98ac51e and 8c53751.

📒 Files selected for processing (4)
  • jalib/jfilesystem.cpp (1 hunks)
  • src/ckptserializer.cpp (5 hunks)
  • src/dmtcpworker.cpp (1 hunks)
  • src/wrappers.cpp (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/dmtcpworker.cpp (3)
plugin/pathvirt/pathvirt.cpp (2)
  • rename (687-695)
  • rename (687-687)
src/processinfo.cpp (2)
  • instance (213-220)
  • instance (214-214)
src/coordinatorapi.cpp (2)
  • sendCkptFilename (724-753)
  • sendCkptFilename (725-725)
src/ckptserializer.cpp (3)
src/signalwrappers.cpp (1)
  • sigaction (153-153)
src/nosyscallsreal.c (2)
  • _real_waitpid (339-343)
  • _real_close (188-192)
src/processinfo.cpp (2)
  • instance (213-220)
  • instance (214-214)
🔇 Additional comments (2)
src/ckptserializer.cpp (2)

62-69: Enum-based forked status: good move.

Replacing raw ints with forked_rc_t improves readability and avoids magic numbers. No concerns.


272-284: Forked-ckpt gate: env checked here but compressor path always double-forks.

This is intentional per PR (double-fork for gzip regardless). No issues.

Comment on lines +493 to +509
// NOTE: If FORKED_CHECKPOINTING, the grandchild may still be checkpointing.
// Grandchild process will do renaming and inform coordinator.
if (getenv(ENV_VAR_FORKED_CKPT) == NULL) {
/* Now that temp checkpoint file is complete, rename it over old permanent
* checkpoint file. Uses rename() syscall, which doesn't change i-nodes.
* So, gzip process can continue to write to file even after renaming.
*/
JASSERT(rename(ProcessInfo::instance().getTempCkptFilename().c_str(),
ProcessInfo::instance().getCkptFilename().c_str()) == 0);
CoordinatorAPI::sendCkptFilename();
} else { // else FORKED CHECKPOINTING. Write temp name; grandchild not done.
string ckptFilename = ProcessInfo::instance().getCkptFilename();
ProcessInfo::instance().setCkptFilename(
ProcessInfo::instance().getTempCkptFilename().c_str());
CoordinatorAPI::sendCkptFilename();
ProcessInfo::instance().setCkptFilename(ckptFilename.c_str());
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Coordinator filename mismatch in forked mode (sends .tmp but never updated).

In the FORKED_CKPT branch, parent sends the temp filename to the coordinator, while the grandchild later renames to final .dmtcp. There is no subsequent update to the coordinator, so restart scripts will point to .tmp. Either: (A) keep sending the final name from the parent (recommended, rename happens later but inode stays the same), or (B) have the grandchild send the final name after rename.

Minimal fix (A):

-  } else { // else FORKED CHECKPOINTING.  Write temp name; grandchild not done.
-    string ckptFilename = ProcessInfo::instance().getCkptFilename();
-    ProcessInfo::instance().setCkptFilename(
-                       ProcessInfo::instance().getTempCkptFilename().c_str());
-    CoordinatorAPI::sendCkptFilename();
-    ProcessInfo::instance().setCkptFilename(ckptFilename.c_str());
-  }
+  } else {
+    // Forked mode: still send final filename; grandchild will rename when done.
+    CoordinatorAPI::sendCkptFilename();
+  }

If you prefer (B), ensure the grandchild calls CoordinatorAPI::sendCkptFilename() after the rename (see suggested change in ckptserializer.cpp).



---

🏁 Script executed:

```shell
#!/bin/bash
# Examine the current code in dmtcpworker.cpp around lines 493-509
sed -n '485,520p' src/dmtcpworker.cpp

Length of output: 1572


🏁 Script executed:

#!/bin/bash
# Search for where the grandchild process does the rename operation
rg -nP "rename.*getTempCkptFilename|getTempCkptFilename.*rename" --type=cpp -C5

Length of output: 1830


🏁 Script executed:

#!/bin/bash
# Search for FORKED_CKPT usage in ckptserializer.cpp
rg -nP "FORKED_CKPT|ENV_VAR_FORKED_CKPT" --type=cpp -C3 | head -80

Length of output: 4334


🏁 Script executed:

#!/bin/bash
# Find where CoordinatorAPI::sendCkptFilename is called
rg -nP "sendCkptFilename" --type=cpp -C2

Length of output: 1606


🏁 Script executed:

#!/bin/bash
# Search for grandchild process handling and file renaming logic
rg -nP "grandchild|GRAND_CHILD" --type=cpp -C3 | head -100

Length of output: 2099


The review comment is accurate—coordinator receives .tmp filename but file ends up at final .dmtcp name.

The verification confirms a real bug in forked checkpointing:

  1. Current behavior: Parent sends .tmp to coordinator (line 507), but grandchild renames the file to final .dmtcp (ckptserializer.cpp:411) without notifying coordinator, then exits immediately.

  2. Consequence: Restart scripts record .tmp but the actual checkpoint file exists at the final .dmtcp name—causing restore failures.

  3. Code comment inaccuracy: Lines 493–494 claim the grandchild will "inform coordinator," but ckptserializer.cpp:416 shows it calls _exit(0) without any coordinator communication.

  4. Fix A is safe: Sending the final filename from parent works because rename() preserves i-nodes; the grandchild's subsequent rename operation still succeeds, and the coordinator has the correct final name.

🤖 Prompt for AI Agents
In src/dmtcpworker.cpp around lines 493 to 509, the forked-checkpoint branch
currently sends the .tmp filename to the coordinator but the grandchild later
renames the file to the final .dmtcp name (and exits without notifying), causing
a mismatch; change the else branch so it sends the final checkpoint filename
(ProcessInfo::instance().getCkptFilename()) to the coordinator instead of the
temp name, restore any temporary modifications as needed, and update the comment
to reflect that the parent now sends the final name so coordinator and
filesystem agree.

Copy link
Copy Markdown
Collaborator

@xuyao0127 xuyao0127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current solution looks good enough to me. So I'm approving this PR. But I think we should maintain a consistent filename on disk and in the coordinator. I'm not sure if the filename reported to the coordinator will be used anywhere, but it may cause confusion when debugging in the future.

@gc00
Copy link
Copy Markdown
Contributor Author

gc00 commented Oct 28, 2025

I believe that DMTCP does not rely on whether the DMTCP coordinator thinks the file is *.dmtcp.temp or *.dmtcp. I think it's used only for informational purposes, and the human will be warned when they see *.dmtcp.temp, and hopefully will look if the file has been renamed yet.

I wish there were a clean way to tell the coordinator when the forked ckpt is complete, but anything I can imagine is ugly, because the grandchild cannot talk to the coordinator, and so it would have to signal the parent, or some other weird solution.

@gc00 gc00 merged commit faeeac6 into dmtcp:main Oct 28, 2025
2 checks passed
@gc00 gc00 deleted the double-fork branch October 28, 2025 17:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants