Fix for forked ckpt: commit 7e486722bdb broke it#1220
Conversation
WalkthroughA conditional compilation guard was added in DmtcpWorker::postCheckpoint() to wrap the Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant W as DmtcpWorker::postCheckpoint()
participant KV as kvdb
rect rgb(236, 248, 255)
note over W: Post-checkpoint metadata writes
alt FORKED_CHECKPOINTING not defined
W->>KV: set("ProcSelfMaps_Ckpt", ...)
else FORKED_CHECKPOINTING defined
note over W: Skip ProcSelfMaps_Ckpt write (compile-time)
end
W->>KV: set("ProcSelfMaps_JAllocArenas", ...)
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
karya0
left a comment
There was a problem hiding this comment.
Looks good.
However, please note that forked checkpoint won't work beyond trivial use cases. We don't have a way to support shared-memory segments that could get overwritten while ckpt is being written. Not that it's not possible to do, but the outcome isn't worth all the trouble.
@karya0 ,
Apparently, your addition of kvdb for DMTCP (commit 7e48672) had not considered the case of forked checkpointing.
I'm commenting out one line of that code when forked ckpt is being used.
Could you please check this, to see if you want to modify kvdb for this case, or just ignore it, since forked ckpt is less common?
To recall, in forked ckpt, the original process figures out open fd's, signal handlers, etc., and then forks a grandchild. The grandchild writes the memory areas to the ckpt_*.temp file. The original process then renames the *.temp file to *.dmtcp, but the grandchild is allowed to continue executing and write out the rest of the memory areas. The grandchild then exits when done.
@xuyao0127,
When we also have the enhancement for SOCK_SEQPACKET, let's add this commit, also, and do a new point release for DMTCP.
Thanks.
Summary by CodeRabbit
Bug Fixes
Documentation