Fixed readdmtcp.sh and minor cleanup for restore buf handling.#1188
Fixed readdmtcp.sh and minor cleanup for restore buf handling.#1188
Conversation
gc00
left a comment
There was a problem hiding this comment.
I'm worried aboutt the second commit. It says Minor cleanup for restoreBuf handling.. But it seems liek we're removing various comments that would help other developers. Usually, I don't anticipate that a "Minor cleanup" should remove documentation, instead of updaating it to the most recent code. :-) (joke)
Could you update the comments? If we're no longer worried about overlap leaving us no place for a restorebuf hole on restart, ccould you document why this will never be a problem? Even better, can you also add a JASSERT to check for any implicit assumptions that you're making (along with comments about why)? Thanks.
| // FIXME: ProcessInfo::init() sets len to RESTORE_TOTAL_SIZE. This | ||
| // then sets it to _restoreBufLen. But ProcessInfo::init() had | ||
| // previously set _restoreBufLen to RESTORE_TOTAL_SIZE. So, we | ||
| // have now made the round trip. This is spaghetti code. :-( |
There was a problem hiding this comment.
I'm not sure why you removed the comment '// NOTE: This note explains ...`
At the very least, this comment should be useful:
// See the comment inside writeckpt.cpp:mtcp_writememoryareas() for
// the purpose of the "restoreBuf" (aka "holebase").
But in addition, isn't the point about mmap'ing 3*_restoreBufLen still relevant?
If you could guarantee that the restoreBuf is always at most 4 KB (one page), then maybe we wouldn't need the 3*_restoreBufLen . But then if were assuming that the restoreBuf is 4KB, then we need to do a JASSERT to verify this fact. Further, whatever code we use, will this be valid for HUGEPAGES?
Whatever we decide, it seems like we need to docuent what we're doing here, for future maintainability.
There was a problem hiding this comment.
I'm not sure why you removed the comment '// NOTE: This note explains ...` At the very least, this comment should be useful:
// See the comment inside writeckpt.cpp:mtcp_writememoryareas() for // the purpose of the "restoreBuf" (aka "holebase").But in addition, isn't the point about mmap'ing
3*_restoreBufLenstill relevant? If you could guarantee that the restoreBuf is always at most 4 KB (one page), then maybe we wouldn't need the3*_restoreBufLen. But then if were assuming that the restoreBuf is 4KB, then we need to do a JASSERT to verify this fact. Further, whatever code we use, will this be valid for HUGEPAGES?Whatever we decide, it seems like we need to docuent what we're doing here, for future maintainability.
The comment is now inside mtcp_restart but I can add a reference to it here.
| * restart() is heavily overloaded, but assuming it's only relevant | ||
| * to processinfo.cpp, we have it used in: ProcessInfo::restart() | ||
| * and 'case DMTCP_EVENT_RESTART' in processInfo_EventHook(). | ||
| * |
There was a problem hiding this comment.
And this is the related comment. See my above inline comment.
Conceptually, we have non-local code. Data structures must be maintained consistently between writeckpt.cpp and mtcp_restart.c So, we need to document the consistency requirements in both places. Otherwise, new developers have a hard time to understand the code.
There was a problem hiding this comment.
I couldn't agree more. The three multiplier was mtcp-specific logic that we split across processInfo and write-ckpt but I have now moved it to mtcp_restart. We can probably cleanup some more of writeckpt logic, but we can defer it for now.
| @@ -190,14 +184,7 @@ mtcp_writememoryareas(int fd) | |||
| // will invoke mmap if the JAlloc arena is full. Similarly, for STL objects | |||
| // such as vector and string. | |||
|
|
|||
There was a problem hiding this comment.
And I have a similar worry about removing this comment, and not documenting our consistency/design requirements between writeckpt.cpp and mtcp_restart.c.
|
I agree it looks bigger than a minor cleanup, at least from the number of lines. However, the majority of comments are not removed, they are just moved to mtcp_restart. We are keeping all the logic related to bigger size in mtcp_restart.c only. The ProcessInfo/WriteCkpt locations are now back to using a single size that's defined in dmtcp.h, which in turn is also used by mtcp_restart. Effectively, we are doing the following:
In a way, this PR consolidates the restore-buf size handling code by putting it all inside mtcp_restart. |
gc00
left a comment
There was a problem hiding this comment.
LGTM.
Thanks for pointing out that the comment about three times the restoreBuf is still there -- just in mtcp_restart.c now. That works. And as you point out, the writeckpt.cpp code initially just creates an arbitrary buffer size, and mtcp_restart.c is the one that views it as three separate buffers, in case of overlap.
Thanks.
PR #1144 broke readdmtcp.sh working which is now fixed in this PR.
Also cleaned up a bit of logic around restore buffer handling.