Skip to content

Fixed a bug in Util::mmap_fixed_noreplace#1197

Merged
xuyao0127 merged 1 commit intodmtcp:mainfrom
xuyao0127:fix_mmap_fixd_noreplace
Mar 17, 2025
Merged

Fixed a bug in Util::mmap_fixed_noreplace#1197
xuyao0127 merged 1 commit intodmtcp:mainfrom
xuyao0127:fix_mmap_fixd_noreplace

Conversation

@xuyao0127
Copy link
Copy Markdown
Collaborator

Util::mmap_fixed_noreplace emulates the mmap() with MAP_FIXED_NOREPLACE
flag for old systems. It uses a regular mmap() with the given address
and check if the return value is the same as the given address. However,
we found a case that the kernel returns a different address although the
requirest memory range is available. Therefore, the mmap_fixed_noreplace
function returns an error message EEXIST, which is not expectec.

Now we check if the requested memory range is available by calling mlock()
page by page. The mlock function will return -1 with an errono of ENOMEM.
If there's no already mapped pages in the request memory range, the function
will call mmap() with MAP_FIXED. Otherwise, the function will return
MAP_FAILED with an errno of EEXIST.

@xuyao0127 xuyao0127 added the bug label Mar 17, 2025
@xuyao0127 xuyao0127 requested review from gc00 and karya0 March 17, 2025 00:55
Copy link
Copy Markdown
Contributor

@gc00 gc00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving in advance. Please seem my inline comments.

// by calling mlock page by page. If the mlock fails and the error
// is ENOMEM, we know that this page is not mapped.
//
// This only works for private memory, not shared memory.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works -> Using mlock here only works

munlock(addr, len);
break;
} else {
fprintf(stderr, "DMTCP cannot map addr %p\n", addr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but we could add:

fprintf(stderr, "%s(%d): DMTCP cannot map addr %p\n", __FILE__, __LINE__, addr);

It save the user from grep'ing for this line.

} else if (addr2 != MAP_FAILED) {
// undo the mmap
munmap(addr2, len);
// We check if the memory range has any overlap with existing memory
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reformat this to the usual 80 columns, but let's add a comment like:

// The Linux kernel is observed to _not_ use the address hint even for mapping a single page, when the previous and next pages are mapped, but the current page address is unmapped

  Util::mmap_fixed_noreplace emulates the mmap() with MAP_FIXED_NOREPLACE
  flag for old systems. It uses a regular mmap() with the given address
  and check if the return value is the same as the given address. However,
  we found a case that the kernel returns a different address although the
  requirest memory range is available. Therefore, the mmap_fixed_noreplace
  function returns an error message EEXIST, which is not expectec.

  Now we check if the requested memory range is available by calling mlock()
  page by page. The mlock function will return -1 with an errono of ENOMEM.
  If there's no already mapped pages in the request memory range, the function
  will call mmap() with MAP_FIXED. Otherwise, the function will return
  MAP_FAILED with an errno of EEXIST.
@xuyao0127 xuyao0127 force-pushed the fix_mmap_fixd_noreplace branch from 7f9a5c4 to 43fdb32 Compare March 17, 2025 18:47
@xuyao0127 xuyao0127 merged commit 2cf6ed4 into dmtcp:main Mar 17, 2025
1 check passed
@xuyao0127 xuyao0127 deleted the fix_mmap_fixd_noreplace branch March 17, 2025 18:57
// This flag should force: 'addr == addr2' or 'addr2 == MAP_FAILED'
flags |= MAP_FIXED_NOREPLACE;
#endif
void *addr2 = mmap(addr, len, prot, flags, fd, offset);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this kernel issue happens rarely, can we try the previous route of using vanilla mmap and see if the kernel returns the correct address. If it doesn't return the correct address, we can go ahead and try the slow route of using mlocks on each page to see what's going on.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add this in another PR. This function is only used when creating the sharedDatatHeader for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants