Investigate memory-related unstable latencies in performance tests #11121
Closed
A list for me not to forget.
I've seen two main sources of unpredictable memory-related latency:
- Some pages we receive from the allocator may be newly allocated from the OS and still be copy-on-write references to the zero page. On the first write to such a page, a page fault occurs, and a new writable page is allocated and filled with zeros.
- Sometimes jemalloc decides that it's time to purge the arena and performs MADV_DONTNEED on some pages.
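Both effects can be observed outside of the server with a small experiment. The following is only a sketch, assuming Linux and a private anonymous mapping obtained directly with mmap instead of going through jemalloc; TouchTimings and measureTouchLatency are hypothetical names made up for this illustration. It writes one byte per page three times: on fresh pages, on already-resident pages, and again after MADV_DONTNEED.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <chrono>
#include <cstddef>

/// Hypothetical result type: time spent writing one byte to every page.
struct TouchTimings
{
    std::size_t pages;
    long long first_touch_ns;  /// pages never written before
    long long second_touch_ns; /// pages already resident
    long long after_purge_ns;  /// pages released with MADV_DONTNEED
};

static long long touchEveryPage(volatile char * data, std::size_t size, std::size_t page_size)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t offset = 0; offset < size; offset += page_size)
        data[offset] = 1;
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
}

/// Returns all zeros if the mapping cannot be created.
TouchTimings measureTouchLatency(std::size_t pages)
{
    const std::size_t page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t size = pages * page_size;

    void * mem = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return {0, 0, 0, 0};

    auto * data = static_cast<volatile char *>(mem);
    TouchTimings result{pages, 0, 0, 0};

    /// First write: each page has to be faulted in by the kernel.
    result.first_touch_ns = touchEveryPage(data, size, page_size);
    /// Second write: the pages are resident, so no faults are expected.
    result.second_touch_ns = touchEveryPage(data, size, page_size);

    /// Drop the pages the way an allocator purge would, then write again.
    madvise(mem, size, MADV_DONTNEED);
    result.after_purge_ns = touchEveryPage(data, size, page_size);

    munmap(mem, size);
    return result;
}
```

Calling measureTouchLatency with a few thousand pages and printing the three fields would typically show the first and third passes taking noticeably longer than the second, although the exact ratio depends on the kernel, transparent huge pages and system load.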
What can we do about this:
- Increase (or maybe decrease instead?) the decay time for arenas, governed by the arenas.dirty_decay_ms and arenas.muzzy_decay_ms parameters. Read the docs carefully, because it's not obvious what they influence. Tried to do this once, but ran into some weirdness with our cmake config that prevented me from using mallctl (USE_JEMALLOC not defined for Server.cpp). A sample snippet for Server::initialize:
+
+#if USE_JEMALLOC
+ /// jemalloc declares these options as ssize_t, and mallctl takes a non-const pointer to the new value.
+ ssize_t decay_time_ms = 10 * 60 * 1000;
+ logger().information("Will set jemalloc decay time to " + std::to_string(decay_time_ms) + " ms");
+
+ /// arenas.* only sets the default for arenas created later; existing arenas are controlled by arena.<i>.*.
+ int err = mallctl("arenas.dirty_decay_ms", nullptr, nullptr, &decay_time_ms, sizeof(decay_time_ms));
+ if (err != 0)
+ logger().error("Failed to set 'arenas.dirty_decay_ms' with code %d", err);
+
+ err = mallctl("arenas.muzzy_decay_ms", nullptr, nullptr, &decay_time_ms, sizeof(decay_time_ms));
+ if (err != 0)
+ logger().error("Failed to set 'arenas.muzzy_decay_ms' with code %d", err);
+#endif
+
- Decay the arenas explicitly using arena.<i>.decay with i = MALLCTL_ARENAS_ALL. This could be a hidden system query, e.g. system reset memory, and we could call it before we start test runs of each query from performance tests.
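A rough sketch of what the explicit decay call could look like. The jemalloc manual spells the control arena.<i>.decay and describes it as a void control, so nothing is read or written. To keep the sketch compilable even where USE_JEMALLOC is not defined, the mallctl function is passed in as a pointer; MallctlFn, arenaDecayName and decayArenas are hypothetical names for this illustration, not existing ClickHouse code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/// Same shape as jemalloc's mallctl(name, oldp, oldlenp, newp, newlen).
using MallctlFn = int (*)(const char *, void *, std::size_t *, void *, std::size_t);

/// Builds the control name, e.g. "arena.4096.decay" for index 4096.
std::string arenaDecayName(unsigned arena_index)
{
    return "arena." + std::to_string(arena_index) + ".decay";
}

/// Hypothetical helper: asks the allocator to purge unused pages right away.
/// Returns the error code of the mallctl call (0 on success).
int decayArenas(MallctlFn mallctl_fn, unsigned arena_index)
{
    assert(mallctl_fn != nullptr);
    const std::string name = arenaDecayName(arena_index);
    /// A void control: no old value is requested and no new value is supplied.
    return mallctl_fn(name.c_str(), nullptr, nullptr, nullptr, 0);
}
```

In a build that does include <jemalloc/jemalloc.h>, the call would presumably be decayArenas(mallctl, MALLCTL_ARENAS_ALL), or with a prefixed symbol such as je_mallctl if jemalloc was configured with a prefix. A hidden system query could then simply invoke this before each query's test runs.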