Fixes memory leak when the job crashes before it's freed#3178
Fixes memory leak when the job crashes before it's freed#3178madolson merged 7 commits intovalkey-io:unstablefrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3178 +/- ##
================================
================================
🚀 New features to boost your workflow:
|
|
@madolson looks like valgrind is green with this PR: https://github.com/sarthakaggarwal97/valkey/actions/runs/21770227707 Do you have any feedbacks on this change on a high level? I know we discussed that this is a bit inefficient. |
Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Sarthak Aggarwal <[email protected]>
Signed-off-by: Sarthak Aggarwal <[email protected]>
94d1513 to
b992da4
Compare
Signed-off-by: Madelyn Olson <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
) If there is a crash between the time the job is popped and freed, we technically leak memory. This change allows us to peek, and pop just before we are about to free the job. Fixes valgrind errors: https://github.com/valkey-io/valkey/actions/runs/21969557125/job/63467641572#step:6:8648 --------- Signed-off-by: Sarthak Aggarwal <[email protected]>
|
@madolson Good to see that this change resolves the test issue. However, I'm not keen on the API change. It's common/typical for a mutex queue to support multiple readers. By adding a I'll look into alternatives. |
Another thing we considered was to have a global/static pointer which keeps track of the per-thread objects. So even when the thread is killed, there is still a global reference so valgrind is happy. I am indifferent to the new API, but we needed a fix for the valgrind issue. |
…3256) ## Problem The peek-then-pop pattern introduced in #3178 solved the valgrind false-positive leak report (https://github.com/valkey-io/valkey/actions/runs/21969557125/job/63467641572#step:6:8648) for in-flight BIO jobs, but `peek` is a problematic API on a mutexqueue: multiple readers can peek the same item, and one reader can pop what another peeked. ## Fix Instead, store the in-flight job pointer in `bio_worker_data.current_job` before processing and clear it after freeing. Since `bio_workers[]` is a static array, valgrind can always trace from global memory to the job allocation, even after the worker thread is cancelled. Removed `mutexQueuePeek` from mutexqueue.{c,h} and its test. ## Test Manually run Daily workflow: All four valgrind jobs are green: `test-valgrind-test` passed, `test-valgrind-misc` passed, `test-valgrind-no-malloc-usable-size-test` passed, `test-valgrind-no-malloc-usable-size-misc` passed. Signed-off-by: Alina Liu <[email protected]>
If there is a crash between the time the job is popped and freed, we technically leak memory. This change allows us to peek, and pop just before we are about to free the job.
Fixes valgrind errors: https://github.com/valkey-io/valkey/actions/runs/21969557125/job/63467641572#step:6:8648