Skip to content

Replace mutexQueuePeek with current_job field for BIO memcheck fix#3256

Merged
madolson merged 1 commit intovalkey-io:unstablefrom
asagege:alina_bio_fix
Feb 25, 2026
Merged

Replace mutexQueuePeek with current_job field for BIO memcheck fix#3256
madolson merged 1 commit intovalkey-io:unstablefrom
asagege:alina_bio_fix

Conversation

@asagege
Copy link
Contributor

@asagege asagege commented Feb 25, 2026

Problem

The peek-then-pop pattern introduced in #3178 solved the valgrind false-positive leak report (https://github.com/valkey-io/valkey/actions/runs/21969557125/job/63467641572#step:6:8648) for in-flight BIO jobs, but peek is a problematic API on a mutexqueue: multiple readers can peek the same item, and one reader can pop what another peeked.

Fix

Instead, store the in-flight job pointer in bio_worker_data.current_job before processing and clear it after freeing. Since bio_workers[] is a static array, valgrind can always trace from global memory to the job allocation, even after the worker thread is cancelled.

Removed mutexQueuePeek from mutexqueue.{c,h} and its test.

Test

Manually run Daily workflow: All four valgrind jobs are green: test-valgrind-test passed, test-valgrind-misc passed, test-valgrind-no-malloc-usable-size-test passed, test-valgrind-no-malloc-usable-size-misc passed.

Copy link
Member

@JimB123 JimB123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alina. This looks like a simpler solution and solves the problem in BIO where the unclean thread termination occurs. It removes the PEEK API from the mutexQueue as "peek" is awkward in MPMC situations.

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems ok, and one fewer API maintain.

@madolson madolson merged commit 39332bd into valkey-io:unstable Feb 25, 2026
57 checks passed
@codecov
Copy link

codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (c471364) to head (91ce382).
⚠️ Report is 1 commits behind head on unstable.

Additional details and impacted files
@@       Coverage Diff        @@
##   unstable   #3256   +/-   ##
================================
================================
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sarthakaggarwal97
Copy link
Contributor

LGTM thanks!

JimB123 pushed a commit that referenced this pull request Feb 28, 2026
Suppress valgrind for BIO jobs valgrind errors:
https://github.com/valkey-io/valkey/actions/runs/21969557125/job/63467641572#step:6:8648

## Changes

- Add `allocBioJob()` with `__attribute__((noinline))` as a centralized
allocation function for all BIO jobs. The `noinline` attribute ensures
it appears as a distinct frame in valgrind stack traces.
- Replace all direct `zmalloc` calls in `bioCreate*Job` functions
with`allocBioJob()`.
- Add a valgrind suppression in `src/valgrind.sup` matching definite
leaks with `allocBioJob` in the call stack.
- Remove `current_job` introduced in #3256

---------

Signed-off-by: Alina Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants