Skip to content

[8.2] MOD-13602 Add queue time tracking to FT.PROFILE#8235

Merged
lerman25 merged 3 commits into8.2from
backport-8210-to-8.2
Feb 2, 2026
Merged

[8.2] MOD-13602 Add queue time tracking to FT.PROFILE#8235
lerman25 merged 3 commits into8.2from
backport-8210-to-8.2

Conversation

@ofiryanai
Copy link
Copy Markdown
Collaborator

@ofiryanai ofiryanai commented Feb 1, 2026

Current: FT.PROFILE's "Parsing time" incorrectly includes time spent waiting in thread pool queues, making it difficult to diagnose performance issues.

Change: Add separate "Workers queue time" and "Coordinator queue time" fields to FT.PROFILE output, and fix "Parsing time" to exclude queue wait time.

Outcome: Users can now accurately see how long queries wait in thread pool queues vs actual parsing/execution time.

Main objects modified

  • src/aggregate/aggregate.h / aggregate_exec.c - Workers queue time tracking
  • src/concurrent_ctx.h / module.c - Coordinator queue time tracking
  • src/profile.c - Profile output formatting

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Release Notes

  • This PR requires release notes
  • This PR does not require release notes

Note

Medium Risk
Changes the FT.PROFILE output schema (new fields and shifted RESP2 positions) and adjusts timing calculations across worker/coordinator execution paths, which can affect clients parsing profile responses and the accuracy of reported timings.

Overview
FT.PROFILE now reports queue wait separately from execution timing. It adds "Workers queue time" to shard profiles (captured when a background job starts) and adjusts "Parsing time" to exclude that wait.

In cluster mode, it also tracks and outputs "Coordinator queue time" by measuring time spent waiting in the coordinator thread pool before dispatch, and threads this value through request contexts so it appears in the coordinator profile section. Tests were updated and expanded to validate the new fields and ensure parsing time no longer includes queue delay.

Written by Cursor Bugbot for commit cabbf35. This will update automatically on new commits. Configure here.

@ofiryanai ofiryanai changed the base branch from master to 8.2 February 1, 2026 13:11
@jit-ci

This comment has been minimized.

@ofiryanai ofiryanai marked this pull request as draft February 1, 2026 13:33
@ofiryanai ofiryanai marked this pull request as ready for review February 1, 2026 13:33
@github-actions github-actions Bot added the size:M label Feb 2, 2026
@jit-ci

This comment has been minimized.

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.89%. Comparing base (be0ad27) to head (cabbf35).
⚠️ Report is 1 commits behind head on 8.2.

Additional details and impacted files
@@            Coverage Diff             @@
##              8.2    #8235      +/-   ##
==========================================
- Coverage   88.94%   88.89%   -0.06%     
==========================================
  Files         259      259              
  Lines       41934    41950      +16     
  Branches     3851     3851              
==========================================
- Hits        37297    37290       -7     
- Misses       4588     4611      +23     
  Partials       49       49              
Flag Coverage Δ
flow 82.44% <100.00%> (-0.07%) ⬇️
unit 46.80% <0.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ofiryanai ofiryanai requested a review from lerman25 February 2, 2026 10:11
@jit-ci

This comment has been minimized.

ofiryanai and others added 3 commits February 2, 2026 17:19
* Add validation tests for FT.PROFILE queue time bug

Add design document and validation tests that confirm the bug exists:
- testParsingTimeIncludesWorkersQueueTime_BUG: Confirms workers queue
  wait time is incorrectly included in 'Parsing time' (bug)
- testParsingTimeDoesNotIncludeCoordQueueTime: Confirms coordinator
  queue time is correctly separate from shard's Parsing time

Both tests pass, confirming the bug exists as described in the design doc.

* Implement Part 1: Workers queue time in FT.PROFILE

- Add profileQueueTime field to AREQ struct in aggregate.h
- Capture queue time at start of AREQ_Execute_Callback() in aggregate_exec.c
  - Queue time = elapsed time since initClock was set before enqueueing
  - Reset initClock after capturing queue time for accurate parsing time
- Print 'Workers queue time' in profile output in profile.c
- Add testWorkersQueueTimeInProfile test to verify the fix
- Mark testParsingTimeIncludesWorkersQueueTime_BUG as skipped (bug is fixed)

This fixes the bug where FT.PROFILE's 'Parsing time' incorrectly included
time spent waiting in the workers thread pool queue.

* Add coordinator queue time tracking to FT.PROFILE

Track time spent waiting in the coordinator thread pool queue and
report it as 'Coordinator queue time' in FT.PROFILE output.

Changes:
- Add coordQueueTime field to ConcurrentSearchHandlerCtx struct
- Add coordQueueTime field to searchRequestCtx struct
- Calculate queue time in DistSearchCommandHandler and DEBUG_DistSearchCommandHandler
- Copy queue time to searchRequestCtx in FlatSearchCommandHandler
- Print 'Coordinator queue time' in profileSearchReplyCoordinator

* Add test for coordinator queue time in FT.PROFILE

Add testCoordinatorQueueTimeInProfile to verify that coordinator queue
time is correctly captured in cluster mode when the coordinator thread
pool is paused.

* Remove design document (intermediate artifact)

* Fix tests for new Workers queue time and Coordinator queue time fields

* Address Cursor Bugbot review: remove obsolete test and unused env parameters

* Add AGENTS.local.md to .gitignore

* Fix test expectations for new Workers queue time field in FT.PROFILE output

* Remove redundant comments from queue time tracking code

* Fix test_RED_86036 index after adding Workers queue time to FT.PROFILE

* Fix dead code issue

* Revert unrelated changes to AGENTS.md and .gitignore

* CR- larger debug pauses, clock only on profile and different output order

* CR- dont init clock

(cherry picked from commit 7622a9e)
(cherry picked from commit 335414c)
The test was using hardcoded indices to access the profile output,
which broke after adding 'Workers queue time' field. Now uses
to_dict() to access 'Iterators profile' by key name, making the
test more robust to profile structure changes.
Backport the ConcurrentSearchHandlerCtx approach from 8.6 to pass
isProfile, coordStartTime, and coordQueueTime instead of adding
individual parameters to SearchCmdCtx or dispatched method signatures.
@ofiryanai ofiryanai force-pushed the backport-8210-to-8.2 branch from 5ee1df8 to cabbf35 Compare February 2, 2026 15:22
@jit-ci
Copy link
Copy Markdown

jit-ci Bot commented Feb 2, 2026

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

🚨 Summary

• No security vulnerabilities detected in this PR

🚫 Ignored Findings (1)

HIGH (1)
  • Image user should not be 'root' in Dockerfile — reason: ignored
    ↪ Unignore: @sera unignore b3fd9d3b12

▶️ Advanced Options

Help section with supported ignore actions and syntax.

Need to ignore a finding? Use commands like:

@sera ignore <finding_id> reason:accepted (reason is optional)

@sera ignore all reason:other - this is a test PR (reason is optional)

@sera ignore type:DS002


Security scan by Jit

@lerman25 lerman25 enabled auto-merge February 2, 2026 15:49
@lerman25 lerman25 added this pull request to the merge queue Feb 2, 2026
Merged via the queue into 8.2 with commit e0ac8ec Feb 2, 2026
32 checks passed
@lerman25 lerman25 deleted the backport-8210-to-8.2 branch February 2, 2026 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants