Skip to content

MOD-13735 MOD-13181 Fix FT.PROFILE shard total profile time#8129

Merged
ofiryanai merged 5 commits intomasterfrom
fix-profile-shard-total-time
Jan 22, 2026
Merged

MOD-13735 MOD-13181 Fix FT.PROFILE shard total profile time#8129
ofiryanai merged 5 commits intomasterfrom
fix-profile-shard-total-time

Conversation

@ofiryanai
Copy link
Contributor

@ofiryanai ofiryanai commented Jan 22, 2026

Problem

In cluster mode, FT.PROFILE reports incorrect "Total profile time" values for each shard. The total time appears smaller than individual Result Processor times within the same shard, which is mathematically impossible since the total should encompass all processor times. This even worse because some non-negligible time is not included in any of the result processors times (e.g shard results serialization in cursor read).

Root Cause

The issue occurs when internal cursors are used (cluster mode with FT.AGGREGATE). The timing calculation has a mismatch between how total time and individual processor times are tracked:

  1. initClock is reset on each cursor read at src/aggregate/aggregate_exec.c:1382:
rs_wall_clock_init(&req->initClock); // Reset the clock for the current cursor read
  1. profileTotalTime is calculated only from the last cursor read in src/profile.c:145:
req->profileTotalTime += rs_wall_clock_elapsed_ns(&req->initClock);

This only captures elapsed time since the last initClock reset.

  1. Result Processor times accumulate across ALL cursor reads in src/result_processor.c:1144:
self->profileTime += rs_wall_clock_elapsed_ns(&start);

These times are never reset between cursor reads.

Result: When multiple cursor reads occur, RP times accumulate across all reads while total time only measures the final read.

Fix

Accumulate profileTotalTime at the end of each cursor read in finishSendChunk(), matching how RP times are accumulated. This ensures total profile time reflects the sum of all cursor read durations, maintaining mathematical consistency with individual processor times.


Note

Fixes shard profiling time under cursored executions by accumulating per-read durations into req->profileTotalTime.

  • Update finishSendChunk() in aggregate_exec.c to add elapsed time for intermediate cursor reads to profileTotalTime (final read remains accounted in Profile_Print); use the measured duration for global stats
  • Add tests in tests/pytests/test_profile.py to verify timing consistency: Total profile time ≥ sum of Result Processor Time across RESP2/RESP3, cluster/standalone, and various SEARCH/AGGREGATE pipelines; include helper sum_rp_times() and ProfileTotalTimeConsistency()

Written by Cursor Bugbot for commit 382aa9d. This will update automatically on new commits. Configure here.

@jit-ci
Copy link

jit-ci bot commented Jan 22, 2026

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@codecov
Copy link

codecov bot commented Jan 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.26%. Comparing base (a6bb950) to head (382aa9d).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8129      +/-   ##
==========================================
+ Coverage   83.21%   83.26%   +0.05%     
==========================================
  Files         369      369              
  Lines       55455    55457       +2     
  Branches    14432    14432              
==========================================
+ Hits        46149    46179      +30     
+ Misses       9148     9120      -28     
  Partials      158      158              
Flag Coverage Δ
flow 84.53% <100.00%> (+0.08%) ⬆️
unit 50.29% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ofiryanai ofiryanai force-pushed the fix-profile-shard-total-time branch from 6f0ca61 to 72365ad Compare January 22, 2026 14:56
Copy link
Collaborator

@lerman25 lerman25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Should we backport?

@ofiryanai ofiryanai enabled auto-merge January 22, 2026 16:41
@ofiryanai ofiryanai marked this pull request as draft January 22, 2026 16:51
auto-merge was automatically disabled January 22, 2026 16:51

Pull request was converted to draft

@ofiryanai ofiryanai marked this pull request as ready for review January 22, 2026 16:51
@ofiryanai ofiryanai enabled auto-merge January 22, 2026 17:20
@ofiryanai ofiryanai added this pull request to the merge queue Jan 22, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 22, 2026
@ofiryanai ofiryanai added this pull request to the merge queue Jan 22, 2026
Merged via the queue into master with commit fd63e0c Jan 22, 2026
96 checks passed
@ofiryanai ofiryanai deleted the fix-profile-shard-total-time branch January 22, 2026 19:52
@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.8, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.8
git worktree add -d .worktree/backport-8129-to-2.8 origin/2.8
cd .worktree/backport-8129-to-2.8
git switch --create backport-8129-to-2.8
git cherry-pick -x fd63e0c3cf864804885bb2b2688e4444d861b6aa

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 2.10, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 2.10
git worktree add -d .worktree/backport-8129-to-2.10 origin/2.10
cd .worktree/backport-8129-to-2.10
git switch --create backport-8129-to-2.10
git cherry-pick -x fd63e0c3cf864804885bb2b2688e4444d861b6aa

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.2
git worktree add -d .worktree/backport-8129-to-8.2 origin/8.2
cd .worktree/backport-8129-to-8.2
git switch --create backport-8129-to-8.2
git cherry-pick -x fd63e0c3cf864804885bb2b2688e4444d861b6aa

@redisearch-backport-pull-request
Copy link
Contributor

Backport failed for 8.4, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.4
git worktree add -d .worktree/backport-8129-to-8.4 origin/8.4
cd .worktree/backport-8129-to-8.4
git switch --create backport-8129-to-8.4
git cherry-pick -x fd63e0c3cf864804885bb2b2688e4444d861b6aa

redisearch-backport-pull-request bot pushed a commit that referenced this pull request Jan 22, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
@redisearch-backport-pull-request
Copy link
Contributor

Successfully created backport PR for 8.6:

ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
@ofiryanai ofiryanai changed the title MOD-13181 Fix FT.PROFILE shard total profile time MOD-13735 MOD-13181 Fix FT.PROFILE shard total profile time Jan 25, 2026
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
github-merge-queue bot pushed a commit that referenced this pull request Jan 25, 2026
… (#8150)

MOD-13181 Fix FT.PROFILE shard total profile time (#8129)

* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
github-merge-queue bot pushed a commit that referenced this pull request Jan 25, 2026
… (#8151)

MOD-13181 Fix FT.PROFILE shard total profile time (#8129)

* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
ofiryanai added a commit that referenced this pull request Jan 25, 2026
* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)
github-merge-queue bot pushed a commit that referenced this pull request Jan 27, 2026
…8154)

* MOD-13181 Fix FT.PROFILE shard total profile time (#8129)

* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)

* passing tests

* simplify tests

* pass also coord tests

* simplify extract_profile_coordinator_and_shards

* Fix profile parsing to handle FT.SEARCH and FT.AGGREGATE

Original get_shards_profile only handled FT.AGGREGATE (assumed res['Shards']).
FT.SEARCH has different structure: lowercase 'shards' with nested Coordinator
(RESP3), different flat list format (RESP2).

New extract_profile_coordinator_and_shards handles all 4 cases (FT.SEARCH/
FT.AGGREGATE × RESP2/RESP3) and extracts Coordinator data needed by
testProfileTotalTimeConsistencyCluster* tests.

* Fix profile parsing for standalone mode

Handle standalone mode in extract_profile_coordinator_and_shards by
returning empty coordinator and shards when not in cluster mode.

* Fix bug
github-merge-queue bot pushed a commit that referenced this pull request Jan 28, 2026
)

* pass also coord tests

* MOD-13181 Fix FT.PROFILE shard total profile time (#8129)

* Fix FT.PROFILE shard total profile time

* Avoid double count on non-cursor Timeout Return failures

* Move out of the if (Ok | Timeout) block

* Add tests

* Enhance tests to cover more cases

(cherry picked from commit fd63e0c)

* fix

* Align test_profile.py helper functions with 2.10 backport

* Fix RESP2 parsing to capture all values after key

Fixed parse_resp2_shards_list and standalone format parsing to use
item[1:] instead of item[1] to capture ALL values after the key.
This fixes multi-valued fields like 'Result processors profile'
which have format ['key', val1, val2, ...].

Added single-element unwrapping to maintain backward compatibility.

* Fix RESP3 parsing to handle case-insensitive keys and filter Coordinator

- Check for both 'shards'/'Shards' and 'coordinator'/'Coordinator' keys
- Filter out Coordinator from shards list (has 'Total Coordinator time' not 'Total profile time')
- Handle standalone mode with 'profile' key

* Fix RESP3 Coordinator filter to use negative check

Filter out entries with 'Total Coordinator time' key instead of filtering
for entries with 'Total profile time', since shards may not always have
the 'Total profile time' field.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants