
[8.0] MOD-8391: Report active threads-indexes upon crash #5842

Merged: kei-nan merged 1 commit into 8.0 from backport-5403-to-8.0 on Mar 30, 2025
kei-nan commented Mar 30, 2025

Manual backport of #5403.

* Add new activeThreads API

* Fix docs, add container population

* Add logging in crash

* Add Linux thread id to log, add GC thread to active threads container

* Add main-thread registrations

* Fix

* fix initialization

* Address review

* Fix leak

* Fix assertion

* Touchup

* * pivot
- keep track of the active queries and cursors in the main thread when blocking the client
- keep track, in each thread, of the spec it is working on

* * fix spellcheck error
* fix compilation

* * fix spellcheck typo

* * fix tests

* * wrong function name - should be CaseInsensitiveCompare

* * revert

* * fix typos

* * fix some tests

* * bring back old code that was removed by mistake

* * add a dtor function to the thread local variable to avoid reallocating on every set

* * do not omit argument name - fixes compilation error on some operating systems

* * add missing includes

* * change include order

* * cleanup memory in CurrentThread_ClearIndexSpec to avoid thread lifetime leak.

* * try and upload the so library as an artifact on failure

* * fix typo

* * also upload the debug symbols as an artifact

* * add debug symbols to diagnose crash

* * suspect the reference mechanism is not atomic enough
* unified strong and weak ref count into a single variable to ensure atomicity

* * change Strong value and make it 64 bit

* * revert some changes that were made to locate the gc crash
* don't set the index spec in the gc thread

* * missed a set call - commented it out

* * fix gc flow - use a weak ref to account for the race where the index is deleted after the first strong ref is taken.
* try and use free callback for thread local variable

* * update comment
* add a log in case thread forgot to cleanup

* * remove unneeded fields
* code cleanup
* duplicate index name to output something if promote fails during crash

* * fix tests

* * fix return for ThreadLocalStorage_Init

* * add CurrentThread_TryGetSpecInfo
* use CurrentThread_TryGetSpecInfo in info_redis.c
* some code cleanup

* * fix compilation error

* * clear spec in DistAggregateCleanups

* * code review comments

* * fix typo

* * code review comments
* restructured the files, moved them to be under info folder
* cursor object is retrieved in main thread
  * if it is a shard cursor we will add it to the blocked queries cursor list

* * try and take a strong ref when dealing in cursor query

* * add result processor when using debug search commands to allow crashing the query
* add a test to check the additional crash information is correct

* * small code review fixes

* * change error so test will pass

* use correct submodule

* * code review comments - Andres

* - make set and clear part of the ref promote and release.
- found a few more places where we could set the index spec

* - fix leak

* - raz code review comments
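
Several of the commits above mention unifying the strong and weak reference counts into a single 64-bit variable and promoting a weak ref before touching the index spec. The following is a minimal sketch of that general idea, not the code in this PR: the bit layout (strong count in the high 32 bits, weak count in the low 32 bits) and all `PackedRef_*` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: strong count in the high 32 bits, weak count in the
 * low 32 bits, so both counters change in one atomic operation. */
#define STRONG_UNIT ((uint64_t)1 << 32)
#define WEAK_UNIT   ((uint64_t)1)

typedef struct {
  _Atomic uint64_t counts;
} PackedRef;

static inline uint32_t PackedRef_Strong(uint64_t c) { return (uint32_t)(c >> 32); }
static inline uint32_t PackedRef_Weak(uint64_t c) { return (uint32_t)(c & 0xFFFFFFFFu); }

/* Start with exactly one strong owner and no weak refs. */
static inline void PackedRef_Init(PackedRef *r) {
  atomic_init(&r->counts, STRONG_UNIT);
}

/* Callers are assumed to pair every acquire with exactly one release. */
static inline void PackedRef_AcquireWeak(PackedRef *r) {
  atomic_fetch_add(&r->counts, WEAK_UNIT);
}

static inline void PackedRef_ReleaseWeak(PackedRef *r) {
  atomic_fetch_sub(&r->counts, WEAK_UNIT);
}

/* Promote weak -> strong only while at least one strong owner still exists.
 * The CAS loop re-reads the packed value on failure, so a concurrent drop of
 * the last strong ref cannot be missed. */
static inline bool PackedRef_TryPromote(PackedRef *r) {
  uint64_t cur = atomic_load(&r->counts);
  while (PackedRef_Strong(cur) > 0) {
    if (atomic_compare_exchange_weak(&r->counts, &cur, cur + STRONG_UNIT)) {
      return true;
    }
  }
  return false;
}

/* Returns the packed value after the release so the caller can check
 * whether it dropped the last strong ref. */
static inline uint64_t PackedRef_ReleaseStrong(PackedRef *r) {
  return atomic_fetch_sub(&r->counts, STRONG_UNIT) - STRONG_UNIT;
}
```

In this sketch a thread holding only a weak ref calls `PackedRef_TryPromote` and skips the work if it returns `false`, which is the shape of the race the gc fix describes.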

---------

Co-authored-by: jonathan keinan <[email protected]>
(cherry picked from commit 384a2f6)
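
The commit message mentions a thread-local variable with a destructor ("dtor") so the per-thread record of the current index spec is not reallocated on every set. A rough sketch of that pattern with POSIX thread-specific data follows. It is an illustration only: `ThreadSpecInfo` and the `tls_*` functions are hypothetical names, not the `CurrentThread_*` API added by this PR, and the real code frees the memory on clear rather than reusing it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread record: the name of the index being worked on. */
typedef struct {
  char index_name[64];
} ThreadSpecInfo;

static pthread_key_t spec_key;
static pthread_once_t spec_key_once = PTHREAD_ONCE_INIT;

/* Destructor: runs at thread exit if the slot holds a non-NULL value. */
static void free_spec_info(void *p) { free(p); }

static void make_key(void) {
  int rc = pthread_key_create(&spec_key, free_spec_info);
  assert(rc == 0);
  (void)rc;
}

/* Allocate the slot once per thread, then reuse it on every later set. */
static inline int tls_set_index_name(const char *name) {
  pthread_once(&spec_key_once, make_key);
  ThreadSpecInfo *info = pthread_getspecific(spec_key);
  if (info == NULL) {
    info = calloc(1, sizeof(*info));
    if (info == NULL) return -1;
    if (pthread_setspecific(spec_key, info) != 0) {
      free(info);
      return -1;
    }
  }
  strncpy(info->index_name, name, sizeof(info->index_name) - 1);
  info->index_name[sizeof(info->index_name) - 1] = '\0';
  return 0;
}

/* Mark the thread as idle without releasing the allocation. */
static inline void tls_clear_index_name(void) {
  pthread_once(&spec_key_once, make_key);
  ThreadSpecInfo *info = pthread_getspecific(spec_key);
  if (info != NULL) info->index_name[0] = '\0';
}

/* Returns NULL when the thread is not working on any index. */
static inline const char *tls_get_index_name(void) {
  pthread_once(&spec_key_once, make_key);
  ThreadSpecInfo *info = pthread_getspecific(spec_key);
  if (info == NULL || info->index_name[0] == '\0') return NULL;
  return info->index_name;
}
```

A worker would call `tls_set_index_name` before touching an index and `tls_clear_index_name` afterwards, so a crash handler running on that thread can read the name back with `tls_get_index_name`.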
kei-nan requested review from nafraf and raz-mon March 30, 2025 14:45

codecov bot commented Mar 30, 2025

Codecov Report

Attention: Patch coverage is 58.16993% with 128 lines in your changes missing coverage. Please review.

Project coverage is 87.25%. Comparing base (b94e803) to head (7a2325c).
Report is 2 commits behind head on 8.0.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/info/info_redis/info_redis.c | 2.32% | 42 Missing ⚠️ |
| src/info/info_redis/block_client.c | 0.00% | 26 Missing ⚠️ |
| src/info/info_redis/types/blocked_queries.c | 31.25% | 22 Missing ⚠️ |
| src/result_processor.c | 36.00% | 16 Missing ⚠️ |
| src/aggregate/aggregate_exec.c | 68.18% | 7 Missing ⚠️ |
| src/module.c | 90.90% | 4 Missing ⚠️ |
| src/coord/dist_plan.cpp | 66.66% | 2 Missing ⚠️ |
| src/fork_gc.c | 89.47% | 2 Missing ⚠️ |
| src/info/info_redis/threads/current_thread.c | 93.54% | 2 Missing ⚠️ |
| src/info/info_redis/threads/main_thread.c | 87.50% | 2 Missing ⚠️ |

... and 2 more

Additional details and impacted files
@@            Coverage Diff             @@
##              8.0    #5842      +/-   ##
==========================================
- Coverage   88.11%   87.25%   -0.87%     
==========================================
  Files         202      206       +4     
  Lines       36287    36519     +232     
==========================================
- Hits        31976    31864     -112     
- Misses       4311     4655     +344     


kei-nan enabled auto-merge March 30, 2025 16:14
kei-nan self-assigned this Mar 30, 2025
kei-nan added this pull request to the merge queue Mar 30, 2025
Merged via the queue into 8.0 with commit 8c1fe60 Mar 30, 2025
7 of 8 checks passed
kei-nan deleted the backport-5403-to-8.0 branch March 30, 2025 21:41