
Fix/dynamic json oom tiny part storm backup (FOR CI) #1127

Closed

yokofly wants to merge 9 commits into develop from
fix/dynamic-json-oom-tiny-part-storm-backup

Conversation


@yokofly yokofly commented Mar 11, 2026

PR checklist:

  • Did you run ClangFormat?
  • Did you separate headers to a different section in existing community code base?
  • Did you surround proton: starts/ends for new code in existing community code base?

Please write a user-readable short description of the changes:

SBALAVIGNESH123 and others added 8 commits March 9, 2026 02:12
…d OOM (#1113)

Root cause: hasDynamicSubcolumns() triggered doCommit() for every single JSON
record, creating 1 part per record. With 128 shards and high throughput, this
caused thousands of tiny parts that exhausted the commit pool, triggered
TOO_MANY_PARTS, and finally OOM during loadOutdatedDataParts on restart.

Changes:

1. Batch JSON commits (StreamShardStore.cpp):
   Accumulate records until 8192 rows or 16 MB before committing,
   matching non-JSON block behavior. The final batch is committed at the
   end of each consume cycle. Part count reduced from O(records) to
   O(records / 8192).

2. Exponential backoff on pool saturation (StreamShardStore.cpp):
   When part_commit_pool.trySchedule() fails, sleep with exponential
   backoff (100ms -> 200ms -> ... -> 5s cap) instead of immediately
   retrying. Gives background merge threads time to compact parts.

3. Configurable thresholds (Settings.h):
   Added dynamic_commit_row_threshold (default: 8192) and
   dynamic_commit_byte_threshold (default: 16MB) as CONFIGURABLE_GLOBAL_SETTINGS
   for operators to tune based on workload.

4. Observability:
   Added LOG_DEBUG for threshold-triggered commits with sn_range info.
   Added LOG_TRACE for batch accumulation progress tracking.
   Enhanced backoff warning with current sleep duration.

Fixes: #1113

- Wire dynamic_commit_row_threshold and dynamic_commit_byte_threshold
  from global settings via getSettingsRef() so the knobs are live.
  The previous static constexpr values made the settings dead code.

- Cap worker-lambda commit retries at MAX_COMMIT_RETRIES=10 (~20s).
  The old infinite retry loop pinned a pool thread under TOO_MANY_PARTS
  failures, saturating the commit pool and causing the liveness issue
  called out in #1113. On give-up we log the failure and call
  progressSequences() so the producer is not blocked forever.

- Add explicit JSON/non-JSON block boundary check before mergeBlocks().
  Previously a non-JSON block followed by a JSON block of the same
  schema version was silently merged into a thresholded JSON batch.
  Now we detect the transition and flush the current block first,
  keeping JSON and non-JSON commit paths strictly separated.

progressSequences() was called unconditionally after the retry loop,
which meant that if all MAX_COMMIT_RETRIES attempts failed the committed
SN would still advance — causing commitSNLocal() to persist a sequence
number for data that was never written to disk.

Now we track success with a bool flag and only call progressSequences()
on the success path. On the failure path we log the dropped range and
release the pool thread without acknowledging the sequence, so NativeLog
will replay the missing SN range on the next restart.

Addresses review feedback from @yokofly.

When all MAX_COMMIT_RETRIES attempts fail, the previous code only logged
and returned, leaving moved_seq stuck at the front of outstanding_sns.
Because progressSequencesWithLockHeld() only pops the front when seq ==
outstanding_sns.front(), later successful commits would accumulate in
local_committed_sns forever and last_sn would never advance — wedging
the entire shard.

Add failSequencesWithLockHeld() which:
  - pops the failed seq from outstanding_sns WITHOUT advancing last_sn
    (so the gap is preserved and NativeLog replays it on restart)
  - drains any contiguous ranges already sitting in local_committed_sns
    so subsequent successful commits can resume advancing last_sn

Wire it into the worker lambda failure path under sns_mutex.

Addresses review feedback from @yokofly.
Co-Authored-By: chatgpt-codex-connector[bot] <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
@yokofly yokofly closed this Mar 17, 2026
@yokofly yokofly deleted the fix/dynamic-json-oom-tiny-part-storm-backup branch March 17, 2026 09:45