Skip to content

[ENH] Make dirty_log_collections metric mcmr-aware.#6353

Merged
rescrv merged 1 commit intomainfrom
rescrv/dirty-log-collections
Feb 13, 2026
Merged

[ENH] Make dirty_log_collections metric mcmr-aware.#6353
rescrv merged 1 commit intomainfrom
rescrv/dirty-log-collections

Conversation

@rescrv
Copy link
Copy Markdown
Contributor

@rescrv rescrv commented Feb 6, 2026

Description of changes

I'd like this metric to track total dirty collections. I got 30k dirty and it didn't move.

Test plan

Eyeballs.

Migration plan

N/A

Observability plan

Watch on staging.

Documentation Changes

N/A

@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 6, 2026

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@propel-code-bot
Copy link
Copy Markdown
Contributor

propel-code-bot bot commented Feb 6, 2026

Make dirty_log_collections metric aggregate across topologies

Updates LogServer::roll_dirty_log so the dirty_log_collections gauge is recorded after rollups from every dirty-log source (local S3 and each topology) are merged. The metric update is removed from roll_dirty_log_s3, preventing per-source double counting and allowing the gauge to reflect the total number of unique dirty collections after deduplication.

Key Changes

• Moved self.metrics.dirty_log_collections.record(...) from roll_dirty_log_s3 to roll_dirty_log so the count uses the merged rollups map
• Ensured the gauge now represents the global dirty collection cardinality rather than only the local dirty log segment

Possible Issues

• If roll_dirty_log returns early on error before swapping need_to_compact, the gauge could remain stale because the record call occurs before the swap; verify desired ordering.

This summary was automatically generated by @propel-code-bot

}
self.metrics
.dirty_log_collections
.record(rollups.len() as u64, &[]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wont this overcount now

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cant you have more than one dirty log entry per collection

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say more? There can be more than one dirty log entry, but I'm logging post-dedupe.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're logging post-dedupe that makes sense

@rescrv rescrv force-pushed the rescrv/dirty-log-collections branch from 0d6209d to 8954a9d Compare February 13, 2026 17:26
@rescrv rescrv merged commit 0c9b7f0 into main Feb 13, 2026
67 checks passed
@rescrv rescrv deleted the rescrv/dirty-log-collections branch February 13, 2026 23:57
tanujnay112 added a commit that referenced this pull request Feb 18, 2026
- **[ENH]: Cache rust git submodules in mounted volume (#6424)**
- **[CHORE](k8s) increase dev CPU limits from 100m to 200-300m (#6435)**
- **[ENH] replace live cloud tests with k8s integration tests (#6434)**
- **[ENH] Make dirty_log_collections metric mcmr-aware. (#6353)**
- **[ENH] Quantized Spann Segment Writer (#6397)**
- **[ENH] Wire up quantized writer in compaction (#6399)**
- **[ENH] Quantized Spann Segment Reader (#6405)**
- **[ENH] Wire up quantized reader in new orchestrator (#6409)**
- **[ENH] Garbage collect usearch index files (#6416)**
- **[ENH] Trace quantized spann implementation (#6425)**
- **[ENH]: Precompute data chunk len() (#6442)**
- **[BUG]: Compaction version file flush was incomplete on MCMR
(#6423)**
- **[DOC]: Fixed broken links in Readme (#6440)**
- **[DOC] Fix link to Rust documentation (#6443)**
- **[ENH]: Allow users to disable FTS in schema (#6214)**

---------

Co-authored-by: Robert Escriva <[email protected]>
Co-authored-by: Macronova <[email protected]>
Co-authored-by: Nilpotent <[email protected]>
Co-authored-by: anderk222 <[email protected]>
Co-authored-by: Sanket Kedia <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants