@Eyal-Danieli Eyal-Danieli commented Dec 29, 2025

Optimize the drift-over-time API for V3IO TSDB. The main issue was that, after retrieving the data, we iterated over it multiple times even though a single pass is enough. We also added several optimizations to the code logic (details below).


🛠️ Changes Made

  • Use integers representing nanoseconds instead of datetime objects for bucketing. This also means working in nanoseconds rather than microseconds for better performance during this phase.
  • Filter out invalid data early, before any processing. Previously, we processed everything first and filtered later. With the new design, we avoid doing unnecessary work.
  • Reduce multiple separate iterations over the data into a single pass that aggregates directly while iterating.
  • Perform datetime conversions only at the very end, after all filtering and processing is done.
  • Create fewer objects, reducing memory allocations.
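The changes listed above can be illustrated with a minimal sketch. This is not the code from this PR: the row layout `(timestamp_ns, endpoint_id, status)`, the function names, and the `valid_statuses` filter are all assumptions made for illustration. It shows integer-nanosecond bucketing, early filtering, a single aggregating pass, and datetime conversion only at the end.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

NS_PER_SECOND = 1_000_000_000


def ns_to_datetime(ns: int) -> datetime:
    """Convert integer nanoseconds since the epoch to a UTC datetime."""
    seconds, remainder_ns = divmod(ns, NS_PER_SECOND)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        microseconds=remainder_ns // 1000
    )


def aggregate_max_status(rows, start_ns, interval_ns, valid_statuses=frozenset({1, 2})):
    """Hypothetical single-pass aggregation of the max status per endpoint per bucket.

    `rows` is assumed to be an iterable of (timestamp_ns, endpoint_id, status).
    """
    bucket_endpoint_status = defaultdict(dict)
    for ts_ns, endpoint_id, status in rows:
        # Filter invalid rows early, before doing any bucketing work.
        if ts_ns < start_ns or status not in valid_statuses:
            continue
        # Integer arithmetic only: floor the timestamp to its bucket start.
        bucket_start_ns = ts_ns - (ts_ns - start_ns) % interval_ns
        bucket = bucket_endpoint_status[bucket_start_ns]
        bucket[endpoint_id] = max(bucket.get(endpoint_id, status), status)

    # Datetime objects are created only here, once per bucket.
    return [
        (ns_to_datetime(bucket_start_ns), bucket_endpoint_status[bucket_start_ns])
        for bucket_start_ns in sorted(bucket_endpoint_status)
    ]
```

With this shape, the number of datetime objects created scales with the number of buckets rather than the number of rows, which is the point of deferring the conversion.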

✅ Checklist

  • I updated the documentation (if applicable)
  • I have tested the changes in this PR
  • I confirmed whether my changes are covered by system tests
    • If yes, I ran all relevant system tests and ensured they passed before submitting this PR
    • I updated existing system tests and/or added new ones if needed to cover my changes
  • If I introduced a deprecation:

🧪 Testing

  • TestMonitoringAppFlow passed (includes testing for drift over time values).

🔗 References


🚨 Breaking Changes?

  • Yes (explain below)
  • No

🔍️ Additional Notes

@Eyal-Danieli Eyal-Danieli requested a review from a team as a code owner December 29, 2025 15:01

@danielperezz danielperezz left a comment

Looks great :)
Had two comments

return mm_schemas.ModelEndpointDriftValues(values=values)

@staticmethod
def _convert_drift_data_to_values(

@danielperezz danielperezz Dec 30, 2025

I think this function can be removed now

Comment on lines 1631 to 1638
if bucket_start_ns not in bucket_endpoint_status:
    bucket_endpoint_status[bucket_start_ns] = {}

# Update max status for this endpoint in this bucket
if endpoint_id not in bucket_endpoint_status[bucket_start_ns]:
    bucket_endpoint_status[bucket_start_ns][endpoint_id] = status
elif status > bucket_endpoint_status[bucket_start_ns][endpoint_id]:
    bucket_endpoint_status[bucket_start_ns][endpoint_id] = status

This can be simplified using defaultdict and get:

Suggested change
- if bucket_start_ns not in bucket_endpoint_status:
-     bucket_endpoint_status[bucket_start_ns] = {}
- # Update max status for this endpoint in this bucket
- if endpoint_id not in bucket_endpoint_status[bucket_start_ns]:
-     bucket_endpoint_status[bucket_start_ns][endpoint_id] = status
- elif status > bucket_endpoint_status[bucket_start_ns][endpoint_id]:
-     bucket_endpoint_status[bucket_start_ns][endpoint_id] = status
+ bucket_endpoint_status = defaultdict(dict)  # place above instead of the current initialization
+ bucket = bucket_endpoint_status[bucket_start_ns]
+ bucket[endpoint_id] = max(bucket.get(endpoint_id, status), status)
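
As a quick sanity check that the suggestion preserves behavior, the two variants can be compared side by side. The wrapper functions and sample data below are hypothetical and exist only for this comparison; the bodies mirror the original lines and the suggested change.

```python
from collections import defaultdict


def update_original(bucket_endpoint_status, bucket_start_ns, endpoint_id, status):
    # Original logic from the reviewed lines.
    if bucket_start_ns not in bucket_endpoint_status:
        bucket_endpoint_status[bucket_start_ns] = {}
    if endpoint_id not in bucket_endpoint_status[bucket_start_ns]:
        bucket_endpoint_status[bucket_start_ns][endpoint_id] = status
    elif status > bucket_endpoint_status[bucket_start_ns][endpoint_id]:
        bucket_endpoint_status[bucket_start_ns][endpoint_id] = status


def update_suggested(bucket_endpoint_status, bucket_start_ns, endpoint_id, status):
    # Suggested logic; expects a defaultdict(dict) so missing buckets are created.
    bucket = bucket_endpoint_status[bucket_start_ns]
    bucket[endpoint_id] = max(bucket.get(endpoint_id, status), status)


# Made-up samples: (bucket_start_ns, endpoint_id, status)
samples = [(0, "ep-1", 1), (0, "ep-1", 2), (0, "ep-1", 1), (0, "ep-2", 0), (60, "ep-1", 1)]

original = {}
suggested = defaultdict(dict)
for sample in samples:
    update_original(original, *sample)
    update_suggested(suggested, *sample)
```

One thing to keep in mind with `defaultdict`: merely reading a missing key inserts an empty dict for it, so later lookups on `bucket_endpoint_status` should use `in` or `.get` if empty buckets must not appear.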

@assaf758 assaf758 merged commit fb236f0 into mlrun:development Dec 31, 2025
13 checks passed