alxtkr77 commented Jan 7, 2026

Summary

Fixed a critical issue where deploying serving functions with many models (e.g., 5000) would cause:

  • Deploy API requests to time out (nginx 180s proxy-read-timeout)
  • UI to become unresponsive during deployment
  • The server to appear to finish processing while the response never reached the client

Root Cause

The _create_model_endpoint_limited async function was calling run_function_with_new_db_session() synchronously, which blocked the event loop while creating model endpoints in the background task.
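The effect described above can be reproduced in isolation. The following is a minimal sketch, not the MLRun code: `create_endpoint_sync` is a hypothetical stand-in for the synchronous DB call, and the standard library's `asyncio.to_thread` is used in place of `run_in_threadpool` so the example has no third-party dependencies. A heartbeat task counts how often the event loop gets a chance to run while endpoints are being created.

```python
import asyncio
import time


def create_endpoint_sync(name: str) -> str:
    """Hypothetical stand-in for a synchronous DB operation."""
    time.sleep(0.05)  # simulates blocking DB I/O
    return name


async def create_blocking(name: str) -> str:
    # Calls the sync function directly inside a coroutine:
    # the event loop cannot run anything else until it returns.
    return create_endpoint_sync(name)


async def create_offloaded(name: str) -> str:
    # Runs the sync function in a worker thread and awaits the result,
    # so the event loop stays free to serve other tasks.
    return await asyncio.to_thread(create_endpoint_sync, name)


async def count_heartbeats(create, n: int = 5) -> int:
    """Create n endpoints and count how many times a heartbeat task ran."""
    ticks = 0
    done = asyncio.Event()

    async def heartbeat() -> None:
        nonlocal ticks
        while not done.is_set():
            ticks += 1
            await asyncio.sleep(0.005)

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # give the heartbeat its first turn
    for i in range(n):
        await create(f"model-{i}")
    done.set()
    await task
    return ticks


if __name__ == "__main__":
    print("blocking :", asyncio.run(count_heartbeats(create_blocking)))
    print("offloaded:", asyncio.run(count_heartbeats(create_offloaded)))
```

In the blocking variant the heartbeat barely runs, which is the same starvation that would keep a web server from writing out a pending HTTP response.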

Changes Made

  • Wrapped the sync DB operation with run_in_threadpool to execute it in a thread pool worker without blocking the event loop
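The shape of such a fix might look like the sketch below. It rests on assumptions: the `_limited` suffix is taken to mean a semaphore-bounded wrapper, `MAX_CONCURRENT` and `create_endpoint_sync` are hypothetical, and `asyncio.to_thread` again stands in for `run_in_threadpool`. The actual change in this PR may differ in detail.

```python
import asyncio
import threading
import time

MAX_CONCURRENT = 4  # hypothetical concurrency limit

# Bookkeeping used only to observe how many sync calls overlap.
stats = {"active": 0, "peak": 0}
_stats_lock = threading.Lock()


def create_endpoint_sync(name: str) -> str:
    """Hypothetical stand-in for the synchronous DB operation."""
    with _stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # simulates blocking DB I/O
    with _stats_lock:
        stats["active"] -= 1
    return f"endpoint:{name}"


async def create_endpoint_limited(semaphore: asyncio.Semaphore, name: str) -> str:
    # The semaphore caps concurrent creations; the sync call itself
    # runs in a worker thread so the event loop is never blocked.
    async with semaphore:
        return await asyncio.to_thread(create_endpoint_sync, name)


async def create_all(names: list[str]) -> list[str]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = (create_endpoint_limited(semaphore, n) for n in names)
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(create_all([f"m{i}" for i in range(20)]))
    print(len(results), "endpoints created, peak concurrency", stats["peak"])
```

Keeping the semaphore on the async side means the limit is enforced before a worker thread is taken, so waiting tasks do not occupy threads in the pool.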

Testing

  • Verified with 5000 model deployment test
  • Deploy API now responds in ~7s instead of timing out
  • UI remains responsive during background model endpoint creation

Reference

  • Jira: ML-11826

…tion (ML-11826)

The `_create_model_endpoint_limited` async function was calling
`run_function_with_new_db_session()` synchronously, which blocked
the event loop while creating model endpoints.

This caused:
- Deploy API responses to time out (nginx 180s proxy-read-timeout)
- UI to become unresponsive during deployment
- Server completed in ~7s but response never reached client

Fix: Wrap the sync DB operation with `run_in_threadpool` to execute
it in a thread pool worker without blocking the event loop.
@alxtkr77 alxtkr77 merged commit 6145720 into mlrun:development Jan 8, 2026
13 checks passed
assaf758 pushed a commit that referenced this pull request Jan 8, 2026