perf: optimize /api/models endpoint performance using batched loading#20010

Closed
Classic298 wants to merge 1 commit into open-webui:dev from Classic298:perf-models-api

@Classic298
Collaborator

Fixes: #20004, #18950

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.

Before submitting, make sure you've checked the following:

  • Target branch: Verify that the pull request targets the dev branch. Not targeting the dev branch will lead to immediate closure of the PR.
  • Description: Provide a concise description of the changes made in this pull request down below.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: If necessary, update relevant documentation Open WebUI Docs like environment variables, the tutorials, or other documentation sources.
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Take this as an opportunity to make screenshots of the feature/fix and include it in the PR description.
  • Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review AND manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR.
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • feat: Introduces a new feature or enhancement to the codebase

Changelog Entry

Description

Optimizes /api/models endpoint performance by replacing sequential database queries with batch loading. Each get_filtered_models() function called Models.get_model_by_id() once per model in a loop, which issued N database queries and loaded the heavy meta field, including base64 profile images, for every model. This change reduces the database queries from N to 1 and excludes the unnecessary image data from access control checks.

Changed

  • Added ModelAccessControl response model containing only id, user_id, and access_control fields
  • Added get_models_access_control_by_ids() batch loading method to ModelsTable class
  • Updated get_filtered_models() in backend/open_webui/utils/models.py to use batch loading
  • Updated get_filtered_models() in backend/open_webui/routers/openai.py to use batch loading
  • Updated get_filtered_models() in backend/open_webui/routers/ollama.py to use batch loading
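
The batch-loading idea behind these changes can be sketched as follows. This is a minimal illustration, not the code from this PR: it uses the standard-library sqlite3 module instead of the project's ORM, and the table layout, the JSON shape of access_control, and the make_demo_db() helper are assumptions made for the example. Only the names ModelAccessControl and get_models_access_control_by_ids come from the description above.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelAccessControl:
    """Lightweight row holding only the fields needed for access checks."""
    id: str
    user_id: str
    access_control: Optional[dict]


def get_models_access_control_by_ids(conn, ids):
    """Fetch access-control data for many models with a single query.

    The heavy `meta` column is deliberately not selected.
    """
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, user_id, access_control FROM model WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    return {
        row[0]: ModelAccessControl(
            id=row[0],
            user_id=row[1],
            access_control=json.loads(row[2]) if row[2] else None,
        )
        for row in rows
    }


def make_demo_db():
    """Hypothetical schema and rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE model (id TEXT PRIMARY KEY, user_id TEXT, access_control TEXT, meta TEXT)"
    )
    shared = json.dumps({"read": {"user_ids": ["alice"], "group_ids": []}})
    conn.executemany(
        "INSERT INTO model VALUES (?, ?, ?, ?)",
        [("m1", "alice", None, "{}"), ("m2", "bob", shared, "{}")],
    )
    conn.commit()
    return conn
```

A caller would fetch the whole mapping once and then look up each model ID in it, rather than issuing one query per model. IDs with no matching row are simply absent from the result.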

Fixed

Security

  • [List any new or updated security-related changes, including vulnerability fixes]

Breaking Changes

  • BREAKING CHANGE: [List any breaking changes affecting compatibility or functionality]

Additional Information

  • Database queries reduced from N (number of models) to 1 batch query
  • Meta field with base64 images excluded from access control queries
  • Significantly faster response times for endpoints with many models

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.

@pr-validator-bot

👋 Welcome and Thank You for Contributing!

We appreciate you taking the time to submit a pull request to Open WebUI!

⚠️ Important: Testing Requirements

We've recently seen an increase in PRs that have significant issues:

  • PRs that don't actually fix the bug they claim to fix
  • PRs that don't implement the feature they describe
  • PRs that break existing functionality
  • PRs that are clearly AI-generated without proper testing being done by the author
  • PRs that simply don't work as intended

These untested PRs consume significant time from maintainers and volunteer contributors who review and test PRs in their free time.
Time that could be spent testing other PRs or improving Open WebUI in other ways.

Before marking your PR as "Ready for Review":

Please explicitly confirm:

  1. ✅ You have personally tested ALL changes in this PR
  2. How you tested it (specific steps you took to verify it works)
  3. Visual evidence where applicable (screenshots or videos showing the feature/fix working) - if applicable to your specific PR

If you're not certain your PR works exactly as intended, please leave it in DRAFT mode until you've thoroughly tested it.

Thank you for helping us maintain quality and respecting the time of our community! 🙏

@Classic298
Collaborator Author

@silentoplayz very general simple testing wanted if this works thx

@silentoplayz
Collaborator

> @silentoplayz very general simple testing wanted if this works thx

Models API Optimization Verification

Changes Verified

  • Database Queries: Reduced from N+1 to ~2 per request by implementing get_models_access_control_by_ids for batch fetching.
  • Access Control: Verified that visibility rules (Owner, Public, Group Shared, User Shared) are maintained despite the optimization.

Verification Process

Two Python scripts were created to verify the changes:

  • tests/bench_models_performance.py: Measures execution time and query count for differing numbers of models (10, 100, 500).
  • tests/verify_models_access.py: Functional test to ensure privacy and sharing settings work as expected.

Results

Performance Benchmark

The benchmark shows that the number of database queries stays constant at 2 regardless of the number of models, confirming that the query count is O(1) in the number of models.

| # Models | Time (s) | Queries Executed |
|---|---|---|
| 10 | 0.0032 | 2 |
| 100 | 0.0017 | 2 |
| 500 | 0.0051 | 2 |

Note: Results obtained from tests/bench_models_performance.py.
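
A query-count measurement of this kind could be reproduced roughly as shown below. This is not the contents of tests/bench_models_performance.py, which are not included in this thread. It is a standalone sketch that assumes a plain sqlite3 database with a hypothetical two-column model table, and it counts SELECT statements through sqlite3's trace callback. It does not attempt to reproduce the timings or the exact count of 2 reported above.

```python
import sqlite3


def count_selects(conn, fn):
    """Run fn() and return how many SELECT statements SQLite executed."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        fn()
    finally:
        conn.set_trace_callback(None)
    return sum(1 for sql in seen if sql.lstrip().upper().startswith("SELECT"))


def load_one_by_one(conn, ids):
    """N queries: one lookup per model ID."""
    return [
        conn.execute("SELECT id, user_id FROM model WHERE id = ?", (i,)).fetchone()
        for i in ids
    ]


def load_batched(conn, ids):
    """One query: all model IDs in a single IN (...) clause."""
    marks = ",".join("?" for _ in ids)
    return conn.execute(
        f"SELECT id, user_id FROM model WHERE id IN ({marks})", ids
    ).fetchall()


def run_benchmark(n):
    """Return (queries for per-model loading, queries for batched loading)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE model (id TEXT PRIMARY KEY, user_id TEXT)")
    ids = [f"m{i}" for i in range(n)]
    conn.executemany("INSERT INTO model VALUES (?, ?)", [(i, "u") for i in ids])
    conn.commit()
    return (
        count_selects(conn, lambda: load_one_by_one(conn, ids)),
        count_selects(conn, lambda: load_batched(conn, ids)),
    )
```

Calling run_benchmark() with 10, 100, and 500 shows the per-model count growing with the number of models while the batched count stays fixed.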

Access Control Verification

The functional tests passed, confirming that:

  • Owners can see their private models.
  • Users cannot see private models of others.
  • Group-shared models are visible to group members.
  • User-shared models are visible to specific users.
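
The four visibility rules listed above can be expressed as a small predicate. The sketch below is an assumption-laden illustration, not the project's access control code: can_read() is a hypothetical helper, and it assumes that access_control set to None means the model is public and that shared readers are listed under read.user_ids and read.group_ids.

```python
from typing import Iterable, Optional


def can_read(
    user_id: str,
    user_group_ids: Iterable[str],
    owner_id: str,
    access_control: Optional[dict],
) -> bool:
    """Return True if the user may see the model (hypothetical rules)."""
    if user_id == owner_id:
        return True  # owners always see their own models
    if access_control is None:
        return True  # assumption: None means public
    read = access_control.get("read", {})
    if user_id in read.get("user_ids", []):
        return True  # shared directly with this user
    # shared with at least one group the user belongs to
    return bool(set(user_group_ids) & set(read.get("group_ids", [])))
```

Because the predicate needs only the owner ID and the access_control value, it can run against lightweight rows without loading the meta field.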

Conclusion

The optimization effectively reduces database load without compromising security or functionality.

@Classic298
Collaborator Author

That was a splendid review! Thank you @silentoplayz

@Classic298 Classic298 marked this pull request as ready for review December 17, 2025 19:33
@Classic298
Collaborator Author

@tjbck

@Classic298 Classic298 requested a review from tjbck December 20, 2025 16:04
tjbck added a commit that referenced this pull request Dec 21, 2025
@tjbck
Contributor

tjbck commented Dec 21, 2025

Closing in favour of 0dd2cfe!

@tjbck tjbck closed this Dec 21, 2025
@Classic298 Classic298 deleted the perf-models-api branch December 21, 2025 11:43
rizkiramadhan2 pushed a commit to rizkiramadhan2/open-webui that referenced this pull request Jan 24, 2026