fix: Use get_index() instead of list_indexes() in has_collection() to…#19238

Merged
tjbck merged 2 commits into open-webui:dev from shargyle:fix/s3vectors-pagination-has-collection
Nov 19, 2025
Conversation

@shargyle
Contributor

@shargyle shargyle commented Nov 17, 2025

fix: Use get_index() instead of list_indexes() in has_collection() to handle pagination

Fixes #19233

Replace list_indexes() pagination scan with direct get_index() lookup
in has_collection() method. The previous implementation only checked
the first ~2,000 indexes due to unhandled pagination, causing RAG
queries to fail for indexes beyond the first page.

Benefits:

  • Handles buckets with any number of indexes (no pagination needed)
  • ~8x faster (0.19s vs 1.53s in testing)
  • Proper exception handling for ResourceNotFoundException
  • Scales to millions of indexes
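
A minimal sketch of the direct-lookup approach described above, not the merged code (a later commit in this PR removed some exception handling to match the existing OWUI code). The function is written as a free function taking a hypothetical `client` argument for illustration. The error-code names are assumptions: `ResourceNotFoundException` is taken from this description, and `NotFoundException` is added as a guess that should be checked against the service's actual responses. The error is inspected through the `response` attribute that botocore's `ClientError` carries, so the sketch itself needs no imports.

```python
# Error codes treated as "index does not exist".
# Assumption: names taken from the PR text plus one guessed variant.
NOT_FOUND_CODES = {"ResourceNotFoundException", "NotFoundException"}


def has_collection(client, bucket_name, index_name):
    """Return True if the index exists, using one direct lookup.

    `client` is assumed to behave like boto3.client("s3vectors").
    """
    try:
        client.get_index(vectorBucketName=bucket_name, indexName=index_name)
        return True
    except Exception as exc:
        # botocore.exceptions.ClientError exposes the service error code here.
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code")
        if code in NOT_FOUND_CODES:
            return False
        # Anything else (permissions, throttling, ...) is a real failure.
        raise
```

Re-raising unexpected errors keeps a permissions or throttling problem from being silently reported as "collection does not exist".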

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.

Before submitting, make sure you've checked the following:

  • Target branch: Verify that the pull request targets the dev branch. Not targeting the dev branch will lead to immediate closure of the PR.
  • Description: Provide a concise description of the changes made in this pull request down below.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Not applicable - this is a bug fix with no user-facing configuration changes.
  • Dependencies: No new dependencies added.
  • Testing: Performed manual testing in production environment with 2,000+ S3 Vectors indexes. Verified that has_collection() now correctly finds indexes beyond position 2,000, and RAG queries work as expected. Tested both successful lookups and ResourceNotFoundException handling.
  • Agentic AI Code: This PR was written with AI assistance but has gone through thorough human review and extensive manual testing in a production environment with 2,000+ indexes.
  • Code review: Self-review performed, ensuring code follows project standards and handles all exception cases properly.
  • Title Prefix: Using "fix:" prefix as this corrects a bug in the S3 Vectors implementation.

Changelog Entry

Description

Fixed pagination bug in S3 Vectors has_collection() method that caused RAG queries to fail when the bucket contained more than ~2,000 indexes. The method now uses direct get_index() lookup instead of scanning list_indexes() results, eliminating pagination issues and improving performance by ~8x.

Added

  • N/A

Changed

  • Modified has_collection() method in backend/open_webui/retrieval/vector/dbs/s3vector.py to use direct get_index() lookup instead of list_indexes() scan
  • Improved exception handling to specifically catch ResourceNotFoundException for non-existent indexes

Deprecated

  • N/A

Removed

  • N/A

Fixed

  • Fixed S3 Vectors has_collection() returning False for indexes beyond position ~2,000 due to unhandled pagination
  • Fixed RAG queries failing with "Collection does not exist" warnings for newly uploaded files when bucket contains 2,000+ total indexes
  • Fixed performance bottleneck caused by scanning entire list of indexes instead of direct lookup

Security

  • N/A

Breaking Changes

  • N/A

Additional Information

Root Cause:
The original implementation used list_indexes(), which returns only ~2,000 indexes by default, along with a nextToken for fetching the next page. Because has_collection() never followed that token, indexes beyond the first page were never found, which broke RAG functionality in production environments with many files.
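
For comparison, a sketch of what a correct list-based check would have needed: follow nextToken until the index is found or the pages run out. This is illustrative only and is not what the PR does. The function name and the `client` argument are hypothetical, and the response keys `indexes` and `indexName` are assumptions about the shape of the list_indexes() response.

```python
def index_exists_by_scan(client, bucket_name, index_name):
    """Scan every page of list_indexes() looking for index_name."""
    kwargs = {"vectorBucketName": bucket_name}
    while True:
        page = client.list_indexes(**kwargs)
        # Assumed response shape: {"indexes": [{"indexName": ...}], "nextToken": ...}
        for summary in page.get("indexes", []):
            if summary.get("indexName") == index_name:
                return True
        token = page.get("nextToken")
        if not token:
            # No further pages: the index is not in the bucket.
            return False
        kwargs["nextToken"] = token
```

Even when paginated correctly, this costs one request per page, which is why a single get_index() call scales better.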

Testing Details:

Successfully reproduced in dev environment with 2,014 total indexes:

Key Finding: list_indexes() returns results in alphabetical order

  • Returns ~1,637 indexes per page (default page size)
  • Indexes beyond position 1,637 require pagination (nextToken)
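
The effect of alphabetical ordering can be illustrated offline, with no AWS calls. The sketch below assumes a plain lexicographic sort and uses the ~1,637 per-page figure from the testing notes as the page size. Neither is a documented service constant, and `page_of` and `file-example` are hypothetical names.

```python
# Per-page count reported in the testing notes; not a documented constant.
PAGE_SIZE = 1637


def page_of(index_name, all_names, page_size=PAGE_SIZE):
    """Return the 1-based page on which index_name lands in a sorted listing."""
    position = sorted(all_names).index(index_name)
    return position // page_size + 1


# 2,000 dummy names starting with digits sort ahead of any "file-*" name.
dummy = [f"0000-test-{i:05d}" for i in range(2000)]
names = dummy + ["file-example"]

print(page_of("0000-test-00000", names))  # prints 1
print(page_of("file-example", names))     # prints 2
```

A check that reads only the first page would therefore never see `file-example`.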

Reproduction Steps:

  1. Create 2,000+ dummy indexes to push real file-* indexes beyond the pagination threshold:

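    import time
    import boto3

    client = boto3.client("s3vectors")
    bucket_name = "my-vector-bucket"  # placeholder: replace with your vector bucket
    timestamp = int(time.time())

    # Use prefix "0000-test-{timestamp}-{number}"
    # Digits sort before letters, pushing file-* indexes to page 2
    for i in range(2000):
        client.create_index(
            vectorBucketName=bucket_name,
            indexName=f"0000-test-{timestamp}-{i:05d}",
            dataType="float32",
            dimension=3072,
            distanceMetric="cosine",
        )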

    Why this works: In a real scenario with 2,000 uploaded files, newer files would similarly be pushed beyond the pagination threshold, replicating production behavior.

  2. Upload a new file via Open WebUI:

    • Navigate to Workspace → Knowledge → Upload Files
    • Upload any test document
    • ✅ File uploads successfully, index created
  3. Attempt to query the file:

    • Create new chat
    • Ask a question about the uploaded file content
    • Result: "No sources found"
  4. Check logs:

    WARNING | Collection 'file-{uuid}' does not exist
    

    Even though AWS CLI confirms the index exists:

    aws s3vectors get-index --index-name file-{uuid} --vector-bucket-name {bucket}
    # Returns: Index found with correct metadata

Results:

  • Before fix: list_indexes() 1.53s, ❌ returns False for indexes beyond position ~1,637, RAG queries fail
  • After fix: get_index() 0.19s, ✅ finds all indexes regardless of position, ~8x faster
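
The ~8x figure follows directly from the two timings. A small sketch of how such a comparison could be taken and the ratio computed is shown below. `best_of` is a hypothetical helper, not part of the PR, and the two durations are the ones reported above.

```python
import time


def best_of(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over several calls to fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Timings reported in this PR (seconds).
before_s = 1.53  # list_indexes() scan
after_s = 0.19   # get_index() lookup

speedup = before_s / after_s
print(f"{speedup:.1f}x")  # prints 8.1x
```

In practice each lookup would be wrapped in a zero-argument callable and passed to `best_of`, so that network jitter affects the comparison less.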

Related Issue: #19233

Screenshots or Videos

N/A - This is a backend bug fix with no UI changes. Testing was performed via:

  1. File upload through Open WebUI interface
  2. Log analysis showing successful index creation and retrieval
  3. RAG query verification showing correct document retrieval
  4. AWS CLI verification of index existence

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

… handle pagination

Fixes open-webui#19233

  Replace list_indexes() pagination scan with direct get_index() lookup
  in has_collection() method. The previous implementation only checked
  the first ~1,000 indexes due to unhandled pagination, causing RAG
  queries to fail for indexes beyond the first page.

  Benefits:
  - Handles buckets with any number of indexes (no pagination needed)
  - ~8x faster (0.19s vs 1.53s in testing)
  - Proper exception handling for ResourceNotFoundException
  - Scales to millions of indexes
@pr-validator-bot

👋 Welcome and Thank You for Contributing!

We appreciate you taking the time to submit a pull request to Open WebUI!

⚠️ Important: Testing Requirements

We've recently seen an increase in PRs that have significant issues:

  • PRs that don't actually fix the bug they claim to fix
  • PRs that don't implement the feature they describe
  • PRs that break existing functionality
  • PRs that are clearly AI-generated without proper testing being done by the author
  • PRs that simply don't work as intended

These untested PRs consume significant time from maintainers and volunteer contributors who review and test PRs in their free time.
Time that could be spent testing other PRs or improving Open WebUI in other ways.

Before marking your PR as "Ready for Review":

Please explicitly confirm:

  1. ✅ You have personally tested ALL changes in this PR
  2. How you tested it (specific steps you took to verify it works)
  3. Visual evidence where applicable (screenshots or videos showing the feature/fix working) - if applicable to your specific PR

If you're not certain your PR works exactly as intended, please leave it in DRAFT mode until you've thoroughly tested it.

Thank you for helping us maintain quality and respecting the time of our community! 🙏


⚠️ WARNING

You are trying to merge to the main branch!

This repository does not allow direct merges to the main branch! Please retarget your PR to the dev branch ASAP or your PR will be closed!

@shargyle shargyle changed the base branch from main to dev November 17, 2025 18:58
@shargyle shargyle marked this pull request as draft November 17, 2025 19:05
@tjbck
Contributor

tjbck commented Nov 17, 2025

@westbrook-ai review wanted here

@westbrook-ai
Contributor

Hey @shargyle, at first glance these changes definitely make sense. Please ping me when the PR is ready for review and I'll do a more detailed test by connecting an Open WebUI instance to S3 Vectors. Thanks!

Unneeded exception handling removed to match original OWUI code
@shargyle
Contributor Author

Validation Results: S3 Vector Embedding Retrieval Fix

Before Fix

Prior to the fix and with 2,000+ indexes present in the S3 vector bucket, the embedding retrieval failed.

Screenshot showing failed embedding retrieval

After Fix

After implementing the fix included in this PR, the same file was uploaded in a new chat, and the embedding
retrieval succeeded.

Screenshot showing successful embedding retrieval

Log Evidence

The logs confirm successful vector search operations:

13:49:27.266 UTC | INFO
Searching collection file-92cf1db1-a574-4c7d-9d23-3b98c6fc6779 with 1 query vectors, limit=10
Source: open_webui.retrieval.vector.dbs.s3vector:search:316

13:49:27.432 UTC | INFO
Search completed. Found results for 1 queries
Source: open_webui.retrieval.vector.dbs.s3vector:search:382

@shargyle shargyle marked this pull request as ready for review November 18, 2025 21:00
@shargyle
Contributor Author

> Hey @shargyle, at first glance these changes definitely make sense. Please ping me when the PR is ready for review and I'll do a more detailed test by connecting an Open WebUI instance to S3 Vectors. Thanks!

@westbrook-ai I just completed testing on our own instance of OWUI with an S3 vector bucket that has more than 2,000 indexes. The change successfully fixed the issue I reported.

@westbrook-ai
Contributor

Amazing, thanks - I should be able to set up a test within the next couple days, will report back if I find any issues.

@westbrook-ai
Contributor

@tjbck confirmed that this PR is good to go, LGTM. Thank you for the fix @shargyle!

@tjbck
Contributor

tjbck commented Nov 19, 2025

Thanks everyone!

@tjbck tjbck merged commit 720af63 into open-webui:dev Nov 19, 2025
abedrg added a commit to abedrg/open-webui that referenced this pull request Nov 24, 2025
puffinjiang pushed a commit to puffinjiang/open-webui that referenced this pull request Dec 9, 2025

Development

Successfully merging this pull request may close these issues.

issue: S3 Vectors has_collection() fails for indexes beyond first page due to missing pagination

4 participants