Optimize search performance for large libraries #3476
Merged
OzzieIsaacs merged 2 commits into janeczku:Develop on Dec 6, 2025
This commit introduces several performance optimizations to the search
system, significantly reducing query times for large Calibre libraries.
Key improvements:
1. FTS5 Integration
- Added FTS5 full-text search support with automatic fallback
- Uses indexed search when available, providing sub-second results
- Gracefully degrades to traditional search if FTS5 is unavailable
2. Query Optimization
- Replaced expensive .any() subqueries with efficient JOIN-based
subqueries for tags, series, authors, and publishers
- Reduced SQL complexity and improved query planning
- Added selectinload() for authors to prevent N+1 query problems
3. LIMIT+1 Pattern
- Implemented LIMIT+1 estimation pattern in get_search_results()
- Avoids expensive COUNT(*) operations on large result sets
- Provides fast pagination without sacrificing accuracy
4. Author Ordering Optimization
- Replaced nested database queries with O(1) dictionary lookups
- Eliminated N+1 query anti-pattern in order_authors()
- Reduced author sorting from O(n²) to O(n) complexity
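
As an illustration of item 2, the difference between the two query shapes can be shown in plain SQL. This is a minimal sketch, not the PR's code: the three-table schema is a simplified, assumed stand-in for the Calibre tables, and the SQL is written by hand rather than generated by SQLAlchemy.

```python
import sqlite3

# Simplified, assumed schema: books, tags, and a link table between them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books_tags_link (book INTEGER, tag INTEGER);
INSERT INTO books VALUES (1, 'Dune'), (2, 'Emma'), (3, 'Neuromancer');
INSERT INTO tags VALUES (1, 'Science Fiction'), (2, 'Romance');
INSERT INTO books_tags_link VALUES (1, 1), (2, 2), (3, 1);
""")

# Shape that a relationship .any() filter compiles to:
# a correlated EXISTS that refers back to the outer book row.
exists_sql = """
SELECT b.id FROM books b
WHERE EXISTS (
    SELECT 1 FROM books_tags_link l JOIN tags t ON t.id = l.tag
    WHERE l.book = b.id AND t.name LIKE ?
)
ORDER BY b.id
"""

# JOIN-based alternative: an uncorrelated subquery collects the matching
# book ids once, and the outer query filters with id IN (...).
in_sql = """
SELECT b.id FROM books b
WHERE b.id IN (
    SELECT l.book FROM books_tags_link l JOIN tags t ON t.id = l.tag
    WHERE t.name LIKE ?
)
ORDER BY b.id
"""

pattern = "%fiction%"
exists_ids = [row[0] for row in conn.execute(exists_sql, (pattern,))]
in_ids = [row[0] for row in conn.execute(in_sql, (pattern,))]
```

Both statements return the same book ids; which one the planner executes faster depends on the data and indexes, so the speedup reported here should be read as specific to the tested library.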
Performance Impact:
In testing with a library of 129,000+ books, these optimizations reduced
search times from 3-9 seconds to 85-330ms, an 89-97% improvement
across different search types.
The changes maintain backward compatibility and include fallbacks for
environments without FTS5 support.
- Add FTS5 table existence check to avoid log spam on non-FTS databases
- Escape FTS5 special characters (quotes) to prevent query errors
- Wrap FTS5 search terms in quotes for phrase matching accuracy
- Improve logging: change author ordering debug to warning for visibility
- Add comment explaining author_sort data issues

These changes improve robustness and security without affecting performance.
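
The quoting step in this commit could look roughly like the following. The helper name `to_fts5_phrase` is hypothetical and not taken from the PR; the only assumption it relies on is the FTS5 rule that a double quote inside a quoted string is escaped by doubling it.

```python
def to_fts5_phrase(term):
    """Wrap a user-supplied term in double quotes for an FTS5 MATCH query.

    Embedded double quotes are doubled so they cannot end the phrase early
    and turn the rest of the input into query syntax.
    """
    return '"' + term.replace('"', '""') + '"'


plain = to_fts5_phrase("dune messiah")
quoted = to_fts5_phrase('say "hi"')
```

The resulting string is meant to be passed as a bound parameter to `MATCH ?`, not concatenated into the SQL text.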
Contributor
Author
Related: #3468

I ran into the following issue:
Collaborator
The PR has 2 big issues, related to each other:
Collaborator
Contributor
@OzzieIsaacs part of this code needs to be reverted. Do I have the green light to open a PR to revert it?

Summary
This PR introduces significant performance improvements to Calibre-Web's search functionality, particularly beneficial for large libraries. Search times are reduced from 3-9 seconds to under 330ms in most cases, representing an 89-97% performance improvement.
Problem Statement
Users with large Calibre libraries (100,000+ books) experience slow search operations due to:
- query.count() calls generating deeply nested SQL
- .any() filters with OR conditions across multiple relationships
Solution Overview
1. FTS5 Full-Text Search Integration
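A minimal sketch of the "use the index when available, otherwise fall back" behavior, using only the standard `sqlite3` module. The table name `books_fts`, the `books` schema, and both function names are assumptions made for this illustration and do not come from the PR.

```python
import sqlite3


def has_fts_table(conn, name="books_fts"):
    """Return True if a table called `name` exists in this database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def search_titles(conn, term):
    """Use the FTS5 index when it exists, otherwise fall back to LIKE."""
    if has_fts_table(conn):
        try:
            # Quote the term so it is matched as a phrase, not parsed as syntax.
            phrase = '"' + term.replace('"', '""') + '"'
            rows = conn.execute(
                "SELECT rowid FROM books_fts WHERE books_fts MATCH ? ORDER BY rowid",
                (phrase,),
            ).fetchall()
            return [r[0] for r in rows]
        except sqlite3.OperationalError:
            pass  # FTS5 module missing or index unusable: degrade below
    rows = conn.execute(
        "SELECT id FROM books WHERE title LIKE ? ORDER BY id", (f"%{term}%",)
    ).fetchall()
    return [r[0] for r in rows]


def build_demo_db(with_fts):
    """Create a tiny in-memory library, optionally with an FTS5 index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO books (id, title) VALUES (?, ?)",
        [(1, "Dune"), (2, "Dune Messiah"), (3, "Emma")],
    )
    if with_fts:
        try:
            conn.execute("CREATE VIRTUAL TABLE books_fts USING fts5(title)")
            conn.execute(
                "INSERT INTO books_fts (rowid, title) SELECT id, title FROM books"
            )
        except sqlite3.OperationalError:
            pass  # this SQLite build was compiled without FTS5
    return conn
```

Checking for the table first avoids raising and logging an error on every search against a database that has no index. Note that FTS5 matches whole tokens while LIKE matches substrings, so the two paths can return different rows for partial words.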
2. Optimized Subqueries
- Replaced .any() subqueries with JOIN-based subqueries
- .any() generates expensive EXISTS clauses; JOINs are more efficient
- Uses the Books.id.in_(subquery) pattern for tags, series, authors, publishers
3. Eager Loading with selectinload()
- Added .options(selectinload(Books.authors)) to the base query
4. LIMIT+1 Estimation Pattern
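The LIMIT+1 idea can be sketched as follows: request one row more than the page size and use the presence of that extra row to decide whether a next page exists, instead of running a separate COUNT(*). The function `fetch_page` and the `books` table are hypothetical stand-ins for this example, not the PR's `get_search_results()`.

```python
import sqlite3


def fetch_page(conn, page, page_size):
    """Return (rows, has_more) for a 1-based page number without COUNT(*)."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT id, title FROM books ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, offset),  # ask for one extra row
    ).fetchall()
    has_more = len(rows) > page_size  # the extra row signals a next page
    return rows[:page_size], has_more


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO books (id, title) VALUES (?, ?)",
    [(i, f"Book {i}") for i in range(1, 6)],
)

first_rows, first_has_more = fetch_page(conn, page=1, page_size=2)
last_rows, last_has_more = fetch_page(conn, page=3, page_size=2)
```

The trade-off is that this only answers "is there another page?"; it does not give an exact total, so any displayed result count has to be an estimate or be computed some other way.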
5. Dictionary-Based Author Ordering
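A rough sketch of dictionary-based ordering, assuming each book carries an `author_sort` string with sort names joined by `&` and a list of author objects that is already loaded. The `Author` class and the function name `order_by_author_sort` are hypothetical; the real `order_authors()` may differ in signature and edge-case handling.

```python
from dataclasses import dataclass


@dataclass
class Author:
    name: str
    sort: str


def order_by_author_sort(authors, author_sort):
    """Reorder already-loaded authors to follow the book's author_sort string."""
    # One pass builds the lookup table; each later lookup is O(1)
    # instead of a database query per author.
    by_sort = {a.sort: a for a in authors}
    ordered = []
    placed = set()
    for key in author_sort.split("&"):
        author = by_sort.get(key.strip())
        if author is not None and id(author) not in placed:
            ordered.append(author)
            placed.add(id(author))
    # author_sort data can be incomplete or inconsistent; keep any authors
    # it does not mention, in their original order, rather than dropping them.
    ordered.extend(a for a in authors if id(a) not in placed)
    return ordered


authors = [
    Author("Terry Pratchett", "Pratchett, Terry"),
    Author("Neil Gaiman", "Gaiman, Neil"),
]
ordered = order_by_author_sort(authors, "Gaiman, Neil & Pratchett, Terry")
```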
Performance Results
Testing environment:
Average improvement: 95% reduction in search time
Compatibility
Backward Compatibility
Requirements
- selectinload support (standard in supported versions)
Testing Recommendations
Manual Testing
Related Issues
This addresses performance concerns raised by users with large libraries, particularly those with 50,000+ books where search becomes unusably slow.
Checklist
Additional Notes
The FTS5 integration assumes the standard Calibre FTS table structure. If users have custom FTS configurations, the fallback will handle those cases gracefully.
These optimizations are particularly impactful for:
The changes do not modify the search result quality or relevance - only the performance characteristics.