WIP - (#6031) - faster IDB changes() using getAll() #6031
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I've been trying to find ways to improve the performance of secondary indexes. Obviously we'd rather go with native secondary indexes, but that may be a ways off, and I think there are ways we can at least stop the bleeding for current slow paths in secondary indexes.
From profiling, I noticed that a large amount of time is spent using
IDBCursors in our IDBchanges()implementation. Cursors are slow because we effectively do a stair-step pattern pulling in allchanges()(e.g. the first 50 changes afterseq2), applying filters and limits and checkingdoc_idsand checking that the document is the winning doc and checking that the sequence is the winning sequence, etc. We do this using cursors because the filters often need to be applied in JavaScript and then checked against thelimitto ensure we return exactly the right results in exactly the right order.However, for the vast majority of cases (and for the cases that affect secondary indexes), the
changes()queries are rather simple, and can essentially be expressed as "fetch all documents withdoc.seq >= seqand with limitlimitfor the givenseqandlimit". If none of the documents have laterseqs (e.g. in the rare case of a non-winning document being replicated after a winning document), then this query is all we need, and can get done with two simple IDB operations:getAll()andgetAllKeys()(for browsers that support it, currently Chrome and Firefox but Safari is implementing it as well and it's under consideration for Edge).So my technique is to write a separate
changes()implementation that uses the "fast" strategy, and then we fall back to the old "slow" strategy whenever necessary. Running the mapreduce and integration tests, I found that the fast path was hit 6587 times and the slow path was hit 545 times, so 92% of the time we're using the fast option (in Firefox/Chrome anyway).Furthermore this has a huge impact on our perf. Using the
temp-viewstest (which is a good proxy for intense secondary index usage), I got the following improvement (10 iterations):