IncrementalIDB - MegaChunking #874

Merged: techfort merged 11 commits into techfort:master from Nozbe:megachunking on Jan 22, 2021

radex commented Dec 10, 2020

(WIP, please don't review the code yet)

Yes, it's me again, with yet another pull request full of strange, complicated code — and another promise that it's worth it for performance 🙃

I think I'm approaching the limits of what IndexedDB can do performance-wise, but it's important for my use case to squeeze all that's possible out of it ;)

TL;DR: It loads the database 22% faster ;)

I made a picture to explain the problem that this PR is trying to solve:

[Image: screenshot taken 2020-12-10 at 10:14:35]

IndexedDB is implemented (in all browsers as far as I can tell, but certainly in Chrome and Safari) with a multi-process architecture, and the cross-process communication is not very efficient. This can be seen above - waiting for IDB to fetch data from disk takes relatively little time, and most of the time is spent waiting for the XPC dance to complete transferring data -- and clearly, it's not very well tuned, as the CPU usage in the browser process is very low.

So the goal is to:

  • layer enough work at the same time that CPU utilization stays high
  • reduce the initial wait for IDB when no work happens on main thread
  • take better advantage of the concurrency opportunity, and try to keep both main/browser and IDB processes busy at the same time.

This is what I achieved:

[Image: screenshot taken 2020-12-10 at 9:59:41]

This achieves a 22% improvement on my benchmark, and likely more free performance for apps that didn't opt to manually tune IncrementalIDB by supplying serializeChunk/deserializeChunk.

Instead of calling IDBObjectStore.getAll() once, I'm fetching multiple megachunks (chunks of chunks 🙃) - currently 20 requests using adjacent IDBKeyRanges. AFAICT, the IDB process in both Safari and Chrome does the first phase (actual disk/db work) sequentially, so there's no win here, but the XPC is more efficient for some reason. I guess since the IDB process sends more messages to the browser process, there are fewer gaps in processing them on the browser side, so CPU utilization stays higher.
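
To make "adjacent IDBKeyRanges" concrete, here is a minimal sketch of how a sorted list of chunk keys could be cut into adjacent, non-overlapping ranges, one per megachunk. This is an illustration only, not the code from this PR: the helper name `splitIntoMegachunkRanges`, the key format and the way the split is computed are all assumptions. It returns plain `{ lower, upper }` objects so it runs outside a browser.

```javascript
// Hypothetical helper (not from the PR): split a sorted list of chunk keys
// into at most `count` adjacent, non-overlapping ranges.
function splitIntoMegachunkRanges(sortedKeys, count) {
  const ranges = [];
  // Number of keys covered by each range (the last range may be smaller).
  const size = Math.ceil(sortedKeys.length / count);
  for (let start = 0; start < sortedKeys.length; start += size) {
    const end = Math.min(start + size, sortedKeys.length) - 1;
    ranges.push({ lower: sortedKeys[start], upper: sortedKeys[end] });
  }
  return ranges;
}

// Made-up keys for the demo; the real key format may differ.
const keys = Array.from({ length: 10 }, (_, i) => "chunk." + String(i).padStart(4, "0"));
const ranges = splitIntoMegachunkRanges(keys, 4);
console.log(ranges);
```

In a browser, each range would then become one request, roughly `store.getAll(IDBKeyRange.bound(range.lower, range.upper))`, where both bounds are inclusive by default.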

In a further improvement (I call this megachunk interleaving), I only request the first half of the megachunks initially, and then in the onsuccess of each one I request the (i+n/2)th megachunk. This reduces the initial wait for IDB to almost nothing, and improves concurrency, as the IDB process is kept busy while JS is processing the first half of its work. (I also moved most of the chunk processing - JSON.parse and the optional deserializeChunk - from the end of the process to much earlier, into each megachunk's onsuccess, so that the main and IDB processes can be kept busy at the same time… I think this should also reduce GC pressure a little bit, but I haven't yet figured out a good technique for measuring that, since it's very noisy.)
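
The interleaving scheme can be sketched roughly as follows. Again, this is a simplified illustration under stated assumptions, not the adapter's actual code: `loadInterleaved`, `requestMegachunk` and `processMegachunk` are hypothetical names, `requestMegachunk(i, onSuccess)` stands in for `store.getAll(range_i)` plus its onsuccess handler, and issuing the paired request before parsing is a choice made for this sketch. A fake in-memory request queue replaces IndexedDB so the example runs anywhere.

```javascript
// Hypothetical scheduler: request the first half of the megachunks up front,
// then request megachunk i + half from the success callback of megachunk i.
function loadInterleaved(n, requestMegachunk, processMegachunk, onDone) {
  const half = Math.ceil(n / 2);
  const results = new Array(n);
  let remaining = n;

  function request(i) {
    requestMegachunk(i, (rawChunks) => {
      // Queue the paired request first so the IDB side has work to do...
      if (i < half && i + half < n) request(i + half);
      // ...while this thread parses the data that just arrived.
      results[i] = processMegachunk(rawChunks);
      remaining -= 1;
      if (remaining === 0) onDone(results);
    });
  }

  for (let i = 0; i < half; i++) request(i);
  if (n === 0) onDone(results);
}

// Fake request queue standing in for IndexedDB (an assumption for the demo).
const issued = [];
const pending = [];
const fakeRequest = (i, onSuccess) => {
  issued.push(i);
  pending.push(() => onSuccess(['{"id":' + i + '}']));
};

let loaded = null;
loadInterleaved(6, fakeRequest, (raw) => raw.map((s) => JSON.parse(s)), (r) => { loaded = r; });
const issuedUpFront = issued.length; // only the first half has been requested so far

// Deliver responses in the order they were requested.
while (pending.length > 0) pending.shift()();
console.log(issuedUpFront, issued, loaded.length);
```

Parsing inside each success callback, rather than after everything has loaded, is what lets the two sides overlap: the second half of the requests is in flight while the first half is being parsed.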

I'm almost out of ideas for further improvements for now, and the law of diminishing returns is catching up to me, so it'll probably be the last PR in the series for a while...

PS. In case you were wondering about using IDBCursor to maximize concurrency opportunity — I tried that multiple times, and it doesn't work. I tried interleaving multiple IDBCursors, and I got to nearly the same performance as interleaved megachunking, but still slower. There are just too many useless pauses on main thread...

radex marked this pull request as ready for review on January 20, 2021, 10:23
radex changed the title from "[WIP] IncrementalIDB - MegaChunking" to "IncrementalIDB - MegaChunking" on Jan 20, 2021

radex commented Jan 20, 2021

@techfort We've been running this internally for a while, found no issues so far

techfort commented

@radex this looks fantastic, i'm merging and sometime today i'll get round to doing a new release. I should really automate this release crap
