Normalize filesystem operations when syncing from memory fs to OPFS#2252
Normalize filesystem operations when syncing from memory fs to OPFS#2252
Conversation
…ps to a single WRITE
…single WRITE (file)
There was a problem hiding this comment.
Pull Request Overview
This PR normalizes filesystem operations when syncing from the memory fs to OPFS by leveraging an existing normalization function and aligning behavior through updated tests and logic.
- The flushJournal function now applies normalizeFilesystemOperations to optimize processing.
- Comprehensive tests have been added to cover various scenarios involving CREATE, WRITE, RENAME, and DELETE operations.
- The normalization logic in fs-journal.ts has been refined to handle DELETE redundancies more explicitly.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/php-wasm/web/src/lib/directory-handle-mount.ts | Updates flushJournal to use normalized journal entries for OPFS syncing and includes concurrency considerations. |
| packages/php-wasm/fs-journal/src/test/fs-journal.spec.ts | Adds tests ensuring normalizeFilesystemOperations covers multiple file operation scenarios. |
| packages/php-wasm/fs-journal/src/lib/fs-journal.ts | Adjusts normalization logic and comments to clarify DELETE redundancy handling following a CREATE/WRITE operation. |
Comments suppressed due to low confidence (1)
packages/php-wasm/fs-journal/src/lib/fs-journal.ts:457
- Consider clarifying in the comment that the DELETE operation is considered redundant only for nodes created in the current journal, ensuring the intent is clear.
if (former.operation === 'CREATE') {
Co-authored-by: Copilot <[email protected]>
| // Concurrency safety note | ||
| // As I understand it, journal is specific to a PHP instance, | ||
| // so it's not possible to have concurrency push of entries to journal | ||
| // But this can change in future so it doesn't hurt to read from journal | ||
| // in a concurrent safe way, which is what we are doing here. | ||
|
|
||
| // We first copy it to a new array | ||
| const journalEntries = [...journal]; | ||
| // and then only delete however many entries we were able to grab | ||
| // since with concurrent writes there could have been more insertions | ||
| journal.splice(0, journalEntries.length); |
There was a problem hiding this comment.
Good call! Let's just talk about asynchronous writers, not concurrent writers. I got confused for a second and wrote a big reply about how JS is single-threaded, and only then I realized how this change works 🙈
There was a problem hiding this comment.
@adamziel I indeed meant concurrent writes in this context since web workers offer true concurrency, so I believe my comment reflects the intent accurately. I will wait for you to respond before merging this PR :)
There was a problem hiding this comment.
Aha! journal is a regular variable and it can only be accessed and modified from the same thread. We should never see conflicting unshift() calls.
As for concurrent filesystem writes - I don't think the journaling approach will generalize to concurrent rw access.
We'd need a live two-way sync between the source of truth (e.g. opfs) and memfs. It's possible, but the complexity would be really bad. We'd effectively turn Playground into a distributed system. I'd rather integrate OPFS directly via e dedicated Emscripten filesystem backen, in which case there would be no journal, or implement a SharedArrayBuffer MEMFS with a single OPFS writer.
There was a problem hiding this comment.
Gotcha! I will leave the comment as is then & merge it.
This PR supersedes https://github.com/Automattic/wordpress-playground-private/pull/126
Motivation for the change, related issues
When using OPFS, the memory fs sync to OPFS doesn't normalize filesystem operations, even though there is a normalizing file system operations function already available. In a previous attempt, a new journal compression function was added that followed the following rules but partitioned around RENAME operations:
While reconciling the logic of both functions, I discovered the already existing function handles pretty much all rules that were implemented (and some more), so in this PR, I have extended the tests to ensure the expected behavior out of normalizeFilesystemOperations()
A test case was found which was failing and the fix for that in the logic was implemented.
And now OPFSRewriter normalizes fs operations before executing them to keep OPFS in sync.
Improvements in journal size that's processed (screenshot from earlier PR):