Skip to content

Normalize filesystem operations when syncing from memory fs to OPFS#2252

Merged
ashfame merged 9 commits intotrunkfrom
journal_compression_opfs
Jun 16, 2025
Merged

Normalize filesystem operations when syncing from memory fs to OPFS#2252
ashfame merged 9 commits intotrunkfrom
journal_compression_opfs

Conversation

@ashfame
Copy link
Member

@ashfame ashfame commented Jun 11, 2025

This PR supersedes https://github.com/Automattic/wordpress-playground-private/pull/126

Motivation for the change, related issues

When using OPFS, the memory fs sync to OPFS doesn't normalize filesystem operations, even though there is a normalizing file system operations function already available. In a previous attempt, a new journal compression function was added that followed the following rules but partitioned around RENAME operations:

  • Multiple WRITE operations of the same file (at end of request, writing once is enough)
  • CREATE operation before WRITE operation (no need to CREATE separately)
  • DELETE operation after CREATE/WRITE operation (no need to handle CREATE/WRITE in these cases)

While reconciling the logic of both functions, I discovered the already existing function handles pretty much all rules that were implemented (and some more), so in this PR, I have extended the tests to ensure the expected behavior out of normalizeFilesystemOperations()

A test case was found which was failing and the fix for that in the logic was implemented.
And now OPFSRewriter normalizes fs operations before executing them to keep OPFS in sync.

Improvements in journal size that's processed (screenshot from earlier PR):

438687809-b4c6c627-7742-41c1-8e82-d9f2d06c9248

@ashfame ashfame requested a review from adamziel June 11, 2025 14:12
@ashfame ashfame marked this pull request as ready for review June 11, 2025 14:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR normalizes filesystem operations when syncing from the memory fs to OPFS by leveraging an existing normalization function and aligning behavior through updated tests and logic.

  • The flushJournal function now applies normalizeFilesystemOperations to optimize processing.
  • Comprehensive tests have been added to cover various scenarios involving CREATE, WRITE, RENAME, and DELETE operations.
  • The normalization logic in fs-journal.ts has been refined to handle DELETE redundancies more explicitly.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
packages/php-wasm/web/src/lib/directory-handle-mount.ts Updates flushJournal to use normalized journal entries for OPFS syncing and includes concurrency considerations.
packages/php-wasm/fs-journal/src/test/fs-journal.spec.ts Adds tests ensuring normalizeFilesystemOperations covers multiple file operation scenarios.
packages/php-wasm/fs-journal/src/lib/fs-journal.ts Adjusts normalization logic and comments to clarify DELETE redundancy handling following a CREATE/WRITE operation.
Comments suppressed due to low confidence (1)

packages/php-wasm/fs-journal/src/lib/fs-journal.ts:457

  • Consider clarifying in the comment that the DELETE operation is considered redundant only for nodes created in the current journal, ensuring the intent is clear.
if (former.operation === 'CREATE') {

Comment on lines +292 to +302
// Concurrency safety note
// As I understand it, journal is specific to a PHP instance,
// so it's not possible to have concurrency push of entries to journal
// But this can change in future so it doesn't hurt to read from journal
// in a concurrent safe way, which is what we are doing here.

// We first copy it to a new array
const journalEntries = [...journal];
// and then only delete however many entries we were able to grab
// since with concurrent writes there could have been more insertions
journal.splice(0, journalEntries.length);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Let's just talk about asynchronous writers, not concurrent writers. I got confused for a second and wrote a big reply about how JS is single-threaded, and only then I realized how this change works 🙈

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamziel I indeed meant concurrent writes in this context since web workers offer true concurrency, so I believe my comment reflects the intent accurately. I will wait for you to respond before merging this PR :)

Copy link
Collaborator

@adamziel adamziel Jun 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! journal is a regular variable and it can only be accessed and modified from the same thread. We should never see conflicting unshift() calls.

As for concurrent filesystem writes - I don't think the journaling approach will generalize to concurrent rw access.

We'd need a live two-way sync between the source of truth (e.g. opfs) and memfs. It's possible, but the complexity would be really bad. We'd effectively turn Playground into a distributed system. I'd rather integrate OPFS directly via e dedicated Emscripten filesystem backen, in which case there would be no journal, or implement a SharedArrayBuffer MEMFS with a single OPFS writer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! I will leave the comment as is then & merge it.

@github-project-automation github-project-automation bot moved this from Inbox to Reviewed in Playground Board Jun 13, 2025
@ashfame ashfame merged commit 8f01a29 into trunk Jun 16, 2025
24 checks passed
@ashfame ashfame deleted the journal_compression_opfs branch June 16, 2025 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

No open projects
Status: Reviewed

Development

Successfully merging this pull request may close these issues.

2 participants