
encoding: Add "reopen with encoding" #46553

Merged
ConradIrwin merged 6 commits into zed-industries:main from tomopumipumi:feature/reopen-with-encoding on Jan 27, 2026


@tomopumipumi
Contributor

Add "Reopen with Encoding" feature (Local/Single user)

Summary

This PR adds a "Reopen with Encoding" feature that lets users explicitly specify an encoding and reload the active buffer, resolving garbled text caused by incorrect encoding detection.

Changes

  1. Added encoding picker logic to encoding_selector
  • Implemented a modal UI accessible via the command palette, shortcuts, or by clicking the encoding status in the status bar.
  • Allows users to select from a list of supported encodings (Shift JIS, EUC-JP, UTF-16LE, etc.).
  2. Updated Buffer logic (crates/language)
  • Added a force_encoding_on_next_reload flag to the Buffer struct.
  • Updated the reload method to check this flag and apply the following logic:
    • Non-Unicode (e.g., Shift JIS): Bypasses heuristics (like BOM checks) to force the specified encoding.
    • Unicode (e.g., UTF-8): Performs standard BOM detection. This ensures that the BOM is correctly handled/consumed when switching back to UTF-8.
  3. UI / Keymap
  • Made the encoding status in the status bar (ActiveBufferEncoding) clickable.
  • Added default keybindings:
    • macOS: cmd-k n
    • Linux/Windows: ctrl-k n
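
The reload rule described in item 2 can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in crates/language: `Encoding`, `DecodePlan`, `sniff_bom`, and `plan_reload` are hypothetical names, and the sketch assumes that when a Unicode encoding is requested, a detected BOM selects the effective encoding and is consumed.

```rust
/// Hypothetical stand-in for the editor's encoding type (not Zed's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Utf8,
    Utf16Le,
    Utf16Be,
    ShiftJis,
    EucJp,
}

impl Encoding {
    /// True for the encodings that may legitimately start with a BOM.
    pub fn is_unicode(self) -> bool {
        matches!(self, Encoding::Utf8 | Encoding::Utf16Le | Encoding::Utf16Be)
    }
}

/// What the decoder should do with the raw bytes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct DecodePlan {
    pub encoding: Encoding,
    pub has_bom: bool,
    /// Number of leading bytes to skip before decoding.
    pub skip: usize,
}

/// Recognize the UTF-8, UTF-16LE, and UTF-16BE byte order marks.
pub fn sniff_bom(bytes: &[u8]) -> Option<(Encoding, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((Encoding::Utf8, 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some((Encoding::Utf16Le, 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some((Encoding::Utf16Be, 2))
    } else {
        None
    }
}

/// Non-Unicode requests are forced as-is; Unicode requests go through BOM detection.
pub fn plan_reload(requested: Encoding, bytes: &[u8]) -> DecodePlan {
    if requested.is_unicode() {
        // Assumption: a BOM, when present, wins and is consumed.
        if let Some((encoding, skip)) = sniff_bom(bytes) {
            return DecodePlan { encoding, has_bom: true, skip };
        }
    }
    // Forced path: no heuristics, so any BOM bytes are decoded as ordinary content.
    DecodePlan { encoding: requested, has_bom: false, skip: 0 }
}
```

Under these assumptions, requesting Shift JIS for a file that starts with `EF BB BF` leaves `skip` at 0, which is why the BOM bytes show up as mojibake in Cases 3 and 4 of the test plan, while requesting UTF-8 for the same file skips the three BOM bytes.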

Limitations & Scope

To ensure stability and keep the PR focused, the following scenarios are intentionally out of scope:

  1. Collaboration and Remote Connections
  • Encoding changes are disabled when collaboration (is_shared) or SSH remote connections (is_via_remote_server) are active.
  • Reason: Synchronizing encoding state changes between host/guest or handling remote reloads involves complex synchronization logic. This PR focuses on local files only.

Remote Connection (SSH/WSL)

[Screenshots: remote_tooltip (via status bar), remote_shortcut (via shortcut/command)]

Collaboration Session

[Screenshots: collab_tooltip (via status bar), collab_pop (via shortcut/command)]
  2. Dirty State
  • The feature is disabled if the buffer has unsaved changes to prevent data loss during reload.
[Screenshots: local_dirty_tooltip (via status bar), local_dirty_pop (via shortcut/command)]
  3. Files detected as Binary

Files that worktree detects as "binary" (e.g., UTF-16 files without BOM containing non-ASCII characters) are not opened in the editor, so this feature cannot be triggered.
Future Work: Fixing this would require modifying crates/worktree heuristics or exposing a "Force Open as Text" action for InvalidItemView to trigger. Given the scope and impact, this is deferred to a future PR.
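
The first two restrictions above amount to a guard that runs before any reload is attempted. A minimal sketch, assuming a plain snapshot struct: `is_shared` and `is_via_remote_server` are the names used in this description, while `BufferState`, `Blocked`, `check_reopen_allowed`, and the order of the checks are hypothetical.

```rust
/// Hypothetical snapshot of the state checked before reopening with an encoding.
#[derive(Debug, Clone, Copy, Default)]
pub struct BufferState {
    pub is_shared: bool,
    pub is_via_remote_server: bool,
    pub is_dirty: bool,
}

/// Why the action was refused (hypothetical type, used to pick a tooltip or message).
#[derive(Debug, PartialEq, Eq)]
pub enum Blocked {
    Collaboration,
    RemoteServer,
    UnsavedChanges,
}

/// Returns Ok(()) only for a local, unshared buffer without unsaved changes.
pub fn check_reopen_allowed(state: BufferState) -> Result<(), Blocked> {
    if state.is_shared {
        return Err(Blocked::Collaboration);
    }
    if state.is_via_remote_server {
        return Err(Blocked::RemoteServer);
    }
    if state.is_dirty {
        // Reloading would discard the unsaved edits.
        return Err(Blocked::UnsavedChanges);
    }
    Ok(())
}
```

Files detected as binary never reach a check like this, because they are not opened as an editable buffer in the first place.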

Test Plan

I verified the feature and BOM handling using the following scenarios:

Preparation

Used the following test files: test_utf8.txt, test_shiftjis_jp.txt, test_utf8_bom.txt, and test_utf8_jp_bom.txt.

Used an external editor (VS Code or Notepad) for verification.

Case 1: English-only file behavior

  1. Open an English-only UTF-8 file (test_utf8.txt).
  2. Reopen as Shift JIS.
  3. Result:
  • Text appearance remains unchanged (since ASCII is compatible).
  • Status bar updates to "Shift JIS".

Case 2: Fixing Mojibake

  1. Open a Shift-JIS file (test_shiftjis_jp.txt) that causes detection failure.
    Note: Confirm that it opens with mojibake.
  2. Select Shift JIS from the status bar selector.
  3. Result:
  • Mojibake is resolved, and Japanese text is displayed correctly.
  • Status bar updates to "Shift JIS".

Case 3: Unicode file with BOM behavior

  1. Open an English-only UTF-8 with BOM file (test_utf8_bom.txt).
  2. Reopen as Shift JIS.
  3. Result:
  • The BOM bytes are displayed as mojibake at the beginning of the file.
  • The rest of the English text is displayed normally (ASCII compatibility).
  • Status bar updates to "Shift JIS".

Case 4: Non-ASCII file with BOM behavior

  1. Open a UTF-8 with BOM file containing Japanese (test_utf8_jp_bom.txt).
  2. Reopen as Shift JIS.
  3. Result:
  • The BOM bytes at the start are displayed as mojibake.
  • The Japanese text body is displayed as mojibake (UTF-8 bytes interpreted as Shift JIS).
  • Status bar updates to "Shift JIS" (no BOM indicator).

Case 5: Revert to Unicode

  1. From the state in Case 4 (Shift JIS with mojibake), reopen as UTF-8.
  2. Result:
  • The BOM mojibake at the start disappears (consumed).
  • The text returns to normal.
  • Status bar updates to "UTF-8 (BOM)".

Case 6: External BOM removal (State sync)

  1. Open a UTF-8 with BOM file in Zed (test_utf8_bom.txt).
  2. Open the same file in an external editor and save it as UTF-8 (No BOM).
  3. Refocus Zed.
  4. Result:
  • Text appearance remains unchanged.
  • The (BOM) indicator disappears from the status bar.
  • Saving in Zed and checking externally confirms the BOM is gone.

Case 7: External BOM addition

  1. From the state in Case 6 (UTF-8 No BOM), save as UTF-8 with BOM in the external editor.
  2. Refocus Zed.
  3. Result:
  • The (BOM) indicator appears in the status bar.
  • Saving in Zed and checking externally confirms the BOM is present.
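
Cases 6 and 7 rely on the BOM state being re-read on reload and then honored on save. A minimal sketch of that round trip for UTF-8, using only the standard library; `split_utf8_bom` and `encode_utf8_for_save` are hypothetical helper names, not functions from the Zed codebase.

```rust
/// The three-byte UTF-8 byte order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// On load: report whether a UTF-8 BOM is present and return the bytes after it.
pub fn split_utf8_bom(bytes: &[u8]) -> (bool, &[u8]) {
    if bytes.starts_with(&UTF8_BOM) {
        (true, &bytes[UTF8_BOM.len()..])
    } else {
        (false, bytes)
    }
}

/// On save: write the BOM back only if the last load (or reload) saw one.
pub fn encode_utf8_for_save(text: &str, has_bom: bool) -> Vec<u8> {
    let mut out = Vec::with_capacity(text.len() + UTF8_BOM.len());
    if has_bom {
        out.extend_from_slice(&UTF8_BOM);
    }
    out.extend_from_slice(text.as_bytes());
    out
}
```

If the flag were kept from the first load instead of being refreshed on reload, a save after Case 6 would reintroduce the BOM that the external editor removed.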

Case 8: External Encoding Change (Auto-detect sync)

  1. Open an English-only UTF-8 file in Zed (test_utf8.txt).
    • Status bar shows: "UTF-8".
  2. Open the same file in an external editor and save it as UTF-16LE with BOM.
  3. Refocus Zed.
  4. Result:
    • The text remains readable (no mojibake).
    • Status bar automatically updates to "UTF-16LE (BOM)". (Verifies that buffer.encoding is correctly updated during reload).

Release Notes:

  • Added "Reopen with Encoding" feature (currently supported for local files).

cla-bot added the cla-signed label Jan 11, 2026
tomopumipumi force-pushed the feature/reopen-with-encoding branch from 24e737e to e58d9a1 January 11, 2026 08:27
@ConradIrwin
Member

Nice, thank you!

I am not sure about the API, force_encoding_on_next_reload followed by a reload seems a bit round-about. Can we add reload_with_encoding() instead? (Both reload() and reload_with_encoding() could call the same helper method if you were trying to avoid that).

We also set the encoding before the buffer has actually been reloaded with that encoding. I guess this makes the UI update faster, but it introduces a transient state where the encoding doesn't match what was used to load the content. I'm not sure that will ever actually cause problems, but it'd be nice to avoid it.

tomopumipumi force-pushed the feature/reopen-with-encoding branch from e58d9a1 to 7bcfae8 January 14, 2026 12:21
@tomopumipumi
Contributor Author

Thank you for the feedback! I agree that passing the encoding explicitly is much cleaner than managing a state flag.

I have addressed your suggestions in the latest commit:

  • Removed force_encoding_on_next_reload flag.
  • Introduced reload_with_encoding(encoding, cx) public method.
  • Refactored the internal logic into reload_impl.
  • Fixed the transient state issue: buffer.encoding and has_bom are now updated only after the reload completes successfully in the background task.
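
The shape of that refactor might look roughly like the sketch below. It is a synchronous, std-only simplification: the real `reload_with_encoding(encoding, cx)` takes a context and runs the load in a background task, whereas here the file bytes are passed in directly. The `Encoding` enum, the `decode` helper, and the `String` error type are hypothetical, and decoding is made strict so that a failure path exists to demonstrate.

```rust
/// Hypothetical two-variant encoding type, limited to what std can decode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Utf8,
    Utf16Le,
}

#[derive(Debug)]
pub struct Buffer {
    pub text: String,
    pub encoding: Encoding,
    pub has_bom: bool,
}

/// Decode `bytes` as `encoding`; returns the text and whether a BOM was consumed.
fn decode(bytes: &[u8], encoding: Encoding) -> Result<(String, bool), String> {
    match encoding {
        Encoding::Utf8 => {
            let has_bom = bytes.starts_with(&[0xEF, 0xBB, 0xBF]);
            let body = if has_bom { &bytes[3..] } else { bytes };
            let text = std::str::from_utf8(body).map_err(|e| e.to_string())?;
            Ok((text.to_owned(), has_bom))
        }
        Encoding::Utf16Le => {
            let has_bom = bytes.starts_with(&[0xFF, 0xFE]);
            let body = if has_bom { &bytes[2..] } else { bytes };
            if body.len() % 2 != 0 {
                return Err("odd number of bytes for UTF-16LE".to_owned());
            }
            let units: Vec<u16> = body
                .chunks_exact(2)
                .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
                .collect();
            let text = String::from_utf16(&units).map_err(|e| e.to_string())?;
            Ok((text, has_bom))
        }
    }
}

impl Buffer {
    /// Reload using the encoding the buffer already has.
    pub fn reload(&mut self, bytes: &[u8]) -> Result<(), String> {
        self.reload_impl(bytes, self.encoding)
    }

    /// Reload using an explicitly requested encoding.
    pub fn reload_with_encoding(&mut self, encoding: Encoding, bytes: &[u8]) -> Result<(), String> {
        self.reload_impl(bytes, encoding)
    }

    /// Shared helper: state is only touched after decoding has succeeded.
    fn reload_impl(&mut self, bytes: &[u8], encoding: Encoding) -> Result<(), String> {
        let (text, has_bom) = decode(bytes, encoding)?;
        self.text = text;
        self.encoding = encoding;
        self.has_bom = has_bom;
        Ok(())
    }
}
```

Because `encoding` and `has_bom` are assigned in the same place as `text`, there is no window in which the status bar could report an encoding that was not used to produce the visible content.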

I have also re-verified the changes against the test plan outlined in the PR description to ensure everything works as expected.

@tomopumipumi
Contributor Author

Should I close this PR?

@ConradIrwin
Member

Sorry for the slow response here, and thank you again for this.

I've pushed up support for undo/redo because without it the feature felt a bit broken.

I also think we need to figure out the story for remote development (at least the devcontainer/ssh variety) to really call this feature done. Not sure if that's of interest to you, but shouldn't be too hard (famous last words) – we'll need a way to pass the buffer encoding from the remote down, and then a way to request a reload.

You said in the description that in response to the file on disk changing, zed reloads the file with the new encoding; I didn't see this happening, it kept the existing encoding (though TBH this felt fine as a user – and I can of course use reopen with encoding if I need).

ConradIrwin enabled auto-merge (squash) January 27, 2026 05:24
ConradIrwin merged commit 8e291ec into zed-industries:main Jan 27, 2026
27 checks passed
@tomopumipumi
Contributor Author

Thanks again for the review and merging this!

I completely agree that the remote development story is the missing piece to make this feature truly complete. The path forward seems clear (piping the encoding info through RPC), but I realize this is a step up in complexity compared to the local version and will require more rigorous verification.

I'll give it a shot! That said, since it might take me some time to get the wiring right, if a better implementation comes along from someone else in the meantime (perhaps while I'm stuck in a rabbit hole!), please don't hesitate to move forward with that one lol.

tomopumipumi deleted the feature/reopen-with-encoding branch January 27, 2026 12:21
@ConradIrwin
Member

Thank you! If you have access to Claude Code, it can probably make you an end-to-end test alongside the other remote integration tests (or you can copy that pattern).

The other tip is to ssh to localhost to avoid too much time wasted, but iteration cycles can be slow.
Happy to pair on it if you’d like: https://cal.com/conradirwin/pairing

@tomopumipumi
Contributor Author

Wow, thank you so much for the incredibly generous offer to pair! I am honored.

To be honest, while my async English I/O works fine, my real-time audio processing has high latency and tends to drop packets. So a live pairing session might be a bit challenging for me right now!

I'll start by trying your suggestions ("ssh to localhost" / Claude Code) to navigate the "rabbit hole" at my own pace. I'll reach out on GitHub if I hit a wall.

Thanks again for your support!

naaiyy added a commit to Glass-HQ/Glass that referenced this pull request Feb 16, 2026
Key changes:
- Reopen with encoding (zed-industries#46553) - new encoding selector with reopen support
- Relative line jumps in go-to-line (zed-industries#46932)
- Terminal tab renaming (zed-industries#45800)
- Project search spinner while search underway (zed-industries#47620)
- Git: retain "since" diffs, avoid unwrap in panel, don't rebuild diff on repo change
- Extensions: fix duplicate button IDs preventing uninstall (zed-industries#47745)
- SendKeystrokes: don't use layout key equivalents (zed-industries#47061)
- Collab tests extracted to integration crate (deleted in our fork)
- Various CI, docs, and agent improvements

Conflict resolutions:
- Kept native_button style, adopted extension_button_id() for unique IDs
- Added encoding_selector::init alongside browser::init
- Fixed _cx → cx rename for new encoding selector render code

Co-Authored-By: Claude Opus 4.6 <[email protected]>