fix(input): handle multi-byte UTF-8 characters in comment input (#132) #147
Merged
agavra merged 3 commits into agavra:main from Jan 27, 2026
Problem
Using multi-byte UTF-8 characters (em dashes, CJK, emoji, etc.) in comment input caused panics. The root cause was that cursor operations used byte indices directly, which breaks when characters span multiple bytes.
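The mismatch between byte positions and character positions can be checked directly. The following is a minimal sketch, not code from this PR: `checked_prefix` is a hypothetical helper, and it relies only on the standard `str::get`, which returns `None` where plain slicing (`&s[..idx]`) would panic.

```rust
/// Hypothetical helper (not part of this PR): returns the prefix of `s`
/// ending at byte index `idx`, or `None` when `idx` lands inside a
/// multi-byte character or past the end of the string. Plain slicing
/// with `&s[..idx]` would panic in exactly those cases.
fn checked_prefix(s: &str, idx: usize) -> Option<&str> {
    s.get(..idx)
}
```

With the string `"a\u{2013}b"` (5 bytes, 3 characters), byte indices 2 and 3 fall inside the dash, so only 0, 1, 4, and 5 are valid cursor positions.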
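For example, the dash – is 3 bytes in UTF-8 (E2 80 93; strictly, this byte sequence is the en dash U+2013, while the em dash U+2014 is E2 80 94, also 3 bytes). When the cursor was at byte position 75, inside this character, operations like backspace would slice the string at an invalid boundary, causing a panic.
Solution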
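The core idea is to let the cursor rest only on character boundaries. The sketch below shows one way such helpers could look. The function names mirror those this PR lists for src/text_edit.rs, but the signatures and bodies here are assumptions for illustration, not the merged code.

```rust
/// Assumed signature: largest char boundary strictly before `idx`
/// (or 0 when there is none).
fn prev_char_boundary(s: &str, idx: usize) -> usize {
    let mut i = idx.min(s.len());
    if i == 0 {
        return 0;
    }
    i -= 1;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Assumed signature: smallest char boundary strictly after `idx`
/// (or `s.len()` when `idx` is already at or past the end).
fn next_char_boundary(s: &str, idx: usize) -> usize {
    if idx >= s.len() {
        return s.len();
    }
    let mut i = idx + 1;
    // `s.len()` is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Assumed behavior: delete the character before `cursor` (backspace)
/// and return the new cursor position. A cursor that sits inside a
/// character is treated as covering that whole character.
fn delete_char_before(s: &mut String, cursor: usize) -> usize {
    let clamped = cursor.min(s.len());
    let start = prev_char_boundary(s, clamped);
    let end = if s.is_char_boundary(clamped) {
        clamped
    } else {
        next_char_boundary(s, clamped)
    };
    // Both ends are char boundaries, so this cannot panic.
    s.replace_range(start..end, "");
    start
}
```

For the string `"a\u{2013}b"` with the cursor at byte 4 (just after the dash), `delete_char_before` removes all three bytes of the dash and returns 1, leaving `"ab"`.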
Commit 1: UTF-8 aware text editing
Created src/text_edit.rs with functions that respect character boundaries:
- prev_char_boundary() / next_char_boundary() - find valid cursor positions
- delete_char_before() / delete_word_before() - safe deletion operations
Refactored src/handler.rs to use these instead of direct byte manipulation.
Commit 2: IME cursor positioning
After fixing the text editing, I discovered a secondary issue: the terminal cursor wasn't being positioned during comment input. This matters for IME users—when composing Korean, Chinese, or Japanese text, the IME composition window appears at the terminal cursor position. Without proper positioning, the composition window would appear at (0,0) instead of where the user is typing.
The fix:
- CommentCursorInfo struct to track cursor position during rendering
- frame.set_cursor_position() to place the terminal cursor correctly
Testing
Fixes #132