fix(input): handle multi-byte UTF-8 characters in comment input (#132) #147

Merged
agavra merged 3 commits into agavra:main from Gnob:fix/unicode-input-support
Jan 27, 2026
Contributor

Gnob commented Jan 27, 2026

Problem

Using multi-byte UTF-8 characters (em dashes, CJK, emoji, etc.) in comment input caused panics. The root cause was that cursor operations used byte indices directly, which breaks when characters span multiple bytes.

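For example, the dash in the panic message below is 3 bytes in UTF-8 (E2 80 93, which encodes U+2013, the en dash; the em dash, U+2014, is E2 80 94 and is also 3 bytes). When the cursor was at byte position 75, inside this character, operations like backspace would slice the string at an invalid boundary, causing: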

byte index 75 is not a char boundary; it is inside '–' (bytes 74..77)
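
The failure mode can be reproduced in isolation. The following is a minimal sketch, not code from this PR: `text_before_cursor` is a hypothetical helper that uses the standard `str::get`, which applies the same boundary check as slice indexing but returns `None` instead of panicking.

```rust
/// Hypothetical helper (not from the PR): returns the text before `cursor`,
/// or `None` when `cursor` is out of range or falls inside a multi-byte
/// character.
fn text_before_cursor(text: &str, cursor: usize) -> Option<&str> {
    // `str::get` performs the same boundary check as `&text[..cursor]`,
    // but reports failure as `None` instead of panicking.
    text.get(..cursor)
}
```

In the string "a–b" the dash occupies bytes 1..4, so byte offsets 2 and 3 are not char boundaries. Indexing with `&text[..2]` would panic with a message like the one above, while the helper returns `None`.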

Solution

Commit 1: UTF-8 aware text editing

Created src/text_edit.rs with functions that respect character boundaries:

  • prev_char_boundary() / next_char_boundary() - find valid cursor positions
  • delete_char_before() / delete_word_before() - safe deletion operations

Refactored src/handler.rs to use these instead of direct byte manipulation.
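
For readers unfamiliar with the approach, here is a rough sketch of what boundary-aware helpers can look like. The function names match the ones listed above, but the signatures and bodies are assumptions for illustration, not the contents of src/text_edit.rs. They rely only on the standard `str::is_char_boundary` and `String::drain`.

```rust
/// Largest char boundary strictly before `idx` (0 when there is none).
/// Signature is an assumption; the PR's version may differ.
fn prev_char_boundary(s: &str, idx: usize) -> usize {
    let mut i = idx.min(s.len());
    if i == 0 {
        return 0;
    }
    i -= 1;
    // Offset 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Smallest char boundary strictly after `idx` (`s.len()` when there is none).
fn next_char_boundary(s: &str, idx: usize) -> usize {
    if idx >= s.len() {
        return s.len();
    }
    let mut i = idx + 1;
    // `s.len()` is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Deletes the character before `cursor` and returns the new cursor offset.
/// A cursor that sits inside a character is first snapped to the end of
/// that character, so the whole character is removed.
fn delete_char_before(s: &mut String, cursor: usize) -> usize {
    let mut end = cursor.min(s.len());
    if !s.is_char_boundary(end) {
        end = next_char_boundary(s, end);
    }
    let start = prev_char_boundary(s, end);
    s.drain(start..end);
    start
}
```

Snapping an invalid cursor forward before deleting is one possible design choice; clamping backward or rejecting the input would work as well, as long as every range passed to `drain` starts and ends on a char boundary.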

Commit 2: IME cursor positioning

After fixing the text editing, I discovered a secondary issue: the terminal cursor wasn't being positioned during comment input. This matters for IME users: when composing Korean, Chinese, or Japanese text, the IME composition window appears at the terminal cursor position. Without proper positioning, the composition window would appear at (0,0) instead of where the user is typing.

The fix:

  • Added CommentCursorInfo struct to track cursor position during rendering
  • Calculate screen coordinates accounting for scroll offset and line wrapping
  • Call frame.set_cursor_position() to place the terminal cursor correctly
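
The coordinate calculation can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `cursor_screen_position` and all of its parameters are hypothetical, wrapping is done per character rather than per word, and the caller supplies a `char_width` function because CJK characters typically occupy two terminal columns.

```rust
/// Hypothetical sketch: maps a byte-offset cursor to screen coordinates.
/// Returns `None` if the cursor is not on a char boundary, the wrap width
/// is zero, or the cursor row is scrolled out of view above the viewport.
fn cursor_screen_position(
    text: &str,
    cursor: usize,                    // byte offset into `text`
    wrap_width: u16,                  // columns available per visual line
    scroll_offset: u16,               // visual rows scrolled past the top
    origin: (u16, u16),               // (x, y) of the input area's top-left
    char_width: impl Fn(char) -> u16, // display width of one character
) -> Option<(u16, u16)> {
    if wrap_width == 0 {
        return None;
    }
    let before = text.get(..cursor)?;
    let (mut row, mut col) = (0u16, 0u16);
    for ch in before.chars() {
        if ch == '\n' {
            row += 1;
            col = 0;
            continue;
        }
        let w = char_width(ch);
        // Hard-wrap: a character that does not fit moves to the next row.
        if col + w > wrap_width {
            row += 1;
            col = 0;
        }
        col += w;
    }
    // A cursor at the end of a full line is shown at the start of the next.
    if col >= wrap_width {
        row += 1;
        col = 0;
    }
    let visible_row = row.checked_sub(scroll_offset)?;
    Some((origin.0 + col, origin.1 + visible_row))
}
```

The resulting `(x, y)` pair is the kind of value that would then be handed to frame.set_cursor_position(). A real implementation would likely take character widths from a Unicode width table rather than a hand-written closure.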

Testing

  • 20 tests for text editing operations (boundary detection, deletion, navigation)
  • 6 tests for cursor position calculation
  • Tested with Korean IME input (e.g., "안녕 아가브라")

Fixes #132

Owner

agavra left a comment

Thanks @Gnob! I skimmed this and tested it locally, and the changes seem good to me. I always appreciate improved international language support.

Thanks for the contribution 🔥

agavra merged commit 8b5541d into agavra:main on Jan 27, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

panic when using em dash