fix: BLOB hex encoding + full Unicode width tables #190
Merged
- Add BLOB hex encoding (X'...') in SQL INSERT output instead of silent NULL conversion (issue in writeSqlRow else branch)
- Handle zero-length BLOB quirk: check column_bytes() before column_blob() pointer (null for empty blobs)
- Integration tests for BLOB round-trip (174j) and empty BLOB (174k)
- Replace 4-range isWideCodepoint() with comprehensive sorted [2]u21 range table + binary search (~80 ranges)
- Covers CJK A-H, Hiragana/Katakana, Hangul, Yi, Fullwidth, Emoji (terminal convention, noted)
- Fix 3 incorrect expected values in existing visual tests
- Add tests for Hiragana, Katakana, and emoji width
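The empty-BLOB handling in the list above can be illustrated with a small sketch. This is not the code from the PR: `blobToSqlLiteral` is a hypothetical helper, `ptr` and `len` stand in for the results of `sqlite3_column_blob()` and `sqlite3_column_bytes()`, and uppercase hex digits are an assumption. The point it shows is that the length is checked first, so a null pointer for a zero-length BLOB yields `X''` rather than NULL. It assumes Zig 0.11 or later for the indexed `for` syntax.

```zig
const std = @import("std");

/// Hypothetical helper (not from the PR): formats raw BLOB bytes as an SQL
/// hex literal of the form X'...'. `ptr` may be null when `len` is 0, which
/// mirrors how sqlite3_column_blob() behaves for zero-length BLOBs.
fn blobToSqlLiteral(allocator: std.mem.Allocator, ptr: ?[*]const u8, len: usize) ![]u8 {
    const hex_digits = "0123456789ABCDEF";

    // Decide based on the length first; only touch the pointer when
    // there is data to read.
    var bytes: []const u8 = &[_]u8{};
    if (len > 0) {
        const p = ptr orelse return error.NullBlobPointer;
        bytes = p[0..len];
    }

    // Layout: X, opening quote, two hex digits per byte, closing quote.
    const out = try allocator.alloc(u8, 3 + 2 * bytes.len);
    out[0] = 'X';
    out[1] = '\'';
    for (bytes, 0..) |b, i| {
        out[2 + 2 * i] = hex_digits[b >> 4];
        out[3 + 2 * i] = hex_digits[b & 0x0F];
    }
    out[out.len - 1] = '\'';
    return out;
}

test "blob literal, including the empty-BLOB case" {
    const a = std.testing.allocator;

    const data = [_]u8{ 0xDE, 0xAD, 0x00, 0xFF };
    const p: [*]const u8 = &data;
    const lit = try blobToSqlLiteral(a, p, data.len);
    defer a.free(lit);
    try std.testing.expectEqualStrings("X'DEAD00FF'", lit);

    // Null pointer with zero length: still a valid, empty literal.
    const empty = try blobToSqlLiteral(a, null, 0);
    defer a.free(empty);
    try std.testing.expectEqualStrings("X''", empty);
}
```

Saved to a file, the sketch can be exercised with `zig test`. Returning an error for a null pointer with a non-zero length is a choice made for this example only.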
Summary
Two fixes identified during the v0.15.0→master diff review.
BLOB hex encoding in SQL output
The writeSqlRow() else branch called sqlite3_column_text() for BLOB columns, which returns null for non-UTF8 data, causing silent NULL output. Now outputs X'<hex>' format with proper empty-BLOB handling.
Full Unicode width tables
Replaced the minimal 4-range isWideCodepoint() with a comprehensive sorted [2]u21 range table (~80 entries) covering all Unicode East Asian Width W/F ranges plus common emoji. Uses binary search for lookups. All 7 unit tests pass.
Side fixes
Fixed 3 incorrect expected values in visual.zig tests (they were passing on incorrect assumptions)
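For readers unfamiliar with the range-table approach described under "Full Unicode width tables", here is a minimal sketch of a sorted `[2]u21` table with a binary-search lookup. The table is an illustrative seven-entry subset chosen for this example, not the PR's ~80-entry table, and the function body is an assumption about how such a lookup might be written, not the merged implementation.

```zig
const std = @import("std");

// Illustrative subset only (assumption): each entry is an inclusive
// {first, last} codepoint range, and entries are sorted and non-overlapping.
const wide_ranges = [_][2]u21{
    .{ 0x1100, 0x115F }, // Hangul Jamo (leading consonants)
    .{ 0x3041, 0x3096 }, // Hiragana
    .{ 0x30A1, 0x30FA }, // Katakana
    .{ 0x4E00, 0x9FFF }, // CJK Unified Ideographs
    .{ 0xAC00, 0xD7A3 }, // Hangul Syllables
    .{ 0xFF01, 0xFF60 }, // Fullwidth forms
    .{ 0x1F600, 0x1F64F }, // Emoticons
};

/// Sketch of a binary search over the sorted range table.
/// Returns true when `cp` falls inside any listed range.
fn isWideCodepoint(cp: u21) bool {
    var lo: usize = 0;
    var hi: usize = wide_ranges.len;
    while (lo < hi) {
        const mid = lo + (hi - lo) / 2;
        const range = wide_ranges[mid];
        if (cp < range[0]) {
            hi = mid; // cp lies before this range
        } else if (cp > range[1]) {
            lo = mid + 1; // cp lies after this range
        } else {
            return true; // range[0] <= cp <= range[1]
        }
    }
    return false;
}

test "wide lookup on the sample table" {
    try std.testing.expect(isWideCodepoint(0x3042)); // Hiragana A
    try std.testing.expect(isWideCodepoint(0x30AB)); // Katakana KA
    try std.testing.expect(isWideCodepoint(0x1F600)); // grinning face emoji
    try std.testing.expect(!isWideCodepoint('A')); // ASCII is narrow
}
```

Binary search needs the table to stay sorted by range start with no overlaps. With that invariant, a lookup over roughly 80 entries takes at most about seven comparisons.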