Treat U+3000 as CJK punctuation-equivalent and make sure to recognize U+FF5E as CJK punctuation again in Markdown prose wrap #18656
Can you post an example of what will change? If I'm not wrong, this will affect how sentences are split? In my experience, at least in Chinese, no one uses U+3000 as punctuation; it's only used for one thing, which is alignment.
Right, but if a line ends with it, isn't the extra ASCII space inserted after it superfluous when the lines are joined into one?
Markdown初学者写的第一个Paragraph
Markdown初学者写的下一个段落。
↓
❌️ Current
Firefox's rule: biomejs/biome#7304 (comment)
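The question raised above (should a space be inserted when a line ending in U+3000 is joined with the next one?) can be illustrated with a minimal sketch. The names `NO_SPACE_AFTER` and `joinLines` are hypothetical and this is not Prettier's implementation; the real rules also depend on the surrounding characters and on the detected writing style.

```javascript
// Hypothetical sketch, NOT Prettier's actual code.
// Characters assumed to be "CJK punctuation-equivalent" at the end of a line.
const NO_SPACE_AFTER = new Set(["\u{3000}", "\u{301C}", "\u{FF5E}"]);

/**
 * Join two prose lines into one. A single ASCII space normally replaces the
 * line break, but it is omitted when the first line ends with one of the
 * characters listed above.
 */
function joinLines(first, second) {
  // Spread by code point so that astral characters are not split in half.
  const lastCharacter = [...first].pop();
  const separator = NO_SPACE_AFTER.has(lastCharacter) ? "" : " ";
  return first + separator + second;
}

console.log(JSON.stringify(joinLines("Paragraph", "next")));
console.log(JSON.stringify(joinLines("U+3000\u{3000}", "next")));
```

Keeping the characters in a set makes it easy to see that U+3000 and U+FF5E are handled exactly like U+301C in this simplified model.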
So this change means that only when U+3000 is at the end of a line, no space is inserted when it is merged with the next line? (I'm really sorry that I don't know enough about our markdown printer.)
Currently (without #16805), you will only notice the change if you prefer to put a space between a Han character and an alphanumeric (as many Chinese and some Japanese writers do; i.e. the Prettier 2.x style before #11597). Prettier (without #16805) will not insert a space even when it detects that style in (probably) a paragraph. This change takes effect under the condition where
This change (for U+3000) only applies to the following patterns, when Prettier detects the pre-#11597 style and tries to join them into one line:
It does not affect:
await require('prettier').format('测试 Test テスト Test\nU+3000\u{3000}\nU+301C\u{301C}\nU+FF5E\u{FF5E}\nU+1F221\u{1F221}\n', {parser: 'markdown', proseWrap: 'always'});
# Prettier stable
// -> '测试 Test テスト Test\nU+3000\u{3000} \nU+301C\u{301C}\nU+FF5E\u{FF5E} \nU+1F221\u{1F221}\n'
// -> '测试 Test テスト Test\nU+3000\u{3000} \nU+301C\u{301C}\nU+FF5E\u{FF5E} \nU+1F221\u{1F221}\n'
// ^ extra space
Wait, this is not the real output. Please actually run the code above and copy the printed output.
Since U+3000 is not visible, you can use this script:
const output = await require('prettier').format('测试 Test テスト Test\nU+3000\u{3000}\nU+301C\u{301C}\nU+FF5E\u{FF5E}\nU+1F221\u{1F221}\n', {parser: 'markdown', proseWrap: 'always'});
[...output.slice(23, 26)].map(character => `\\u${character.codePointAt(0).toString(16)}`);
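As a variation on the script above, here is a minimal sketch that escapes every non-ASCII character in a string, so that invisible characters such as U+3000 show up in the whole output rather than in one slice. The helper name `escapeNonAscii` is hypothetical, and it needs no Prettier import, so it can be applied to any string, including the result of `format`.

```javascript
// Hypothetical helper: replace every non-ASCII character with a \u{...}
// escape so that invisible or look-alike characters become explicit.
function escapeNonAscii(text) {
  // Spreading iterates by code point, so U+1F221 stays one character.
  return [...text]
    .map((character) => {
      const codePoint = character.codePointAt(0);
      return codePoint < 0x80
        ? character
        : `\\u{${codePoint.toString(16).toUpperCase()}}`;
    })
    .join("");
}

console.log(escapeNonAscii("U+3000\u{3000} U+1F221\u{1F221}"));
```

ASCII characters, including ordinary spaces, are left untouched, which makes an extra ASCII space next to an escaped U+3000 easy to spot.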
The font used for everything other than "测试" is https://github.com/yuru7/udev-gothic (a Japanese font). The dotted squares are U+3000.
Do we need
No, since it's pure JS code.
| ```js | ||
| await prettier.format( | ||
| "测试 Test テスト Test\nU+3000\u{3000}\nU+301C\u{301C}\nU+FF5E\u{FF5E}\nU+1F221\u{1F221}\n", | ||
| await require("prettier").format( |
Use prettier instead of require("prettier").
Done, but you will have to run prettier = require("prettier").default in advance.
Thank you as always.


Description
Cherry-picked from #16805 (U+FF5E)
A failing test there showed me that current Prettier does not treat U+FF5E ～ as punctuation, unlike U+301C 〜 (both are used in the same way in Japanese). I believe I fixed this in #16832, but a regression seems to have occurred.
https://prettier.io/playground/#N4Igxg9gdgLgprEAuEhpW0KvRACAKnAzjJoGMMgnQyATDDvjADpQCqA1AMwAMAjAMKA4DLYwGJ8ArAFFAev+8GbPgCZpbQDwbgQj3aIADQgIABxgBLaHmSgAhgCcTEAO4AFUwgMojAGwtGAngfUAjE0bABrOBgAZSMAWzgAGR0oOGQAMyc8OHUITwArODAYAHUfTWQQTRN8OBMAN1ivH39AoM1faIBzZBgTAFdkkCTQnRb2zrgAD01SnXDYJwB5EZ8YCBMrCDwdXWgChAATNRAhmbGEGCdcEyhTHXx4xM7lqEbHOABFNoh4S8ck9TS8QaCm+6eXrEkAl3p0AI7PeBWcyaewgIx4AC0MTgG1R21aRh0jiaHAgoVCRgKTkc2xudzgAEEYK0dJ42lDSlEYm8PiAABYwUKObLslb4epgOBBOwrHRlFauApgPAeEBlDoASSgaNgQTAJh02kpKqCMFc91ZnSKSzguSM+Qczjccs2kzizKBIEccW20SSJhg0KMjUJRvU9RMHoKhJMfg2lig2yK0RyOg2MHZyAAHCx1MUITpit7fUTgVd1IdPNl44nkNJ1G0ktgjJ57CC2XBQp5UWiNhEjLc2j64Hx5oSaU1iQyICAAL5joA
Recent Firefox versions added U+3000 to the CJ(K) punctuation list, so I am adding it to Prettier as a bonus, too.
Firefox preview for HTML (ja or zh is mandatory to trim spaces around these characters):
Checklist
docs/ directory). changelog_unreleased/*/XXXX.md file following changelog_unreleased/TEMPLATE.md.