fix: Allow line break after ellipsis and underscore #425
daveallie merged 6 commits into crosspoint-reader:master
…or explicit hyphen detection
So there is no space between the words, like "matter...without"? If so, I guess we can make a case for breaking after an ellipsis.
Correct… no spaces padding ellipses is (sadly) common in ebooks. Ideally there should be a regular space (my preference) or a zero-width space immediately following but publishers don't necessarily get this right. This one is (in my view) a definite yes.
Slashes are the tough one. A break after a slash is generally allowed in constructions like "and/or" (especially with narrow column widths like we have on x4) and encouraged when typesetting URLs and file paths. The main tradeoff is dates (12/25/2004) and fractions (2/3), which sophisticated engines can be smarter about 🤷. I have no strong preference and will remove slashes from the PR if that's the consensus view.

Underscores basically never come up in prose, and breaking after them can be helpful when they appear in code snippets, URLs, etc. I'm not attached to this, but continue to recommend it.
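To make the date and fraction tradeoff concrete, here is a minimal sketch of what "being smarter" about slashes could look like: allow a break after a slash unless it sits between two digits. This is not code from the PR (which ended up dropping slashes entirely); the names `isAsciiDigit` and `breakAllowedAfterSlash` are hypothetical, and the digits-on-both-sides rule is an assumption chosen only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the PR: true for ASCII digits only.
inline bool isAsciiDigit(char32_t cp) { return cp >= U'0' && cp <= U'9'; }

// Returns true if a line break may follow the slash at text[i].
// Assumption for this sketch: a slash with a digit on both sides belongs to a
// date or fraction (12/25/2004, 2/3) and must stay on one line.
inline bool breakAllowedAfterSlash(const std::u32string& text, std::size_t i) {
    if (i >= text.size() || text[i] != U'/') return false;  // not a slash
    if (i + 1 >= text.size()) return false;                 // nothing follows the slash
    const bool digitBefore = i > 0 && isAsciiDigit(text[i - 1]);
    const bool digitAfter = isAsciiDigit(text[i + 1]);
    return !(digitBefore && digitAfter);
}
```

Under this rule the slash in "and/or" is a break opportunity, while the slashes in "12/25/2004" and "2/3" are not. A heuristic like this would still misjudge cases such as numeric path segments in URLs.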
Removed the break allowance after slash (solidus) to protect dates.
@lukestein how about we set it to ellipsis and underscores only, and merge it?
Totally fine with me! I imagine Dave (whom I'll refrain from tagging) is slammed with real day-job work plus a ton of big crosspoint PRs, but I'm hoping that by keeping my PRs small and legible and engaging with trusted folks like @osteotek I can get them merged.
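For readers following along, the agreed scope (ellipsis and underscore only) can be sketched as a small predicate plus a scan for break opportunities. This is an illustration under assumptions, not the merged code: `breakAllowedAfter` and `breakOpportunities` are hypothetical names, and the project's real `isExplicitHyphen` check is not shown in this thread and also covers the hyphen characters it already handled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the additions discussed above: only the two
// characters that survived review are listed here.
inline bool breakAllowedAfter(char32_t cp) {
    switch (cp) {
        case U'\u2026':  // HORIZONTAL ELLIPSIS, U+2026
        case U'_':       // LOW LINE (underscore), U+005F
            return true;
        default:
            return false;
    }
}

// Returns the indices at which a new line may start: the position right after
// a listed character, provided more text follows it.
inline std::vector<std::size_t> breakOpportunities(const std::u32string& text) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        if (breakAllowedAfter(text[i])) out.push_back(i + 1);
    }
    return out;
}
```

For the unspaced case from the discussion, "matter…without", the scan reports a single opportunity directly after the ellipsis, so "without" can move to the next line. A trailing ellipsis at the end of the text yields no opportunity because nothing follows it.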
…r#425)

## Summary

* Add additional punctuation marks to the list of characters that can be immediately followed by a line break even where there is no explicit space

## Additional Context

* Huge appreciation to @osteotek for his amazing work on hyphenation. Reading on the device is so much better now.
* I am getting bad line breaks when ellipses (…) are between words and book file does not explicitly include some kind of breaking space.
* Per [discussion](crosspoint-reader#305 (comment)), several new characters are added in this PR to the `isExplicitHyphen` list to allow line breaks immediately after them:

Character | Unicode | Usage | Why include it?
-- | -- | -- | --
Solidus (Slash) | U+002F | / | Essential for breaking URLs and "and/or" constructs.
Backslash | U+005C | \ | Critical for technical text, file paths, and coding documentation.
Underscore | U+005F | _ | Prevents "runaway" line lengths in usernames or code snippets.
Middle Dot | U+00B7 | · | Acts as a semantic separator in dictionaries or stylistic lists.
Ellipsis | U+2026 | … | Prevents justification failure when dialogue lacks following spaces.
Midline Horizontal Ellipsis | U+22EF | ⋯ | Useful for mathematical sequences and technical notation.

### Example:

This shows an example of what line breaking looks like *with* this PR. Note the line break after "matter…" (which would not previously have been allowed). It's particularly important here because the book includes non-breaking spaces in "Mr. Aldrich" and "Mr. Rockefeller."

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers.

Did you use AI tools to help write this code? **PARTIALLY**