Hyphenation support #17

jonasdiemer (Collaborator) · 2025-12-14T14:15:56Z

To avoid big gaps in justified text, hyphenating words would help. I would focus on longer words (more than 7-9 characters).

I assume existing hyphenation libraries (e.g. https://github.com/hunspell/hyphen, https://github.com/bramstein/hypher) are too heavy. They could, however, provide hyphenation rules (pattern files), which we could shorten (to the "top 1000"?).

In case we want to implement the algorithm from scratch (because existing implementations are too heavy): https://www.tug.org/docs/liang/
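
For reference, the core of Liang's pattern matching is small. Below is a minimal sketch, not a proposed implementation: the function names are hypothetical, and the nine patterns are a hand-picked toy set for one word, not a real pattern file. Digits in a pattern are weights for the gap at that position; the highest weight per gap wins, and an odd result allows a break.

```python
import re

# Toy patterns for illustration only (assumption: not a complete pattern file).
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters and per-gap weights."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight of the gap before letter `pos`
        else:
            pos += 1
    return letters, weights

def hyphenation_points(word, patterns, left_min=2, right_min=2):
    """Return indices k such that word[:k] + '-' + word[k:] is an allowed break."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."          # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)           # scores[i] = gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights:
                for offset, w in enumerate(weights):
                    scores[start + offset] = max(scores[start + offset], w)
    # The gap before word[k] is the gap before padded[k + 1]; odd means "may break".
    return [k for k in range(left_min, len(word) - right_min + 1)
            if scores[k + 1] % 2 == 1]

def insert_hyphens(word, points):
    """Render the break points visibly, for debugging."""
    pieces, last = [], 0
    for k in points:
        pieces.append(word[last:k])
        last = k
    pieces.append(word[last:])
    return "-".join(pieces)

print(insert_hyphens("hyphenation", hyphenation_points("hyphenation", TOY_PATTERNS)))
# prints "hy-phen-ation"
```

The substring loop is quadratic in word length, which is fine for a sketch but is exactly the part a trie or state machine would replace.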

Hypher (https://github.com/typst/hypher) takes an interesting approach: it implements those patterns as a finite-state machine (FSM), which seems to be very efficient. We could use that project to generate FSMs, convert them into header or binary files, and implement a small runtime to execute them (an LLM may be able to do some or most of it).
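
To make the "patterns as a state machine" idea concrete, here is a rough sketch that compiles patterns into a state table and walks it one character at a time. This is an assumption-laden illustration, not Hypher's actual format or code: `build_states`, `break_points`, and the table layout are hypothetical, and the patterns are a toy set.

```python
# Toy patterns for illustration only (assumption: not a real pattern file).
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]

def build_states(patterns):
    """Compile patterns into a state table (a trie).

    transitions[s] maps a character to the next state;
    levels[s] holds the gap weights if some pattern ends in state s.
    """
    transitions = [{}]          # state 0 is the start state
    levels = [None]
    for pattern in patterns:
        state, weights = 0, [0]
        for ch in pattern:
            if ch.isdigit():
                weights[-1] = int(ch)          # weight of the current gap
            else:
                if ch not in transitions[state]:
                    transitions[state][ch] = len(transitions)
                    transitions.append({})
                    levels.append(None)
                state = transitions[state][ch]
                weights.append(0)              # open the gap after this letter
        levels[state] = weights
    return transitions, levels

def break_points(word, transitions, levels, left_min=2, right_min=2):
    """Run the state table from every start position and collect odd gaps."""
    padded = "." + word.lower() + "."
    scores = [0] * (len(padded) + 1)
    for start in range(len(padded)):
        state = 0
        for pos in range(start, len(padded)):
            state = transitions[state].get(padded[pos])
            if state is None:                  # no pattern continues this way
                break
            if levels[state] is not None:      # a pattern ends here
                for offset, w in enumerate(levels[state]):
                    scores[start + offset] = max(scores[start + offset], w)
    return [k for k in range(left_min, len(word) - right_min + 1)
            if scores[k + 1] % 2 == 1]

transitions, levels = build_states(TOY_PATTERNS)
print(break_points("hyphenation", transitions, levels))  # prints [2, 6]
```

Because the table is just integers and small weight arrays, it could in principle be flattened into a constant array in a header or a binary blob, with only the walking loop implemented on the device.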

osteotek (Collaborator) · 2025-12-17T15:34:36Z

Tried my hand at a simple hyphenation engine, feedback is welcome - #47
It doesn't use any tables or patterns currently, just simple rules based on vowels and diphthongs.


jonasdiemer (Collaborator, Author) · 2025-12-17T16:21:55Z

Nice. Looking at the screenshot, I see that you end up with fragments of 3 characters. To my taste, I would increase that minimum to 4 or 5, so that only fairly long words get hyphenated.
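
A minimum fragment length can be applied as a filter on top of whatever engine produces the candidate breaks. A small sketch, assuming break positions are character indices into the word; `filter_breaks` and its default values are hypothetical, and the candidate positions in the example are made up for illustration:

```python
def filter_breaks(word, breaks, min_word_len=8, min_fragment=4):
    """Keep only breaks that leave at least min_fragment characters on both sides.

    Words shorter than min_word_len are never hyphenated.
    """
    if len(word) < min_word_len:
        return []
    return [k for k in breaks
            if k >= min_fragment and len(word) - k >= min_fragment]

# Candidate breaks as an engine might report them (illustrative values only).
print(filter_breaks("distributing", [3, 6, 8]))  # prints [6, 8]
print(filter_breaks("updated", [2, 4]))          # prints []
```

With these defaults, "dis-tributing" is rejected because the leading fragment has only 3 characters, while "distri-buting" and "distribu-ting" remain available.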


jlaunay · 2025-12-31T01:36:35Z

On my other e-readers, I usually use this Calibre plugin, which adds soft hyphens to books that don't already have them.
It's really convenient and might be easier to set up at first than a more complex system.
If support for soft hyphens isn't planned, it would at least be good to make sure they are not displayed. Right now, in these books, I end up with hyphens (dashes) everywhere, for example "par-ti-cu-lier", "up-da-ted", "dis-tri-bu-ting".

6 replies

jonasdiemer Dec 31, 2025
Collaborator Author

@jlaunay do you have a good source for the HyphenateThis! plugin? Seems unmaintained for ages...

jlaunay Dec 31, 2025

The official Calibre page, with a release from Oct 2025.

jonasdiemer Jan 3, 2026
Collaborator Author

With 0.12.0, we should now be ignoring soft hyphens (i.e. not displaying them).

osteotek Jan 7, 2026
Collaborator

Added soft hyphen support to my hyphenation PR - #47
That is, soft hyphens should be used for breaking words and not displayed. I've tested it briefly, but if anyone can test it more, that would be great.
Adding soft hyphens with Calibre might be the best option for hyphenation for now. Implementing a proper hyphenation engine that runs on the device turned out to be way more complicated than I imagined.
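
The intended soft hyphen behavior (U+00AD marks a break opportunity but is never drawn) can be sketched as follows. This is an illustration only, not the code from #47: the function names are hypothetical, and widths are counted in characters rather than measured in pixels as a real renderer would.

```python
SOFT_HYPHEN = "\u00ad"

def split_soft_hyphens(text):
    """Return the visible text plus the offsets where a break is allowed."""
    visible, breaks = [], []
    for ch in text:
        if ch == SOFT_HYPHEN:
            breaks.append(len(visible))   # break allowed before the next visible char
        else:
            visible.append(ch)
    return "".join(visible), breaks

def break_word(word, max_chars):
    """Split a word so the first part (with its trailing '-') fits in max_chars.

    Returns (head, tail). If the whole word fits, tail is empty; if no soft
    hyphen position fits, head is empty and the word moves to the next line.
    """
    visible, breaks = split_soft_hyphens(word)
    if len(visible) <= max_chars:
        return visible, ""
    fitting = [k for k in breaks if k > 0 and k + 1 <= max_chars]
    if not fitting:
        return "", visible
    k = max(fitting)                      # take as much of the word as fits
    return visible[:k] + "-", visible[k:]

word = "par\u00adti\u00adcu\u00adlier"
print(break_word(word, 6))    # prints ('parti-', 'culier')
print(break_word(word, 20))   # prints ('particulier', '')
```

The visible hyphen is only added at the position actually used for the line break, so a word that fits on the line is shown without any dashes.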

jlaunay Jan 7, 2026

> added soft hyphen support to my hyphenation PR - #47
> i.e. soft hyphens should be used for breaking words and not displayed. I've tested it briefly, but if anyone can test it more, that would be great
> Adding soft hyphens with Calibre might be best option for hyphenation for now, implementing proper hyphenation engine to run on device turned out to be way more complicated than I imagined

I'm French. With the stock English/Russian hyphenation it's almost working (I think because English is also Latin-based).
With soft hyphens added in Calibre it works perfectly. I have to test more, but with the book I tried it was perfect, thanks 🙏

osteotek (Collaborator) · 2026-01-09T18:51:29Z

Made another attempt at hyphenation, this time using rule dictionaries - #305
@jonasdiemer thanks for the suggestion to look at Hypher


osteotek (Collaborator) · 2026-01-19T18:11:10Z

#305 has been merged. Please provide feedback on the hyphenation feature. Also, do you think we need to support more languages (besides the included English, German, French and Russian)?

3 replies

Uri-Tauber Jan 19, 2026

@osteotek This looks great—thanks!

One technical suggestion: consider adding a configuration option to enable hyphenation on a per-language basis. For example, I read mostly in English and don’t expect to read in other languages (German, French, etc.). Limiting hyphenation to just English at runtime could help reduce RAM usage and CPU overhead.

What do you think?

osteotek Jan 19, 2026
Collaborator

Additional language dictionaries do not consume RAM, only flash. There is no overhead in having other languages.

Uri-Tauber Jan 19, 2026

Good to know

daveallie (Maintainer) · 2026-02-19T14:45:19Z

Considering this one closed. Future issues or feature requests around hyphenation can be raised through Issues.
