How is Khmer line-breaking handled on the Web? #4

@r12a

The current understanding at W3C is that Khmer text behaves like Thai when lines are wrapped. See http://w3c.github.io/i18n-drafts/articles/typography/linebreak.en#sec_se_asia for a very high-level summary.

The CSS specification deals with line-breaking at https://drafts.csswg.org/css-text-3/#line-break-property. Note particularly the text about Thai that says:

As UAs can add additional distinctions between strict/normal/loose modes, these values can exhibit other differences as well. For example, a UA with sufficiently-advanced Thai language processing ability could choose to map different levels of strictness in Thai line-breaking to these keywords, e.g. disallowing breaks within compound words in strict mode (e.g. breaking ตัวอย่างการเขียนภาษาไทย as ตัวอย่าง·การเขียน·ภาษาไทย) while allowing more breaks in loose (ตัวอย่าง·การ·เขียน·ภาษา·ไทย).

The question for this issue is whether the same applies to Khmer, and whether there are other features of Khmer line breaking that need to be called out in the spec. For example, which of these is true?

  1. Khmer text can break at line end between any two syllables, without regard to word boundaries.
  2. There is a preference for breaking at word boundaries, but breaking at syllable boundaries is also common.
  3. Khmer text should always break at word boundaries, between recognisable words.
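
To make the difference between the three options concrete, here is a minimal sketch of a greedy line wrapper that takes text already segmented into words and smaller units. Everything in it is an assumption for illustration: the function name `wrap`, the mode names, and the use of code-point counts as a crude stand-in for measured width are all hypothetical, and no browser is claimed to work this way. The sample data is the Thai string from the CSS quote above, with the segmentation given there; its inner units are component words rather than syllables, and are used only because that segmentation is the one available in this issue.

```python
from typing import List

Word = List[str]  # one word, already split into smaller breakable units


def wrap(words: List[Word], width: int, mode: str) -> List[str]:
    """Greedy wrapping over pre-segmented text (hypothetical helper).

    mode "unit":        break between any two units (option 1)
    mode "prefer_word": break at a word boundary if the line has one,
                        otherwise fall back to a unit boundary (option 2)
    mode "word":        break only at word boundaries; a word longer than
                        the width is left to overflow (option 3)
    Width is counted in code points, which is only a rough approximation.
    """
    if mode not in ("unit", "prefer_word", "word"):
        raise ValueError(f"unknown mode: {mode}")
    lines: List[str] = []
    line: List[str] = []   # units on the current line
    last_wb = 0            # index in `line` of the latest word boundary
    for word in words:
        for i, unit in enumerate(word):
            starts_word = i == 0
            while line and sum(map(len, line)) + len(unit) > width:
                if mode == "unit" or starts_word:
                    cut = len(line)      # break right before this unit
                elif last_wb > 0:
                    cut = last_wb        # back up to the last word boundary
                elif mode == "prefer_word":
                    cut = len(line)      # no word boundary: break mid-word
                else:
                    break                # "word" mode: let the line overflow
                lines.append("".join(line[:cut]))
                line = line[cut:]
                last_wb = 0
            if starts_word and line:
                last_wb = len(line)
            line.append(unit)
    if line:
        lines.append("".join(line))
    return lines


# Segmentation taken from the Thai example quoted from the CSS spec.
sample = [["ตัวอย่าง"], ["การ", "เขียน"], ["ภาษา", "ไทย"]]
for mode in ("unit", "prefer_word", "word"):
    print(mode, wrap(sample, 12, mode))
```

The sketch takes the segmentation as input because finding the word and syllable boundaries is the hard, language-specific part, and that is exactly what this issue is asking about.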

We are also looking for evidence of current problems related to Khmer line-breaking on the Web/in eBooks.

Advice (especially with examples) would be very much appreciated.

Labels: i:line_breaking (Line breaking & hyphenation), i:segmentation (Grapheme/word segmentation & selection), question (Further information is requested), s:khmr