Characters Package #53381

justinmc · 2020-03-27T00:32:19Z

Description

Flutter currently has poor support for grapheme clusters like 👨‍👩‍👦, which it treats as multiple characters even though each one appears to the user as a single character. This PR is a proposal for adding support via the characters package.

The design doc has more details.
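
To make the problem in the description concrete, here is a minimal sketch using only the Dart core library. The names `familyEmoji`, `codeUnitCount`, and `codePointCount` are illustrative and do not come from the PR.

```dart
// The family emoji from the description, written with escapes so the
// individual code points are visible: man, ZWJ, woman, ZWJ, boy.
const String familyEmoji = '\u{1F468}\u200D\u{1F469}\u200D\u{1F466}';

/// Number of UTF-16 code units, which is what String.length reports.
int codeUnitCount(String s) => s.length;

/// Number of Unicode code points (runes).
int codePointCount(String s) => s.runes.length;
```

For this string, `codeUnitCount` gives 8 and `codePointCount` gives 5, although the user sees a single character. Neither count matches what the user perceives, which is the gap the characters package is meant to close.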

Changes

Character count.
Error state based on character count.
Pasting excessive characters places the cursor correctly.

Related Issues

Engine PR (independent):
flutter/engine#17420

Closes #32240
Closes #55670
Closes #54240

Tests

Character count for maxLength works with/without grapheme clusters and surrogate pairs.
Enforced character limit for maxLength works with/without grapheme clusters and surrogate pairs.
Error state for maxLength shows at the right length with/without grapheme clusters and surrogate pairs.

Breaking Change

Currently I don't think this is a breaking change. There may be broken visual diff tests in the rare case where they include a grapheme cluster.

justinmc · 2020-04-22T18:03:43Z

TODO: Handle arrow key navigation in the framework using characters: https://github.com/flutter/flutter/blob/master/packages/flutter/lib/src/rendering/editable.dart#L556

justinmc · 2020-06-04T16:57:09Z

packages/flutter/lib/src/rendering/editable.dart

Is there any performance concern with nextCharacter and previousCharacter? They iterate the entire string in order to avoid bugs where the extent is in the middle of an extended grapheme cluster. If we can assume that won't happen, or we are ok with these methods not finding the end/beginning of the grapheme cluster when it does happen, then we can optimize by starting our search from extent.

If we kept a flag around that indicated whether the string contained anything other than single-code-unit UTF-16 characters, then we could opt to use simple algorithms.

For strings that contain surrogate pairs or extended grapheme clusters: if we could scan backwards from an arbitrary index and discover whether we were in the middle of such a character, then finding the next (or previous) character index would be cheap, so long as there was a small limit to how far we had to scan backwards.

I wrote a proof-of-concept and it seemed like something like this would work and would be potentially much faster. I'll follow up after this is merged.

justinmc · 2020-06-08T23:08:57Z

packages/flutter/lib/src/painting/text_painter.dart

It turns out it's not possible to improve performance using the characters package in the way I expected. It can find characters by iterating from the start or end of a string, but at some offset into a string, it doesn't know where the character boundaries are. Performance is likely to be worse iterating from the start/end of a long string rather than using the existing logarithmic algorithm.

justinmc · 2020-06-09T16:47:32Z

I'm not sure if the failures are due to something special I'm missing because I added a new package, or if it's the typical versioning failure. I'm going to squash all the commits and see if it goes away.

justinmc · 2020-06-09T22:40:58Z

I could use advice on the test failure if anyone has any ideas. Stack frames parsed incorrectly https://github.com/flutter/flutter/pull/53381/checks?check_run_id=755670351

justinmc · 2020-06-11T15:17:47Z

I'm going to split this into 2 PRs, so I'm putting this back into WIP. The other PR will simply add the characters package to the repo, and I'll update this one once it's merged.

justinmc · 2020-06-15T18:46:40Z

#59267 was merged and I updated this PR, so we're ready for review again.

HansMuller · 2020-06-15T21:59:07Z

packages/flutter/lib/src/painting/text_painter.dart

  // checks if the value represents a UTF16 glyph by itself or is a 'surrogate'.
-  bool _isUtf16Surrogate(int value) {
+  static bool _isUtf16Surrogate(int value) {
    return value & 0xF800 == 0xD800;


Is this really sufficient to detect one half of a UTF-16 surrogate pair? According to Wikipedia, the code units of a surrogate pair fall into one of two ranges:

high surrogates (0xD800–0xDBFF), low surrogates (0xDC00–0xDFFF)

I'm not sure this test is sufficient. Note also: I realize that this code hasn't really changed.

I've opened a separate issue for this and assigned myself. I'll look into it and merge a small PR separately if it is indeed wrong. #59513

This code does detect surrogate values among UTF-16 code units.

The documentation talks about "two or more UTF-16 codeunits", which is confusing because glyphs can indeed be comprised of more than one code point, but that's unrelated to surrogates.
I'd say that the documentation is almost entirely wrong, but the function does what the name says it does.

Thanks for the clarification. I'll update the documentation on #59513.
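
The question in this thread can be checked mechanically. The sketch below compares the mask test from the diff with a range test written directly from the two ranges quoted above. The names `isInSurrogateRanges` and `checksAgreeForAllCodeUnits` are hypothetical helpers for illustration, not part of the framework.

```dart
/// The check from the diff: keep the top five bits and compare with 0xD800.
/// In Dart, `&` binds more tightly than `==`, so no parentheses are needed.
bool isUtf16Surrogate(int value) => value & 0xF800 == 0xD800;

/// Hypothetical range-based version, written from the two quoted ranges.
bool isInSurrogateRanges(int value) =>
    (value >= 0xD800 && value <= 0xDBFF) || // high surrogates
    (value >= 0xDC00 && value <= 0xDFFF); // low surrogates

/// Returns true if both checks agree for every 16-bit code unit.
bool checksAgreeForAllCodeUnits() {
  for (int unit = 0; unit <= 0xFFFF; unit++) {
    if (isUtf16Surrogate(unit) != isInSurrogateRanges(unit)) {
      return false;
    }
  }
  return true;
}
```

The two ranges are contiguous (0xD800 through 0xDFFF) and share the same top five bits, which is why a single mask covers both halves of a pair.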

HansMuller · 2020-06-15T22:09:10Z

packages/flutter/lib/src/rendering/editable.dart

+  /// characters.
+  @visibleForTesting
+  static int nextCharacter(int extent, String string, [bool includeWhitespace = true]) {
+    if (extent >= string.length) {


Presumably string.length is assumed to be beyond the end of the string, even though string.characters.length might have a smaller value than string.length. What are callers supposed to do with the return value here? In other words, what return value is expected if the next character past extent is beyond the end of the string?

We should have tests for nextCharacter(s.length, s), nextCharacter(s.length - 1, s), previousCharacter(0, s), previousCharacter(s.length, s). Or maybe invalid extents should cause an assert?

I've made this more straightforward in b3cdde2.

I assert that the input must be between 0 and string.length, inclusive.

Asking for the previous character from index 0 returns 0. Asking for the next character after string.length returns string.length.

I clarified that the input and output of this method are all positions in the string (not character indices).

I tested all missing edge cases, including cases that assert.
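
The boundary behavior described above can be sketched as follows. This is a simplified illustration, not the code from b3cdde2: it only understands surrogate pairs (not extended grapheme clusters, which the PR handles through the characters package), it throws a RangeError where the PR uses an assert, and the names `nextPosition` and `previousPosition` are assumptions made for this example.

```dart
bool _isHighSurrogate(int unit) => unit & 0xFC00 == 0xD800;
bool _isLowSurrogate(int unit) => unit & 0xFC00 == 0xDC00;

/// Returns the string position (code unit offset) after the character that
/// starts at [extent]. Asking at s.length returns s.length.
int nextPosition(int extent, String s) {
  if (extent < 0 || extent > s.length) {
    throw RangeError.range(extent, 0, s.length, 'extent');
  }
  if (extent == s.length) return s.length;
  final bool startsPair = _isHighSurrogate(s.codeUnitAt(extent)) &&
      extent + 1 < s.length &&
      _isLowSurrogate(s.codeUnitAt(extent + 1));
  return extent + (startsPair ? 2 : 1);
}

/// Returns the string position of the character that ends at [extent].
/// Asking at 0 returns 0.
int previousPosition(int extent, String s) {
  if (extent < 0 || extent > s.length) {
    throw RangeError.range(extent, 0, s.length, 'extent');
  }
  if (extent == 0) return 0;
  final bool endsPair = extent >= 2 &&
      _isLowSurrogate(s.codeUnitAt(extent - 1)) &&
      _isHighSurrogate(s.codeUnitAt(extent - 2));
  return extent - (endsPair ? 2 : 1);
}
```

Both the inputs and the results are positions in the string, so for 'a\u{1F600}b' (length 4) stepping forward from position 1 lands on position 3, past both halves of the surrogate pair.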

HansMuller · 2020-06-15T22:55:55Z

packages/flutter/lib/src/rendering/editable.dart

HansMuller · 2020-06-15T22:56:10Z

packages/flutter/lib/src/painting/text_painter.dart

HansMuller · 2020-06-15T23:01:19Z

packages/flutter/lib/src/rendering/editable.dart

+  /// Setting includeWhitespace to false will only return the index of non-space
+  /// characters.
+  @visibleForTesting
+  static int previousCharacter(int extent, String string, [bool includeWhitespace = true]) {


This method and the nextCharacter method should make the meaning of their extent parameter and return value really clear, i.e. whether they are string indices or string.characters indices. They should make how they handle edge cases clear as well.

HansMuller

LGTM

lrhn · 2020-06-17T15:06:38Z

packages/flutter/lib/src/rendering/editable.dart

+      if (includeWhitespace) {
+        return false;
+      }
+      return _isWhitespace(currentString.characters.first.toString().codeUnitAt(0));


That line is equivalent to:

return _isWhitespace(currentString.codeUnitAt(0));

Ah thanks, good call!

This will be fixed when I reland this PR in #59778.

justinmc added the a: text input, framework, and work in progress; do not review labels Mar 27, 2020

justinmc self-assigned this Mar 27, 2020

fluttergithubbot added the f: material design and c: contributor-productivity labels Mar 27, 2020

googlebot added the cla: yes label Mar 27, 2020

This was referenced Mar 31, 2020

BlacklistingTextInputFormatter doesnt work with certain emojis #49903

Closed

TextField input emoji of "shy",the cursor jump to the start of text #29841

Closed

This was referenced Apr 22, 2020

Handle surrogate pairs in RenderEditable #55246

Merged

Cursor always at the front when first typing some emojis #50563

Closed

InMatrix mentioned this pull request Apr 24, 2020

Use the characters package in framework documentation and code samples #55598

Closed

6 tasks

This was referenced Apr 28, 2020

When EditText has emoji, the maximum length of editable.dart will be abnormal length and cursor abnormality #55670

Closed

[add] add emoji_InputFormatter support our developer use emoji #55821

Closed

justinmc commented Jun 4, 2020

justinmc mentioned this pull request Jun 4, 2020

Can position cursor in "between" single emoji character #13404

Closed

justinmc commented Jun 8, 2020

justinmc removed the work in progress; do not review label Jun 8, 2020

justinmc changed the title ~~WIP Characters Proposal~~ Characters Proposal Jun 8, 2020

justinmc requested review from GaryQian and HansMuller June 8, 2020 23:19

justinmc force-pushed the characters branch 2 times, most recently from d5c2db6 to 470569c Compare June 9, 2020 19:51

justinmc mentioned this pull request Jun 9, 2020

WIP Characters upgrade-packages debugging #59098

Closed

justinmc changed the title ~~Characters Proposal~~ Characters Package Jun 9, 2020

fluttergithubbot added the tool label Jun 9, 2020

justinmc changed the title ~~Characters Package~~ WIP Characters Package Jun 11, 2020

justinmc added the work in progress; do not review label Jun 11, 2020

justinmc mentioned this pull request Jun 11, 2020

Characters package #59267

Merged

justinmc force-pushed the characters branch from 4efd322 to e088832 Compare June 15, 2020 18:36

Applied diff of all changes to attempt to fix versioning failure with…

59589b0

… up-to-date master

justinmc force-pushed the characters branch from e088832 to 59589b0 Compare June 15, 2020 18:45

justinmc changed the title ~~WIP Characters Package~~ Characters Package Jun 15, 2020

justinmc removed the work in progress; do not review label Jun 15, 2020

HansMuller reviewed Jun 15, 2020

justinmc mentioned this pull request Jun 15, 2020

Potentially inaccurate surrogate pair detection #59513

Closed

Clarify boundary conditions of next/previousCharacter

b3cdde2

HansMuller approved these changes Jun 16, 2020

justinmc mentioned this pull request Jun 16, 2020

Expose the characters package to Flutter users for better unicode handling #55593

Closed

justinmc added the waiting for tree to go green label Jun 16, 2020

fluttergithubbot merged commit e0ed12c into flutter:master Jun 16, 2020

justinmc deleted the characters branch June 16, 2020 23:56

renyou added a commit that referenced this pull request Jun 17, 2020

Revert "Characters Package (#53381)"

7e42f8f

This reverts commit e0ed12c.

lrhn reviewed Jun 17, 2020

renyou added a commit that referenced this pull request Jun 17, 2020

Revert "Characters Package (#53381)" (#59677)

a99d146

This reverts commit e0ed12c.

pchampio mentioned this pull request Jun 17, 2020

utf-8 textinput go-flutter-desktop/go-flutter#478

Closed

justinmc mentioned this pull request Jun 17, 2020

Export characters #59620

Merged

justinmc added a commit to justinmc/flutter that referenced this pull request Jun 18, 2020

Revert "Revert "Characters Package (flutter#53381)" (flutter#59677)"

2ae6fff

This reverts commit a99d146.

justinmc mentioned this pull request Jun 18, 2020

Reland Characters Usage #59778

Merged

zljj0818 pushed a commit to zljj0818/flutter that referenced this pull request Jun 22, 2020

Characters Package (flutter#53381)

2a51ba6

zljj0818 pushed a commit to zljj0818/flutter that referenced this pull request Jun 22, 2020

Revert "Characters Package (flutter#53381)" (flutter#59677)

f5862b9

This reverts commit e0ed12c.

mingwandroid pushed a commit to mingwandroid/flutter that referenced this pull request Sep 6, 2020

Characters Package (flutter#53381)

433f25d

mingwandroid pushed a commit to mingwandroid/flutter that referenced this pull request Sep 6, 2020

Revert "Characters Package (flutter#53381)" (flutter#59677)

58c0faf

This reverts commit e0ed12c.

justinmc mentioned this pull request Oct 5, 2020

Characters docs #67361

Merged

github-actions bot locked as resolved and limited conversation to collaborators Jul 30, 2021
