ICU-20187 remove obsolete locale ID canonicalization mappings by markusicu · Pull Request #418 · unicode-org/icu

markusicu · 2019-02-12T06:30:22Z

https://unicode-org.atlassian.net/browse/ICU-20187

approved:2018-11-14 remove obsolete locale ID canonicalization mappings

drop mappings for:

- EURO (which Java dropped long ago) & PREEURO (an ICU invention)
- ".Net names" (incomplete, and which .Net dropped long ago)
- "Linux names" (similar to .Net names)
- "old ICU names"
- "Solaris variants" PINYIN (conflicts with an IANA variant) & STROKE
- "POSIX names" C & POSIX (still handled by uprv_getPOSIXIDForCategory())

pedberg-icu · 2019-02-12T18:30:13Z

How does this affect handling of the en_US_POSIX locale, which is still supported by CLDR and imported into ICU, and which is still used (by Apple, for one)? Will it affect the ability to use that locale with that locale ID? Will that locale still be included in e.g. getLocales()? How will the changes affect canonicalization of that locale ID?

markusicu · 2019-02-12T18:35:10Z

> How does this affect handling of the en_US_POSIX locale

I am not changing anything about that locale, except:

> How will the changes affect canonicalization of that locale ID?

ICU uloc_canonicalize() will no longer map "C" or "POSIX" to "en_US_POSIX", but uprv_getPOSIXIDForCategory() still does so.
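
To make the effect of dropping a table entry concrete, here is a minimal C sketch of a table-driven ID lookup. It is a simplified stand-in, not ICU's code: the MapEntry type, the LEGACY_MAP contents, and the lookup() helper are all assumptions for illustration, and the real table carries extra keyword/value columns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table entry: locale ID -> canonical ID. */
typedef struct {
    const char *id;
    const char *canonical_id;
} MapEntry;

/* Assumed shape of the table before the change, with the POSIX names. */
static const MapEntry LEGACY_MAP[] = {
    { "c",     "en_US_POSIX" },
    { "posix", "en_US_POSIX" },
};

/* Returns the mapped ID if `id` is in the table, else `id` unchanged. */
static const char *lookup(const MapEntry *map, size_t count, const char *id) {
    assert(id != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(map[i].id, id) == 0) {
            return map[i].canonical_id;
        }
    }
    return id; /* no entry: the ID passes through untouched */
}
```

With the entries present, lookup(LEGACY_MAP, 2, "c") yields "en_US_POSIX"; with them removed (an empty table), the same call returns "c" unchanged, so any caller that relied on the mapping has to handle it elsewhere.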

icu4c/source/common/putil.cpp

icu4c/source/test/cintltst/cloctst.c

FrankYFTang

comments about all the tests. will look at the implementation next.

icu4c/source/test/cintltst/cloctst.c

icu4c/source/test/intltest/loctest.cpp

icu4j/main/tests/core/src/com/ibm/icu/dev/test/util/ULocaleTest.java

FrankYFTang

about the implementation.

icu4c/source/common/putil.cpp

icu4j/main/classes/core/src/com/ibm/icu/util/ULocale.java

yumaoka

LGTM with one minor comment.

icu4c/source/test/intltest/calregts.cpp

jira-pull-request-webhook · 2019-02-14T16:38:47Z

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

markusicu · 2019-02-14T16:39:25Z

sorry, did a rebase on reflex... will address feedback in a few minutes

markusicu · 2019-02-14T18:23:13Z

PTAL

yumaoka

LGTM

jira-pull-request-webhook · 2019-02-14T19:27:25Z

Hooray! The files in the branch are the same across the force-push. 😃

~ Your Friendly Jira-GitHub PR Checker Bot

FrankYFTang

LGTM

srl295 · 2019-04-25T17:28:40Z

icu4c/source/common/uloc.cpp

 */
 static const CanonicalizationMap CANONICALIZE_MAP[] = {
-    { "",               "en_US_POSIX", NULL, NULL }, /* .NET name */
-    { "c",              "en_US_POSIX", NULL, NULL }, /* POSIX name */


This is the cause of regression https://unicode-org.atlassian.net/browse/ICU-20575 - "c" should NOT have been removed here.

> ICU uloc_canonicalize() will no longer map "C" or "POSIX" to "en_US_POSIX", but uprv_getPOSIXIDForCategory() still does so.

Two different comparisons were conflated. The strcmp("C") check was there to detect whether the POSIX layer was returning real data or not. The mapping still needed to be done in uloc_canonicalize().

Regression was in 1afef30 PR unicode-org#418 [ICU-20187]
- We dropped the mapping from "C" in uloc_canonicalize, but then putil did not handle cases where a codepage was set (such as C.UTF-8).
- Add an additional check in uprv_getDefaultLocaleID() for locales that end up as "C" or "POSIX" after removing codepage suffix.
- Also fix regression where aa@bb would become aa__BB__BB (incorrectly doubled __BB)

Regression was in 1afef30 PR #418 [ICU-20187]
- We dropped the mapping from "C" in uloc_canonicalize, but then putil did not handle cases where a codepage was set (such as C.UTF-8).
- Add an additional check in uprv_getDefaultLocaleID() for locales that end up as "C" or "POSIX" after removing codepage suffix.
- Also fix regression where aa@bb would become aa__BB__BB (incorrectly doubled __BB)
(cherry picked from commit 075cefb)
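
As a rough illustration of the additional check described in the commit message above, the following C sketch strips a codepage suffix and then maps a bare "C" or "POSIX" to en_US_POSIX. The function name normalize_posix_id and its buffer handling are assumptions for illustration, not the code in putil.cpp. The sketch also drops everything after the first '.', so it does not model @modifier handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not ICU's implementation.
 * Copies the part of posix_id before any ".codepage" suffix into buf,
 * then maps a bare "C" or "POSIX" to "en_US_POSIX". */
static const char *normalize_posix_id(const char *posix_id,
                                      char *buf, size_t cap) {
    assert(posix_id != NULL && buf != NULL && cap > 0);

    size_t len = strcspn(posix_id, "."); /* length before the first '.' */
    if (len >= cap) {
        len = cap - 1;                   /* truncate rather than overflow */
    }
    memcpy(buf, posix_id, len);
    buf[len] = '\0';

    /* The check must run after the suffix is removed, so that "C.UTF-8"
     * is treated the same way as a plain "C". */
    if (strcmp(buf, "C") == 0 || strcmp(buf, "POSIX") == 0) {
        return "en_US_POSIX";
    }
    return buf;
}
```

Comparing the raw string "C.UTF-8" against "C" would fail, which is the gap the commit message describes; doing the comparison on the stripped ID closes it.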

markusicu assigned yumaoka Feb 12, 2019

markusicu requested review from FrankYFTang, gvictor, pedberg-icu and srl295 February 12, 2019 06:30

FrankYFTang reviewed Feb 12, 2019

FrankYFTang reviewed Feb 12, 2019

FrankYFTang reviewed Feb 12, 2019

FrankYFTang reviewed Feb 12, 2019

FrankYFTang requested changes Feb 12, 2019

yumaoka previously approved these changes Feb 13, 2019

srl295 previously approved these changes Feb 13, 2019

markusicu dismissed stale reviews from srl295 and yumaoka via ca72da3 February 14, 2019 16:38

markusicu force-pushed the long-obsolete-variants branch from 9b35ea6 to ca72da3 Compare February 14, 2019 16:38

yumaoka previously approved these changes Feb 14, 2019

ICU-20187 drop support for long-obsolete locale ID variants

cb71cbc

markusicu dismissed yumaoka’s stale review via cb71cbc February 14, 2019 19:27

markusicu force-pushed the long-obsolete-variants branch from b4d62d8 to cb71cbc Compare February 14, 2019 19:27

FrankYFTang approved these changes Feb 14, 2019

markusicu merged commit 1afef30 into unicode-org:master Feb 14, 2019

markusicu deleted the long-obsolete-variants branch February 14, 2019 20:27

srl295 reviewed Apr 25, 2019

srl295 mentioned this pull request Apr 25, 2019

ICU-20575 fix broken default locale mapping for C.UTF-8 #634 (Merged)

srl295 mentioned this pull request Apr 25, 2019

ICU-20575 locale mapping for C.UTF-8 - 64-maint backport #637 (Merged)
