Skip to content

feat: sort languages in selection menu#1071

Merged
osteotek merged 12 commits intocrosspoint-reader:masterfrom
ariel-lindemann:feature/sort_languages
Feb 25, 2026
Merged

feat: sort languages in selection menu#1071
osteotek merged 12 commits intocrosspoint-reader:masterfrom
ariel-lindemann:feature/sort_languages

Conversation

@ariel-lindemann
Copy link
Contributor

@ariel-lindemann ariel-lindemann commented Feb 22, 2026

Summary

  • What is the goal of this PR? (e.g., Implements the new feature for file uploading.)

Currently we are displaying the languages in the order they were added (as in the Language enum). However, as new languages are coming in, this will quickly be confusing to the users.

But we can't just change the ordering of the enum if we want to respect bakwards compatibility.

So my proposal is to add a mapping of the alphabetical order of the languages. I've made it so that it's generated by the gen_i18n.py script, which will be used when a new language is added.

  • What changes are included?

Added the array from the python script and changed LanguageSelectActivity to use the indices from there. Also commited the generated I18nKeys.h

Additional Context

  • Add any other information that might be helpful for the reviewer (e.g., performance implications, potential risks,
    specific areas to focus on).

I was wondering if there is a better way to sort it. Currently, it's by unicode value and Czech and Russian are last, which I don't know it it's the most intuitive.

The current order is:
Català, Deutsch, English, Español, Français, Português (Brasil), Română, Svenska, Čeština, Русский


AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? < PARTIALLY >

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Rename language enum values to ISO codes, add a generated constexpr array SORTED_LANGUAGE_INDICES (order {0,9,4,3,1,2,5,8,6,7}), update gen_i18n.py to emit the array and language-code metadata, and adapt LanguageSelectActivity to display and map selections via the sorted indices. (39 words)

Changes

Cohort / File(s) Summary
I18n keys & translations
lib/I18n/I18nKeys.h, lib/I18n/I18n.cpp, lib/I18n/I18n.h, lib/I18n/translations/.../*.yaml
Rename Language enum values from verbose names to ISO codes (EN, ES, FR, DE, CS, PT, RU, SV, RO, CA). Add constexpr uint8_t SORTED_LANGUAGE_INDICES[] = {0,9,4,3,1,2,5,8,6,7} in I18nKeys.h. Update fallback/default language references to use ISO codes. Translation YAML _language_code values updated to ISO codes.
I18n generator script
scripts/gen_i18n.py
Emit language metadata using ISO codes, detect English by _language_code == "EN", compute and write SORTED_LANGUAGE_INDICES and a human-readable order comment into generated header.
Language selection UI
src/activities/settings/LanguageSelectActivity.h, src/activities/settings/LanguageSelectActivity.cpp
Change per-instance totalItems to constexpr static uint8_t totalItems = getLanguageCount(); iterate and map UI entries through SORTED_LANGUAGE_INDICES; compute selectedIndex via std::find/std::distance; map selection back to Language using the sorted indices.
Documentation
docs/i18n.md
Update examples and metadata in docs to use ISO language codes (EN, ES, FR, etc.) and reflect API usage with new enum names.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant User as User
participant UI as LanguageSelectActivity
participant I18N as I18n
participant Settings as SettingsManager
User->>UI: open language selector
UI->>I18N: read SORTED_LANGUAGE_INDICES, getLanguageCount()
I18N-->>UI: sorted indices & language names
UI->>User: render list (sorted order)
User->>UI: select item (index i)
UI->>I18N: map SORTED_LANGUAGE_INDICES[i] -> Language enum
UI->>Settings: setLanguage(mapped Language)
Settings-->>I18N: persist language change
I18N-->>UI: confirm update

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • osteotek
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: sort languages in selection menu' accurately summarizes the main change: reordering how languages are displayed in the selection menu using a generated mapping.
Description check ✅ Passed The description is well-related to the changeset, explaining the goal (displaying languages in alphabetical order while preserving backward compatibility), what changes were made (adding a generated mapping and updating LanguageSelectActivity), and additional context.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
scripts/gen_i18n.py (1)

441-452: Consider a locale/diacritic‑insensitive sort key for display order.

Right now the sort is raw Unicode code‑point order, which will keep accented and non‑Latin names clustered at the end. If the goal is a more intuitive alphabetical list, normalize/casefold (or provide a collation key) before sorting.

♻️ Optional normalization-based sort key
+import unicodedata
+def _sort_key(name: str) -> str:
+    normalized = unicodedata.normalize("NFKD", name)
+    stripped = "".join(ch for ch in normalized if not unicodedata.combining(ch))
+    return stripped.casefold()
-
-sorted_indices = sorted(range(len(language_names)), key=lambda i: language_names[i])
+sorted_indices = sorted(range(len(language_names)), key=lambda i: _sort_key(language_names[i]))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/gen_i18n.py` around lines 441 - 452, The current sort for display
order uses raw Unicode code points which groups accented/ non‑Latin names
incorrectly; update sorted_indices in gen_i18n.py to sort by a
locale/diacritic‑insensitive key instead of plain language_names[i] — e.g.
import unicodedata and compute a key like unicodedata.normalize("NFKD",
language_names[i]).casefold() (or use locale.strxfrm after setting locale) in
the lambda used to build sorted_indices so the indices remain stable but names
are compared using the normalized/casefolded/collation key; keep the rest of the
logic (comment_names, SORTED_LANGUAGE_INDICES) unchanged so indices map
correctly to the original language_names.
lib/I18n/I18nKeys.h (1)

392-396: Add a compile‑time size guard for SORTED_LANGUAGE_INDICES.

If the generated array ever drifts from Language::_COUNT, UI loops can index out of bounds. Since this file is generated, please add the guard in gen_i18n.py.

🛡️ Suggested compile‑time guard
 constexpr uint8_t SORTED_LANGUAGE_INDICES[] = {9, 3, 0, 1, 2, 5, 8, 7, 4, 6};
+constexpr uint8_t SORTED_LANGUAGE_INDICES_COUNT =
+    sizeof(SORTED_LANGUAGE_INDICES) / sizeof(SORTED_LANGUAGE_INDICES[0]);
+static_assert(SORTED_LANGUAGE_INDICES_COUNT == getLanguageCount(),
+              "SORTED_LANGUAGE_INDICES size mismatch");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/I18n/I18nKeys.h` around lines 392 - 396, The generated
SORTED_LANGUAGE_INDICES array may drift from Language::_COUNT causing OOB UI
indexing; update gen_i18n.py to emit a compile-time size guard after the array
declaration (e.g., a static_assert comparing
sizeof(SORTED_LANGUAGE_INDICES)/sizeof(SORTED_LANGUAGE_INDICES[0]) to
static_cast<size_t>(Language::_COUNT)). Ensure the guard references
SORTED_LANGUAGE_INDICES and Language::_COUNT so compilation fails if the
generated array length and enum count diverge.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e32d41a and b1f07c2.

📒 Files selected for processing (4)
  • lib/I18n/I18nKeys.h
  • scripts/gen_i18n.py
  • src/activities/settings/LanguageSelectActivity.cpp
  • src/activities/settings/LanguageSelectActivity.h
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-02-16T22:25:35.674Z
Learnt from: whyte-j
Repo: crosspoint-reader/crosspoint-reader PR: 733
File: lib/I18n/I18nKeys.h:271-272
Timestamp: 2026-02-16T22:25:35.674Z
Learning: In the crosspoint-reader i18n system (lib/I18n/), missing translation keys in non-English YAML files are automatically filled with English strings at build time by the gen_i18n.py script. All generated string arrays (STRINGS_EN, STRINGS_ES, etc.) have identical lengths, so runtime array indexing is safe without per-string fallback logic. The system prints "INFO: '{key}' missing in {lang_code}, using English fallback" during code generation when this occurs.

Applied to files:

  • lib/I18n/I18nKeys.h
  • scripts/gen_i18n.py
🧬 Code graph analysis (1)
src/activities/settings/LanguageSelectActivity.h (1)
lib/I18n/I18nKeys.h (1)
  • getLanguageCount (391-391)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: cppcheck
🔇 Additional comments (4)
src/activities/settings/LanguageSelectActivity.h (1)

31-32: Nice: totalItems now constexpr static.

This keeps the count tied to getLanguageCount() and avoids per‑instance storage.

src/activities/settings/LanguageSelectActivity.cpp (3)

15-19: LGTM: current-language lookup into the sorted list.


49-52: LGTM: selection maps through SORTED_LANGUAGE_INDICES.


68-84: LGTM: render loop uses the sorted mapping consistently.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/activities/settings/LanguageSelectActivity.cpp`:
- Around line 6-7: The file LanguageSelectActivity.cpp uses std::begin, std::end
and std::distance but only includes <algorithm>; add an explicit `#include`
<iterator> to the top of the file so the declarations for std::begin, std::end
and std::distance are available portably (update the include block near the
existing `#include` <algorithm> to also include <iterator>).

---

Nitpick comments:
In `@lib/I18n/I18nKeys.h`:
- Around line 392-396: The generated SORTED_LANGUAGE_INDICES array may drift
from Language::_COUNT causing OOB UI indexing; update gen_i18n.py to emit a
compile-time size guard after the array declaration (e.g., a static_assert
comparing sizeof(SORTED_LANGUAGE_INDICES)/sizeof(SORTED_LANGUAGE_INDICES[0]) to
static_cast<size_t>(Language::_COUNT)). Ensure the guard references
SORTED_LANGUAGE_INDICES and Language::_COUNT so compilation fails if the
generated array length and enum count diverge.

In `@scripts/gen_i18n.py`:
- Around line 441-452: The current sort for display order uses raw Unicode code
points which groups accented/ non‑Latin names incorrectly; update sorted_indices
in gen_i18n.py to sort by a locale/diacritic‑insensitive key instead of plain
language_names[i] — e.g. import unicodedata and compute a key like
unicodedata.normalize("NFKD", language_names[i]).casefold() (or use
locale.strxfrm after setting locale) in the lambda used to build sorted_indices
so the indices remain stable but names are compared using the
normalized/casefolded/collation key; keep the rest of the logic (comment_names,
SORTED_LANGUAGE_INDICES) unchanged so indices map correctly to the original
language_names.

@Uri-Tauber
Copy link
Contributor

@ariel-lindemann I think English should always be the first.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
scripts/gen_i18n.py (1)

443-457: Clarify sorting behavior or implement locale-aware collation for intuitive alphabetical order.

The current implementation uses Python's default Unicode codepoint ordering, which places accented characters (Čeština) and Cyrillic (Русский) at the end rather than in alphabetically intuitive positions. The resulting order is: English, Català, Deutsch, Español, Français, Português (Brasil), Română, Svenska, Čeština, Русский—where Čeština appears after Svenska instead of between Català and Deutsch.

Either implement locale-aware collation (e.g., using locale.strxfrm() or icu) to match user expectations, or update the comment to clarify this is "Unicode codepoint sorted" rather than "alphabetically sorted."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/gen_i18n.py` around lines 443 - 457, The sorting currently uses
Unicode codepoint order (see english_idx, rest, sorted_indices and
language_names in gen_i18n.py) which misplaces accented/Cyrillic names; either
replace the key=lambda with a locale-aware transform (e.g., set an appropriate
locale and use key=lambda i: locale.strxfrm(language_names[i]) or use an
ICU-based collation) so names sort in expected alphabetical order, or if you
don't want locale logic, change the generated comment lines that mention
"Alphabetically sorted language indices" and the "// Order:" to explicitly state
"Unicode codepoint sorted" so behavior is clear.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 161c05e and 3bd614f.

📒 Files selected for processing (2)
  • lib/I18n/I18nKeys.h
  • scripts/gen_i18n.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-02-16T22:25:35.674Z
Learnt from: whyte-j
Repo: crosspoint-reader/crosspoint-reader PR: 733
File: lib/I18n/I18nKeys.h:271-272
Timestamp: 2026-02-16T22:25:35.674Z
Learning: In the crosspoint-reader i18n system (lib/I18n/), missing translation keys in non-English YAML files are automatically filled with English strings at build time by the gen_i18n.py script. All generated string arrays (STRINGS_EN, STRINGS_ES, etc.) have identical lengths, so runtime array indexing is safe without per-string fallback logic. The system prints "INFO: '{key}' missing in {lang_code}, using English fallback" during code generation when this occurs.

Applied to files:

  • lib/I18n/I18nKeys.h
  • scripts/gen_i18n.py
🪛 Ruff (0.15.1)
scripts/gen_i18n.py

[warning] 449-449: Consider [english_idx, *rest] instead of concatenation

Replace with [english_idx, *rest]

(RUF005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
lib/I18n/I18nKeys.h (1)

392-395: LGTM: sorted indices constant is clear and self-describing.

The generated order comment plus the constexpr array make the intended display order explicit and easy to consume.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/gen_i18n.py`:
- Around line 443-457: The sorting currently uses Unicode codepoint order (see
english_idx, rest, sorted_indices and language_names in gen_i18n.py) which
misplaces accented/Cyrillic names; either replace the key=lambda with a
locale-aware transform (e.g., set an appropriate locale and use key=lambda i:
locale.strxfrm(language_names[i]) or use an ICU-based collation) so names sort
in expected alphabetical order, or if you don't want locale logic, change the
generated comment lines that mention "Alphabetically sorted language indices"
and the "// Order:" to explicitly state "Unicode codepoint sorted" so behavior
is clear.

@ariel-lindemann
Copy link
Contributor Author

@Uri-Tauber makes sense. I've adjusted it accordingly. Would appreciate imput on sorting non ASCII and non latin characters. I see three pahts:

  • leave as is (sorted by unicode values)
  • sort latin modified characters with unmodified ones and non-latin separately (C with Č)
  • find a way to logically transliterate non-latin characters internally for sorting (so cyrillic Р gets sorted as latin R)

@Uri-Tauber
Copy link
Contributor

@ariel-lindemann I think this might be getting overcomplicated. At the moment, the order is determined by the _order key in each YAML file. The simplest approach would be to update the Python script to ignore that key and instead generate a sorted list based on your desired logic (PyICU could probably help here).

This way, you avoid adding extra code to the firmware and keep the sorting logic contained in the compilation.

@osteotek
Copy link
Member

osteotek commented Feb 22, 2026

@Uri-Tauber makes sense. I've adjusted it accordingly. Would appreciate imput on sorting non ASCII and non latin characters. I see three pahts:

  • leave as is (sorted by unicode values)
  • sort latin modified characters with unmodified ones and non-latin separately (C with Č)
  • find a way to logically transliterate non-latin characters internally for sorting (so cyrillic Р gets sorted as latin R)

just sort it manually? There are not that many languages to introduce complicated logic. Feels like just _order key needs to be updated in desired sort order

@ariel-lindemann
Copy link
Contributor Author

@ariel-lindemann I think this might be getting overcomplicated. At the moment, the order is determined by the _order key in each YAML file. The simplest approach would be to update the Python script to ignore that key and instead generate a sorted list based on your desired logic (PyICU could probably help here).

This way, you avoid adding extra code to the firmware and keep the sorting logic contained in the compilation.

@Uri-Tauber that's what I'm already doing. All the sorting is being done in python so at compile time the order is already set.
I've decided for the middle option which i think is balanced between intuitive and not as complicated to implement

Thank you for the input!

@ariel-lindemann
Copy link
Contributor Author

@Uri-Tauber makes sense. I've adjusted it accordingly. Would appreciate imput on sorting non ASCII and non latin characters. I see three pahts:

  • leave as is (sorted by unicode values)
  • sort latin modified characters with unmodified ones and non-latin separately (C with Č)
  • find a way to logically transliterate non-latin characters internally for sorting (so cyrillic Р gets sorted as latin R)

just sort it manually? There are not that many languages to introduce complicated logic. Feels like just _order key needs to be updated in desired sort order

@osteotek the idea is that the script would do the sorting. Otherwise, each time a new translation comes in, the translator would have to change the indexes manually which he might forget and is error prone anyway.

This way it is done inside the script which is used anyway when adding a new language. And recently there have been many new translations coming in (#1049, #987, #1065) and I anticipate there will be more (eg Italian, Chinese, Arabic, Hebrew). So I think it makes sense to have this automated.

So there are really no costs:

  • no additional burden for the translators
  • no calculations at runtime (everything is calculated before compilation)

@osteotek
Copy link
Member

Ok, I'm fine with that approach. Maybe remove _order key from language files then?

I'm personally fine with Cyrillic languages being at the end of the list

@osteotek
Copy link
Member

Another idea: instead of sorting on language strings directly, sort on their language keys (uk, ru, en, it, etc). This would solve problem with diacritics and Cyrillic languages

@ariel-lindemann
Copy link
Contributor Author

Ok, I'm fine with that approach. Maybe remove _order key from language files then?

I'm personally fine with Cyrillic languages being at the end of the list

@osteotek we can't remove the _order key without affecting the composition of the enum and breaking backwards compatibility. Because if we sorted directly on the enum, once a new language is added and the user upgrades, he may find the UI in a language he doesnt understand. That's why the script is creating the SORTED_LANGUAGE_INDICES array instead of moditfying the enum.

@mirus-ua
Copy link
Contributor

Another idea: instead of sorting on language strings directly, sort on their language keys (uk, ru, en, it, etc). This would solve problem with diacritics and Cyrillic languages

I came here to write a similar idea.
Basically, the most important part is determinated order, so if we use a new value that represants language codes (uk, en, etc) or even existing _language_code, it'll be OK because the order always follows the same rules

@ariel-lindemann
Copy link
Contributor Author

Another idea: instead of sorting on language strings directly, sort on their language keys (uk, ru, en, it, etc). This would solve problem with diacritics and Cyrillic languages

I've tried this locally, with the _language_code key and the resulting order is this:

´English, Català, Čeština, Français, Deutsch, Português (Brasil), Română, Русский, Español, Svenska´

So basically it's sorted by the english translation, which seems a bit arbitraty. But maybe we should change the key to the ISO language code so that would make things clearer

@ariel-lindemann
Copy link
Contributor Author

I propose we use ISO language designations in _language_code. I think that is most straight forward

@osteotek
Copy link
Member

Another idea: instead of sorting on language strings directly, sort on their language keys (uk, ru, en, it, etc). This would solve problem with diacritics and Cyrillic languages

I've tried this locally, with the _language_code key and the resulting order is this:

´English, Català, Čeština, Français, Deutsch, Português (Brasil), Română, Русский, Español, Svenska´

So basically it's sorted by the english translation, which seems a bit arbitraty. But maybe we should change the key to the ISO language code so that would make things clearer

All sorting methods would seem arbitrary in one way or another. This one looks ok to me

@mirus-ua
Copy link
Contributor

Another idea: instead of sorting on language strings directly, sort on their language keys (uk, ru, en, it, etc). This would solve problem with diacritics and Cyrillic languages

I've tried this locally, with the _language_code key and the resulting order is this:

´English, Català, Čeština, Français, Deutsch, Português (Brasil), Română, Русский, Español, Svenska´

So basically it's sorted by the english translation, which seems a bit arbitraty. But maybe we should change the key to the ISO language code so that would make things clearer

Realistically, I see two ways to do it, either use ISO codes or sort separatly by alphabet type, because the approach with translation is confusing, IMO

@ariel-lindemann
Copy link
Contributor Author

I've implemented it with ISO codes now and also wrote it in the docs. I think this is the most standardized we can get.

Thank you for your constructive input!

the order is now: English, Català, Čeština, Deutsch, Español, Français, Português (Brasil), Română, Русский, Svenska

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/gen_i18n.py (1)

224-224: ⚠️ Potential issue | 🟠 Major

Fix Czech ISO code mapping.

The LANG_ABBREVIATIONS dictionary maps "čeština" to "CZ", but the ISO 639-1 code for Czech is "CS". "CZ" is the country code (ISO 3166-1), not the language code. This mapping is actively used across the script to generate C++ symbol names (e.g., STRINGS_CZ instead of STRINGS_CS), and it conflicts with the language code in czech.yaml which correctly uses "CS". Correct this to align with ISO 639-1 standards and the PR's stated goal.

🔧 Proposed fix
-    "čeština": "CZ",
+    "čeština": "CS",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/gen_i18n.py` at line 224, Update the LANG_ABBREVIATIONS mapping so
the Czech entry maps "čeština" to the ISO 639-1 language code "CS" instead of
"CZ"; locate the LANG_ABBREVIATIONS dictionary in scripts/gen_i18n.py (the entry
currently `"čeština": "CZ"`) and change the value to `"CS"` to ensure generated
C++ symbols (e.g., STRINGS_CS) and alignment with czech.yaml use the correct ISO
language code.
🧹 Nitpick comments (2)
lib/I18n/I18nKeys.h (1)

393-395: Add a compile-time size guard for SORTED_LANGUAGE_INDICES.
Helps catch mismatches if a language is added/removed but the generated list falls out of sync.

♻️ Proposed guard (should be emitted by gen_i18n.py)
 constexpr uint8_t SORTED_LANGUAGE_INDICES[] = {0, 9, 4, 3, 1, 2, 5, 8, 6, 7};
+static_assert(sizeof(SORTED_LANGUAGE_INDICES) / sizeof(SORTED_LANGUAGE_INDICES[0]) ==
+              getLanguageCount(),
+              "SORTED_LANGUAGE_INDICES must match Language::_COUNT");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/I18n/I18nKeys.h` around lines 393 - 395, Add a compile-time size guard
for the generated SORTED_LANGUAGE_INDICES array to detect when the number of
entries drifts from the expected language count; define or reference the
canonical language count constant (e.g., NUM_LANGUAGES or LANGUAGE_COUNT) used
by the i18n system and add a static_assert that
sizeof(SORTED_LANGUAGE_INDICES)/sizeof(SORTED_LANGUAGE_INDICES[0]) ==
NUM_LANGUAGES with a clear error message so gen_i18n.py emits this assertion
alongside the constexpr array (referencing SORTED_LANGUAGE_INDICES and the
chosen NUM_LANGUAGES symbol).
scripts/gen_i18n.py (1)

28-28: Unused import.

The unicodedata module is imported but doesn't appear to be used anywhere in the file.

🧹 Proposed fix
-import unicodedata
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/gen_i18n.py` at line 28, The import unicodedata in
scripts/gen_i18n.py is unused; remove the line "import unicodedata" from the top
of the file (leaving any other imports intact) so there are no unnecessary
imports in the module.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5263b4f and ad74743.

📒 Files selected for processing (15)
  • docs/i18n.md
  • lib/I18n/I18n.cpp
  • lib/I18n/I18n.h
  • lib/I18n/I18nKeys.h
  • lib/I18n/translations/catalan.yaml
  • lib/I18n/translations/czech.yaml
  • lib/I18n/translations/english.yaml
  • lib/I18n/translations/french.yaml
  • lib/I18n/translations/german.yaml
  • lib/I18n/translations/portuguese.yaml
  • lib/I18n/translations/romanian.yaml
  • lib/I18n/translations/russian.yaml
  • lib/I18n/translations/spanish.yaml
  • lib/I18n/translations/swedish.yaml
  • scripts/gen_i18n.py
✅ Files skipped from review due to trivial changes (1)
  • lib/I18n/translations/catalan.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-02-16T22:25:35.674Z
Learnt from: whyte-j
Repo: crosspoint-reader/crosspoint-reader PR: 733
File: lib/I18n/I18nKeys.h:271-272
Timestamp: 2026-02-16T22:25:35.674Z
Learning: In the crosspoint-reader i18n system (lib/I18n/), missing translation keys in non-English YAML files are automatically filled with English strings at build time by the gen_i18n.py script. All generated string arrays (STRINGS_EN, STRINGS_ES, etc.) have identical lengths, so runtime array indexing is safe without per-string fallback logic. The system prints "INFO: '{key}' missing in {lang_code}, using English fallback" during code generation when this occurs.

Applied to files:

  • lib/I18n/translations/portuguese.yaml
  • lib/I18n/translations/spanish.yaml
  • lib/I18n/translations/french.yaml
  • lib/I18n/translations/german.yaml
  • lib/I18n/translations/russian.yaml
  • lib/I18n/I18nKeys.h
  • lib/I18n/translations/english.yaml
  • scripts/gen_i18n.py
  • docs/i18n.md
🧬 Code graph analysis (2)
lib/I18n/I18nKeys.h (1)
lib/I18n/I18nStrings.h (1)
  • i18n_strings (8-21)
scripts/gen_i18n.py (1)
lib/I18n/I18n.cpp (2)
  • get (20-29)
  • get (20-20)
🪛 Ruff (0.15.1)
scripts/gen_i18n.py

[warning] 140-140: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 451-451: Consider [english_idx, *rest] instead of concatenation

Replace with [english_idx, *rest]

(RUF005)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cppcheck
  • GitHub Check: build
🔇 Additional comments (16)
lib/I18n/translations/czech.yaml (1)

2-2: LGTM: ISO code aligns with the new enum.

lib/I18n/translations/portuguese.yaml (1)

2-2: LGTM: ISO code update is consistent.

lib/I18n/translations/swedish.yaml (1)

2-2: LGTM: ISO code aligns with the new enum.

lib/I18n/I18nKeys.h (2)

22-31: LGTM: ISO-based enum names with stable underlying values.


365-384: LGTM: switch mapping updated consistently.

lib/I18n/translations/russian.yaml (1)

2-2: LGTM: ISO code update is consistent.

lib/I18n/translations/german.yaml (1)

2-2: LGTM: ISO code aligns with the new enum.

lib/I18n/translations/romanian.yaml (1)

2-2: LGTM: ISO code update is consistent.

lib/I18n/I18n.cpp (1)

90-93: LGTM: fallback aligns with the renamed enum.

lib/I18n/translations/english.yaml (1)

2-2: LGTM!

The language code update from verbose name to ISO 639-1 code "EN" is correct and aligns with the PR's objective of using standardized ISO codes for deterministic sorting.

lib/I18n/I18n.h (1)

35-35: LGTM!

The default language initialization updated to use the new ISO-based enum value Language::EN, consistent with the enum rename across the codebase.

lib/I18n/translations/french.yaml (1)

2-2: LGTM!

The language code update to ISO 639-1 code "FR" is correct for French.

lib/I18n/translations/spanish.yaml (1)

2-2: LGTM!

The language code update to ISO 639-1 code "ES" is correct for Spanish.

docs/i18n.md (1)

54-55: LGTM!

Documentation updates are comprehensive and consistent:

  • Examples updated to use ISO codes (ES, IT, FR)
  • Helpful reference link to ISO 639 language codes added
  • API usage examples correctly reflect the new enum values

Also applies to: 64-65, 130-131, 177-178, 191-192, 203-204

scripts/gen_i18n.py (2)

446-451: LGTM with minor style suggestion.

The sorted language indices logic correctly places English first and sorts remaining languages alphabetically by ISO code. The languages.index("EN") call is safe because load_translations already validates that an English file with _language_code: "EN" exists (lines 139-140).

Static analysis suggests using [english_idx, *rest] instead of concatenation for slightly more idiomatic Python, but this is a minor stylistic preference.

💅 Optional style improvement
-    sorted_indices = [english_idx] + rest
+    sorted_indices = [english_idx, *rest]

453-459: LGTM!

The generated SORTED_LANGUAGE_INDICES array with accompanying comment documenting the order provides clear visibility into the sort order for maintainers.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@scripts/gen_i18n.py`:
- Line 224: Update the LANG_ABBREVIATIONS mapping so the Czech entry maps
"čeština" to the ISO 639-1 language code "CS" instead of "CZ"; locate the
LANG_ABBREVIATIONS dictionary in scripts/gen_i18n.py (the entry currently
`"čeština": "CZ"`) and change the value to `"CS"` to ensure generated C++
symbols (e.g., STRINGS_CS) and alignment with czech.yaml use the correct ISO
language code.

---

Nitpick comments:
In `@lib/I18n/I18nKeys.h`:
- Around line 393-395: Add a compile-time size guard for the generated
SORTED_LANGUAGE_INDICES array to detect when the number of entries drifts from
the expected language count; define or reference the canonical language count
constant (e.g., NUM_LANGUAGES or LANGUAGE_COUNT) used by the i18n system and add
a static_assert that
sizeof(SORTED_LANGUAGE_INDICES)/sizeof(SORTED_LANGUAGE_INDICES[0]) ==
NUM_LANGUAGES with a clear error message so gen_i18n.py emits this assertion
alongside the constexpr array (referencing SORTED_LANGUAGE_INDICES and the
chosen NUM_LANGUAGES symbol).

In `@scripts/gen_i18n.py`:
- Line 28: The import unicodedata in scripts/gen_i18n.py is unused; remove the
line "import unicodedata" from the top of the file (leaving any other imports
intact) so there are no unnecessary imports in the module.

osteotek
osteotek previously approved these changes Feb 23, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements alphabetical sorting of languages in the language selection menu while maintaining backward compatibility with saved settings. The implementation moves from using full language names (e.g., "ENGLISH") to ISO 639-1 two-letter codes (e.g., "EN") for better internationalization standards compliance.

Changes:

  • Added auto-generated SORTED_LANGUAGE_INDICES array to display languages alphabetically by ISO code (English first, then others sorted)
  • Changed all language codes in YAML files and generated C++ code from full names to ISO 639-1 codes
  • Updated LanguageSelectActivity to use the sorted indices for menu display while preserving the original enum order for backward compatibility

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
scripts/gen_i18n.py Added logic to generate SORTED_LANGUAGE_INDICES array with English first, then alphabetically by ISO code; updated language code references from full names to ISO codes
lib/I18n/I18nKeys.h Generated file with renamed enum values (ENGLISH→EN, etc.), new SORTED_LANGUAGE_INDICES array, and static_assert for size validation
lib/I18n/I18nStrings.h Updated string array declarations to use ISO code abbreviations (STRINGS_CZ→STRINGS_CS)
lib/I18n/I18n.h Updated default language initialization to use new enum name (Language::EN)
lib/I18n/I18n.cpp Updated fallback language references to use new enum name (Language::EN)
src/activities/settings/LanguageSelectActivity.h Changed totalItems from instance variable to constexpr static compile-time constant
src/activities/settings/LanguageSelectActivity.cpp Added algorithm/iterator includes; implemented sorting logic using SORTED_LANGUAGE_INDICES for display and selection
lib/I18n/translations/*.yaml Changed _language_code from full names to ISO 639-1 codes (ENGLISH→EN, SPANISH→ES, CZECH→CS, etc.)
docs/i18n.md Updated documentation to reflect ISO code convention and provided examples with new enum names

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

osteotek
osteotek previously approved these changes Feb 23, 2026
osteotek
osteotek previously approved these changes Feb 24, 2026
@osteotek osteotek requested a review from a team February 25, 2026 15:31
@osteotek osteotek merged commit 7e214ea into crosspoint-reader:master Feb 25, 2026
6 checks passed
lukestein pushed a commit to lukestein/crosspoint-reader that referenced this pull request Feb 26, 2026
## Summary

* **What is the goal of this PR?** (e.g., Implements the new feature for
file uploading.)

Currently we are displaying the languages in the order they were added
(as in the `Language` enum). However, as new languages are coming in,
this will quickly be confusing to the users.

But we can't just change the ordering of the enum if we want to respect
bakwards compatibility.

So my proposal is to add a mapping of the alphabetical order of the
languages. I've made it so that it's generated by the `gen_i18n.py`
script, which will be used when a new language is added.


* **What changes are included?**

Added the array from the python script and changed
`LanguageSelectActivity` to use the indices from there. Also commited
the generated `I18nKeys.h`

## Additional Context

* Add any other information that might be helpful for the reviewer
(e.g., performance implications, potential risks,
  specific areas to focus on).

I was wondering if there is a better way to sort it. Currently, it's by
unicode value and Czech and Russian are last, which I don't know it it's
the most intuitive.

The current order is:
`Català, Deutsch, English, Español, Français, Português (Brasil),
Română, Svenska, Čeština, Русский`

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? _**< PARTIALLY >**_
el pushed a commit to el/crosspoint-reader that referenced this pull request Feb 26, 2026
## Summary

* **What is the goal of this PR?** (e.g., Implements the new feature for
file uploading.)

Currently we are displaying the languages in the order they were added
(as in the `Language` enum). However, as new languages are coming in,
this will quickly be confusing to the users.

But we can't just change the ordering of the enum if we want to respect
bakwards compatibility.

So my proposal is to add a mapping of the alphabetical order of the
languages. I've made it so that it's generated by the `gen_i18n.py`
script, which will be used when a new language is added.


* **What changes are included?**

Added the array from the python script and changed
`LanguageSelectActivity` to use the indices from there. Also commited
the generated `I18nKeys.h`

## Additional Context

* Add any other information that might be helpful for the reviewer
(e.g., performance implications, potential risks,
  specific areas to focus on).

I was wondering if there is a better way to sort it. Currently, it's by
unicode value and Czech and Russian are last, which I don't know it it's
the most intuitive.

The current order is:
`Català, Deutsch, English, Español, Français, Português (Brasil),
Română, Svenska, Čeština, Русский`

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? _**< PARTIALLY >**_
iandchasse pushed a commit to iandchasse/crosspoint-reader-minRead that referenced this pull request Feb 27, 2026
## Summary

* **What is the goal of this PR?** (e.g., Implements the new feature for
file uploading.)

Currently we are displaying the languages in the order they were added
(as in the `Language` enum). However, as new languages are coming in,
this will quickly be confusing to the users.

But we can't just change the ordering of the enum if we want to respect
bakwards compatibility.

So my proposal is to add a mapping of the alphabetical order of the
languages. I've made it so that it's generated by the `gen_i18n.py`
script, which will be used when a new language is added.


* **What changes are included?**

Added the array from the python script and changed
`LanguageSelectActivity` to use the indices from there. Also commited
the generated `I18nKeys.h`

## Additional Context

* Add any other information that might be helpful for the reviewer
(e.g., performance implications, potential risks,
  specific areas to focus on).

I was wondering if there is a better way to sort it. Currently, it's by
unicode value and Czech and Russian are last, which I don't know it it's
the most intuitive.

The current order is:
`Català, Deutsch, English, Español, Français, Português (Brasil),
Română, Svenska, Čeština, Русский`

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? _**< PARTIALLY >**_
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants