-
-
Notifications
You must be signed in to change notification settings - Fork 186
Improved I18n::translateCount() method #2432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved I18n::translateCount() method #2432
Conversation
|
This is not how I would suggest addressing #2430. In the case of a Irish, for example, you would need to define literally 11 different forms (one for 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11) to cover all cases, but only five should be necessary. In the case of Polish, there are 4 rules:
In this case, the current mechanics would make it infeasible to cover every number, such as `544B, which should use the second rule. For maximum compatibility, I would suggest that translation strings should take an associative array that can have the following keys:
These are taken from Unicode’s Language Plural Rules. Polish example: months:
one: {{ count }} miesiąc
few: {{ count }} miesiące
many: {{ count }} miesięcy
other: {{ count }} miesiąca # this deals with decimals, and is thus kind of irrelevant today because Kirby’s only accepts integersMy background is from an iOS programming, and the localization toolkit on Apple platforms already has the pluralization rules built into the locales so that feeding it knows which key to select. This is what ngettext() is built to do. I suspect the rules are already available on most PHP system, but if they’re not, there will need to be some kind of function (per Kirby language) that would accept the number to be formatted and that function would return a key such as function polishPlural($n) {
switch ($n) {
case is_float($n):
return 'other';
case 1:
return 'one';
case (in_array($n % 10, [2, 3, 4]) && !in_array($n, [12, 13, 14])):
return 'few';
default:
return 'many';
}
} |
|
Thank you for all details. I understand, for now I wrote the simpler and more comprehensive first version already as a draft. If really needed, more advanced method will be written according to your detailed explanations. |
|
@lukasbestle What do you think about that? |
lukasbestle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is already a huge improvement over the old implementation and still fully backwards-compatible. I like that it is still pretty simple.
ngettext() would require us to rewrite basically the whole translation management and a custom mapping per language would be quite complex and would require the users to actually write those mapping rules even for languages that don't have as complex counting rules.
We could still implement optional custom mapping rules on top like @neildaniels suggested (so that the rules would be used if they are defined and otherwise the key would be determined by the number itself). Basically the pseudo-code would look like this:
if (translation has 'countMapping' key and 'countMapping' is PHP Closure) {
$count = $countMapping($count);
}Still pretty simple.
|
If using the Mozilla documentation as a guide (which is based off the GNU The GNU documentation and Unicode table provide pseudocode (or, actual C code) for these rules (they don't seem to perfectly match up with the Mozilla docs though). Would it be reasonable to have Kirby include support for all 20 rules? |
|
The idea is to support any number of rules the dev needs. You can return any string from the What we can‘t do is to ship a list of all possible languages and their plural rules with Kirby – that would be just too much data for this little feature. That‘s why I think that the custom callback approach is a good and very flexible compromise. |
|
Yes, but what I’m saying is there’s only 20 possible plural rules across all the languages. In other words: there will only ever be 20 possible correct implementations of the callback method. And each of those implementations have fairly trivial rules. What I’m not suggesting is mapping each language to one of those 20 rules. |
|
@neildaniels Yes that would be great. When i think again, this seems to me to be a very comprehensive development for a simple feature. I looked at what the huge framework like 'minutes_ago' => '{1} :value minute ago|[2,*] :value minutes ago',
echo trans_choice('time.minutes_ago', 5, ['value' => 5]);It is already quite comprehensive now. |
|
The issue with implementing the 20 rules is that we can‘t assume that the users know what those rules are for, how they work and which one to choose for their languages. A callback that allows a custom mapping is a bit more work for experienced devs to implement, but easier to understand in general. |
|
@lukasbestle I preferred to use direct callback instead of |
@lukasbestle: But this is something the user has to learn nonetheless, even if there are only 3 rules (which is a very western assumption, by the way). Having a callback with custom mapping is a nightmare for anyone who is not a programmer. The full Laravel syntax in the linked docs is very nice by the way, @afbora. |
|
@nilshoerrmann Yes, I agree that the callback feature is not easy for non-developers. 'apply' => [
'No Apple',
'{{ count }} Apple',
'{{ count }} Apples',
'Many Apples'
] |
|
@afbora I know, but I was referring to the extended plural options discussed in this thread. The above example can only deal with the variants required by most Western European languages. I just wanted to support @neildaniels's request to support all those 20 rules. |
|
The issue I see with the 20 rules is that they don‘t have names and it is therefore not obvious if the language you are working with needs rule 4 or 18 or whatever. That‘s an unnecessary barrier unlike a flexible implementation that is potentially more effort, but can be configured exactly like the user needs it. |
lukasbestle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the simplicity of this. However this means that every single translation key would need to be implemented as a Closure. This is an issue if the translations are imported from a translation tool. The code would have to be written by hand for each translation.
I think we could still keep the direct callback as another option for those who want to use it. But we should still have support for a global mapping callback.
Basically it would work like this:
- If the translation itself is a Closure, use its return value directly and ignore the mapping callback (like in your current implementation).
- Otherwise check if a mapping callback is defined and use its return value as the
$count. - Otherwise use the
$countdirectly.
|
Another idea: We could have a There could then be a third-party plugin that implements the 20 language rules. What do you think @nilshoerrmann @neildaniels? |
|
@lukasbestle I tried to implement your global mapping idea 👍 |
f7e21db to
e1f9089
Compare
|
Another suggestion: There should be automatic handling that maps While a custom Take French for example, there are two cases:
For example, given this translation setup for French: 'car' => [
'zero' => 'pas de voitures',
'one' => '{{ count }} voiture',
'other' => '{{ count }} voitures'
],
'day' => [
'one' => '{{ count }} jour',
'other' => '{{ count }} jours'
],
'countMapping' => function ($count) {
switch ($count) {
case 0:
return 'zero';
case 1:
return 'one';
default:
return 'other';
}
}It'll end up yielding the wrong translation for the "0 days" case: echo tc('car', 0); // prints "pas de voitures"
echo tc('day', 0); // prints "0 jours", but should be "0 jour"I would suggest confining the This is how Apple's platforms handle translations, and I that's worked very well in my experience. Thus, I would suggest this: 'car' => [
'zero' => 'pas de voitures',
'one' => '{{ count }} voiture',
'other' => '{{ count }} voitures'
],
'day' => [
'one' => '{{ count }} jour',
'other' => '{{ count }} jours'
],
'countMapping' => function ($count) {
switch ($count) {
case 0:
case 1:
return 'one';
default:
return 'other';
}
}which would print correctly (assuming echo tc('car', 0); // prints "pas de voitures"
echo tc('day', 0); // prints "0 jour" |
|
@neildaniels To be honest, I'm still not convinced that we need to constrain the user and reduce the flexibility by implementing a fixed set of options and even special cases. If you follow our way of doing things, you will know that we strive to make Kirby features as simple as possible. Kirby is not a huge framework made up of specialized components with dozens of options for the tiniest edge-cases – instead it's a tool made up of simple building blocks you can extend yourself if needed. The reason for this goal is that we want Kirby to be simple for beginners and powerful for advanced users at the same time. What you are describing in your last two comments and also during our discussion is behavior that actually needs documentation. Our approach is different: We try to build features in a way that can be understood as quickly as possible with as minimal external information as needed. If you had to actually study how each feature works in detail before you can get started at all, it wouldn't be Kirby anymore. Your suggestions have presuppositions built in (the naming, the |
|
I'm merely giving feedback as someone who has had experience working on internationalization projects on other platforms, including working with professional translators. The suggestions I'm giving are genuinely rooted in how I've seen many people approach localization/translations. Lots of software and translators have adopted Unicode's keywords of I appreciate Kirby's general attitude of "constrain as little as possible," but I don't necessarily agree that internationalization is an area where that is actually beneficial. There is not an infinite number of languages, nor are there an infinite number of language rules. I cannot imagine a scenario where someone would ever take advantage of arbitrary plural rules—other than to avoid reading documentation and hope they get it right. Practically speaking, this isn't an area of "creative freedom" that should be left up developers. The nuances of languages inherently make this kind of work tricky. Yes, there are edge cases. But because of that—in my mind—that's exactly why a framework should be able to accommodate and document those edge cases. If plurals were actually straightforward, Unicode wouldn't have an entire document talking about plurals. My original issue #2430 was rooted in the fact that Kirby's current system legitimately did not work for certain languages. I think that's clear, and thankfully that strictly has been fixed with this PR. Thank you @afbora for this—I appreciate it. I think we need to accept that there will be necessary documentation to use this feature. If my suggestion about
I don't think many developers actually want to think much about localization more than they have to (especially if they're unfamiliar with the languages). I also don't think many professional translators want to have to worry about the format they give their translations in. If translations were something that were entirely and always up to a developer to do, I think I could get behind letting Kirby do as little as possible. But I suspect the reality is: most developers won't be the ones do the translations and when they get an email about Language X has some "weird" rules about plurals, they're going to either:
I think having documentation (that gives a primer about why this method actually exists and points to Unicode's Plural Rules) isn't a concession or sacrifice. It's a strong hint that thought went into providing a useful implementation that doesn't require you to worry about the edge cases. I will agree that the My takeaway from some of the comments is that I should just go build my own utopian function that does everything I want. That utopian version would have like 3 more lines of code and have the language rules baked into it. But, if that's the outcome, so be it. |
|
Don‘t get me wrong: There should definitely be documentation for this feature (and there will be). 🙂 @bastianallgeier @distantnative What do you think about the different ways of implementing this? |
|
As an aside: I went through Unicode's list and made PHP implementations for nearly every language.
I think Kirby could integrate in these rules and free developers from ever having to write this boilerplate code. But, even if they did, you'd end up with basically the same logic. Also: consider if a plugin starts to offer translations with plurals variations. Even with this current PR, you have to hope that the plugin offering the translations will be compatible with a site's plural rules closure. In that sense, there necessarily will become a de facto way to handle plurals. |
|
I'm trying to take a step back here: But should we really reinvent the wheel for all of this? I would think that this could be a very useful case for the few times where we take a good library on board. So I would be much more interested in figuring out if there is a good library that fulfils our/all these requirements. |
|
If i18n will be refactored and this will not happen anytime soon, we can take the first commit of PR as step one 🤔 |
|
Is everyone aware of this plugin by Oblik Studio: |
|
I was not, but that looks perfect. |
|
@nilshoerrmann Oh, that's awesome! I didn't know there was already a plugin for this. A plugin makes a lot more sense in my opinion. So maybe we should simplify this PR by reducing it to the useful smaller improvements. Which would mean: Remove the count mapping stuff, but keep the rest (translation callbacks, support for the count placeholder for simple strings and flexible number of keys). What do you think @neildaniels @afbora? |
|
@lukasbestle #2432 (comment) As I said before, I think we should continue by simplifying it like your said 👍. We can take the first two commits. |
|
Some of the improvements in the last commit still apply. I will update and finalize the PR this weekend. :) |
|
Sounds good to me. |
|
Sorry for skipping the long discussion. What's the state of this? @lukasbestle? |
|
@bastianallgeier A general disagreement about how much of singular-plural rules for all languages should be integrated into the core, whether we should just offer a general callback method, or whether this should rather be handled by a good library (maybe as part of a general i18n refactoring). |
|
No, I think we have agreement now. I just haven't found the time so far to finalize this. But it's on my list for the next days. |
e1f9089 to
b8634b5
Compare
lukasbestle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR now only includes the minor improvements we have discussed. All more complex features are covered by the kirby-plurals plugin.
|
Maybe it would be interesting to ask @OblikStudio for a cookbook entry so the plugin gets a bit more coverage? |
Describe the PR
I18n::translateCount()I18n::translateCount()I18n::translateCount()now replaces the{{ count }}even if just a single translation string is definedRelated issues
I18n::translateCount()does not work with languages with more than 3 cardinal forms #2430I18n::translateCount()doesn't replace {{ count }} for single-string values #2431Ready?
composer fix