Implement simple lexicograpic sorting#271
Implement simple lexicograpic sorting#271gunchleoc wants to merge 9 commits intosingularity:masterfrom
Conversation
This adds a new dependency on unidecode.
nthykier
left a comment
There was a problem hiding this comment.
Thanks for looking at improving the locale support in singularity; you are really going above and beyond on this. :)
I have added to remarks for changes that I would like to see resolved before we merge this. But otherwise, solid work. :)
README.txt
Outdated
| * Python 3 (https://python.org/download/) | ||
| * pygame (https://www.pygame.org/download.shtml) | ||
| * NumPy (https://www.scipy.org/install.html) | ||
| * unidecode (https://pypi.org/project/Unidecode/) |
There was a problem hiding this comment.
I would prefer that we use the built-in locale.strxfrm (or its counterpart locale.strcoll) by default - possibly ICU as an optional alternative, which is allegedly the only truly reliable way to sort strings locale-aware (notably, there is trouble with locale.strxfrm on Mac OS).
Example of how to use ICU is at: https://stackoverflow.com/questions/11121636/sorting-list-of-string-with-specific-locale-in-python
There was a problem hiding this comment.
Thanks for those libraries - I found this difficult to search for, since must search results for "lexicographic sorting" will just give you sorting by unicode code point, which is fine for English.
There was a problem hiding this comment.
I have switched to ICU. Note that none of the libraries (I tested all of them) can handle correct sorting order for German umlauts, so I have added some special code for that.
There was a problem hiding this comment.
Thanks for looking at it. It looks like ICU is now mandatory rather than optional as I had hoped for. That is the only bit that is missing for me. :)
There was a problem hiding this comment.
I have added the packaging info now - I already had ICU installed, so I didn't notice.
This branch can't be tested right now because I merged master, which is broken.
|
gunchleoc:
@gunchleoc commented on this pull request.
> @@ -23,8 +23,9 @@ You will need to install the following software to play Endgame: Singularity:
* Python 3 (https://python.org/download/)
* pygame (https://www.pygame.org/download.shtml)
* NumPy (https://www.scipy.org/install.html)
+ * unidecode (https://pypi.org/project/Unidecode/)
I have added the packaging info now - I already had ICU installed, so I didn't notice.
This branch can't be tested right now because I merged master, which is broken.
The master branch is fixed now; apologies for the noise.
|
|
No problem, things happen. This branch should be good to go now if you're OK with using ICU. I don't have a Mac though to test if the installation of ICU for Python works. |
|
Thanks. AFAICT, there are two missing bits:
|
|
I'm not fond of conditional stuff, since it will give us 2 different code paths and could lead to difficulty with bug fixing. I could try switching to |
|
Ok. :) Ping me when you have had time to migrate to |
@nthykier Done :) Best squash the commits when merging. |
This adds a new dependency on unidecode.
Real lexicographic sorting is locale-dependent, but I couldn't find a library that can do that. There is https://github.com/miurahr/unihandecode but that's for Chinese script only and will crash on other locales.
Screenshot of the problem - "Ìsleachadh" on the knowledge screen should be sorted next to "Innleachd", not appear on the bottom:
And a screenshot with this branch:
